StylePeople
A Generative Model of Fullbody Human Avatars
 
 
 
Artur Grigorev1,2*
Karim Iskakov1*
Anastasia Ianina1
Renat Bashirov1
 
 
Ilya Zakharkin1,2
Alexander Vakhitov1
Victor Lempitsky1,2
 
 
 
Samsung AI Center Moscow1
Skolkovo Institute of Science and Technology2
 
 
 
 
 
 
Figure 1: Style people, i.e., random samples from our generative model of human avatars. Each avatar is shown from two different viewpoints. The samples are diverse in terms of clothing and demographics, and include loose clothing and hair.
 
 
 
 
Code
 
 
This project consists of two parts. The first part, Neural Textures, optimizes a neural texture over a video of a single person via backpropagation. The second part, StylePeople, presents a generative model that can sample random neural textures as well as optimize a latent code in one-shot or few-shot mode. A minimal sketch of the first part is shown below.
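The sketch below illustrates per-person neural texture fitting via backpropagation, assuming a PyTorch setup. The 16-channel 512x512 texture, the small stand-in rendering network, and the `fit_step` interface are illustrative assumptions, not the released training code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Learnable per-person neural texture; the 16-channel, 512x512 shape is an
# assumption in the spirit of deferred neural rendering, not the exact config.
texture = torch.randn(1, 16, 512, 512, requires_grad=True)

# Stand-in translation network from sampled features to RGB; the actual
# renderer is a larger image-to-image network.
renderer = nn.Sequential(
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 3, 3, padding=1), nn.Sigmoid(),
)

optimizer = torch.optim.Adam([texture, *renderer.parameters()], lr=1e-3)

def fit_step(frame, uv):
    """One backpropagation step on a single video frame.

    frame: (1, 3, H, W) ground-truth image of the person.
    uv:    (1, H, W, 2) SMPL-X UV coordinates rasterized into the frame, in [-1, 1].
    """
    features = F.grid_sample(texture, uv, align_corners=False)  # (1, 16, H, W)
    rgb = renderer(features)
    loss = F.l1_loss(rgb, frame)
    optimizer.zero_grad()
    loss.backward()  # gradients flow into both the renderer and the texture itself
    optimizer.step()
    return loss.item()
```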
 
 
 
Abstract
 
 
We propose a new type of full-body human avatar, which combines a parametric mesh-based body model with a neural texture. We show that with the help of neural textures, such avatars can successfully model clothing and hair, which usually pose a problem for mesh-based approaches. We also show how these avatars can be created from multiple frames of a video using backpropagation. We then propose a generative model for such avatars that can be trained on datasets of images and videos of people. The generative model allows us to sample random avatars as well as to create dressed avatars of people from one or a few images.
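As a rough illustration of the one-shot/few-shot mode mentioned above: one can freeze the pretrained generator and optimize only the latent code against the available photos. The `generator` and `render` interfaces and the 512-D latent below are assumptions for the sketch, not the released API:

```python
import torch
import torch.nn.functional as F

def fit_latent(generator, render, images, uvs, steps=500, lr=0.05):
    """Fit an avatar to one or a few photos by optimizing a latent code.

    generator(w)        -> neural texture produced from latent code w (assumed interface).
    render(texture, uv) -> RGB image of the texture superimposed on a posed
                           SMPL-X mesh (assumed interface).
    images, uvs: matching lists of photos and their SMPL-X UV rasterizations.
    """
    w = torch.zeros(1, 512, requires_grad=True)  # 512-D latent is an assumption
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        texture = generator(w)
        loss = sum(F.l1_loss(render(texture, uv), img)
                   for img, uv in zip(images, uvs))
        opt.zero_grad()
        loss.backward()  # generator stays frozen; only w is updated
        opt.step()
    return w.detach()
```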
 
 
 
 
 
 
Figure 2: Our generative architecture is based on the combination of StyleGANv2 and neural dressing. The StyleGAN part is used to generate neural textures, which are then superimposed on SMPL-X meshes and rendered with a neural renderer. During adversarial learning, the discriminator considers a pair of images of the same person.
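A hedged sketch of one training iteration of the pipeline in Figure 2. Channel-concatenating the two renders for the pair discriminator and the non-saturating GAN loss are assumptions for illustration; `generator`, `render`, and `disc` are placeholder interfaces:

```python
import torch
import torch.nn.functional as F

def training_step(generator, render, disc, g_opt, d_opt, real_pair, uv_pair):
    """One adversarial iteration following the Figure 2 layout.

    real_pair: (1, 6, H, W) two real images of the same person, stacked on
               channels (this pairing scheme is an assumption).
    uv_pair:   two SMPL-X UV rasterizations, e.g. two poses or viewpoints.
    """
    z = torch.randn(1, 512)                      # latent for a random avatar
    texture = generator(z)                       # StyleGAN2-style texture generator
    fake_pair = torch.cat([render(texture, uv) for uv in uv_pair], dim=1)

    # Discriminator update: tell real pairs from pairs rendered with one texture.
    d_loss = (F.softplus(disc(fake_pair.detach()))
              + F.softplus(-disc(real_pair))).mean()
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: make both renders of the same texture look real together.
    g_loss = F.softplus(-disc(fake_pair)).mean()
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```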
 
 
Data
 
 
For this project, we introduce two novel datasets: AzurePeople and TEDXPeople. Links to the pages of both datasets can be found below:
 
Video