We propose a neural point-based graphics system for creating a relightable 3D portrait of a human head. The method requires a video of a person captured with a smartphone camera whose flash blinks at a fixed frequency. Camera poses for the frames extracted from the video are estimated with Structure-from-Motion (SfM) software, and a dense point cloud is reconstructed afterwards. The point cloud is then filtered by a deep segmentation network. After training, our neural pipeline receives the point cloud rasterized onto a novel camera view and predicts several light-invariant feature maps, which, given novel lighting conditions, are fused into the relit image. The new lighting can be either simple (e.g., a directional or point light) or complex (e.g., an environment map).
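As a concrete illustration, below is a minimal sketch of how such a fusion step could look for a single directional light, assuming Lambertian shading over the predicted maps. All function and tensor names here are hypothetical and do not reflect our actual implementation.

```python
import torch

def fuse_relit_image(albedo, normals, shadow, light_dir, light_color):
    """Fuse light-invariant maps into a relit image under one directional light.

    albedo:      (B, 3, H, W) per-pixel reflectance predicted by the network
    normals:     (B, 3, H, W) per-pixel unit normals predicted by the network
    shadow:      (B, 1, H, W) shadow map predicted by the network
    light_dir:   (B, 3) unit direction toward the light (novel lighting)
    light_color: (B, 3) RGB intensity of the light
    """
    # Lambertian term: cosine between normal and light direction, clamped at 0.
    n_dot_l = (normals * light_dir[:, :, None, None]).sum(dim=1, keepdim=True)
    shading = n_dot_l.clamp(min=0.0)
    # Modulate albedo by the shading, the light color, and the shadow map.
    return albedo * shading * light_color[:, :, None, None] * shadow
```

A complex light source such as an environment map can be handled analogously by accumulating such shading terms over a set of sampled light directions.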
Given the two types of input frames (flash and no-flash), the neural pipeline is fitted to both sequences by decomposing each image into albedo, normals, and room shadows. The plausibility of the decomposition is enforced by specific facial priors (see the video for more detail).
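To make the flash/no-flash fitting concrete, here is a hypothetical sketch of a photometric loss that ties one decomposition to both frames of a pair, assuming the flash acts as a directional light near the camera and the room light is ambient. The actual losses and facial priors used in the paper differ; all names here are illustrative.

```python
import torch
import torch.nn.functional as F

def decomposition_loss(albedo, normals, shadow, flash_img, noflash_img,
                       flash_dir, ambient):
    """Photometric loss for fitting one decomposition to a flash/no-flash pair.

    flash_dir: (B, 3) unit direction toward the smartphone flash
    ambient:   (B, 3) ambient room-light color shared by both frames
    Image tensors are (B, 3, H, W); shadow is (B, 1, H, W).
    """
    n_dot_l = (normals * flash_dir[:, :, None, None]).sum(1, keepdim=True).clamp(min=0.0)
    # No-flash frame: albedo lit by ambient room light, attenuated by room shadows.
    pred_noflash = albedo * ambient[:, :, None, None] * shadow
    # Flash frame: the no-flash appearance plus the flash's Lambertian contribution.
    pred_flash = pred_noflash + albedo * n_dot_l
    return F.l1_loss(pred_flash, flash_img) + F.l1_loss(pred_noflash, noflash_img)
```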
In the paper, we showcase several head portraits created from smartphone videos, as well as quantitative and qualitative comparisons on synthetic people. The performance of the method is evaluated under varying lighting conditions and at extrapolated viewpoints, and compared with existing relighting methods.
@article{sevastopolsky2020relightable,
  title={Relightable 3D Head Portraits from a Smartphone Video},
  author={Sevastopolsky, Artem and Ignatiev, Savva and Ferrer, Gonzalo and Burnaev, Evgeny and Lempitsky, Victor},
  year={2020},
  eprint={2012.09963},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}