We propose a neural point-based graphics system for creating a relightable 3D portrait of a human head. The method requires a video of a person captured with a smartphone camera whose flash blinks at a fixed frequency. Camera poses for the frames extracted from the video are estimated with Structure-from-Motion (SfM) software, and a dense point cloud is reconstructed afterwards. The point cloud is then filtered by a deep segmentation network. After training, our neural pipeline receives the point cloud rasterized onto a novel camera view and predicts several light-invariant feature maps, which, given novel lighting conditions, are fused into the relit image. The new lighting can be either simple (e.g., a directional or point light) or complex (e.g., an environment map).
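As a concrete illustration, below is a minimal sketch of how such a fusion step could look for a single directional light, assuming Lambertian shading over the predicted maps. All function and tensor names here are hypothetical and do not reflect our actual implementation.

```python
import torch

def fuse_relit_image(albedo, normals, shadow, light_dir, light_color):
    """Fuse light-invariant maps into a relit image under one directional light.

    albedo:      (B, 3, H, W) per-pixel reflectance predicted by the network
    normals:     (B, 3, H, W) per-pixel unit normals predicted by the network
    shadow:      (B, 1, H, W) shadow map predicted by the network
    light_dir:   (B, 3) unit direction toward the light (novel lighting)
    light_color: (B, 3) RGB intensity of the light
    """
    # Lambertian term: cosine between normal and light direction, clamped at 0.
    n_dot_l = (normals * light_dir[:, :, None, None]).sum(dim=1, keepdim=True)
    shading = n_dot_l.clamp(min=0.0)
    # Modulate albedo by the shading, the light color, and the shadow map.
    return albedo * shading * light_color[:, :, None, None] * shadow
```

A complex light source such as an environment map can be handled analogously by accumulating such shading terms over a set of sampled light directions.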
Given the two types of input frames (flash and no-flash), the neural pipeline is fitted to both sequences by decomposing each image into albedo, normals, and room shadows. The plausibility of the decomposition is enforced by specific facial priors (see the video for more detail).
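To make the flash/no-flash fitting concrete, here is a hypothetical sketch of a photometric loss that ties one decomposition to both frames of a pair, assuming the flash acts as a directional light near the camera and the room light is ambient. The actual losses and facial priors used in the paper differ; all names here are illustrative.

```python
import torch
import torch.nn.functional as F

def decomposition_loss(albedo, normals, shadow, flash_img, noflash_img,
                       flash_dir, ambient):
    """Photometric loss for fitting one decomposition to a flash/no-flash pair.

    flash_dir: (B, 3) unit direction toward the smartphone flash
    ambient:   (B, 3) ambient room-light color shared by both frames
    Image tensors are (B, 3, H, W); shadow is (B, 1, H, W).
    """
    n_dot_l = (normals * flash_dir[:, :, None, None]).sum(1, keepdim=True).clamp(min=0.0)
    # No-flash frame: albedo lit by ambient room light, attenuated by room shadows.
    pred_noflash = albedo * ambient[:, :, None, None] * shadow
    # Flash frame: the no-flash appearance plus the flash's Lambertian contribution.
    pred_flash = pred_noflash + albedo * n_dot_l
    return F.l1_loss(pred_flash, flash_img) + F.l1_loss(pred_noflash, noflash_img)
```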
In the paper, we showcase several head portraits created from smartphone videos, as well as quantitative and qualitative comparisons on synthetic people. The performance of the method is evaluated under varying lighting conditions and at extrapolated viewpoints, and compared with existing relighting methods.
@article{sevastopolsky2020relightable,
  title={Relightable 3D Head Portraits from a Smartphone Video},
  author={Sevastopolsky, Artem and Ignatiev, Savva and Ferrer, Gonzalo and Burnaev, Evgeny and Lempitsky, Victor},
  year={2020},
  eprint={2012.09963},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}