MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using
Differentiable Shading
- URL: http://arxiv.org/abs/2312.13091v2
- Date: Fri, 22 Dec 2023 02:06:32 GMT
- Title: MoSAR: Monocular Semi-Supervised Model for Avatar Reconstruction using
Differentiable Shading
- Authors: Abdallah Dib, Luiz Gustavo Hafemann, Emeline Got, Trevor Anderson,
Amin Fadaeinejad, Rafael M. O. Cruz, Marc-Andre Carbonneau
- Abstract summary: MoSAR is a method for 3D avatar generation from monocular images.
We propose a semi-supervised training scheme that improves generalization by learning from both light stage and in-the-wild datasets.
We also introduce a new dataset, named FFHQ-UV-Intrinsics, the first public dataset providing intrinsic face attributes at scale.
- Score: 3.2586340344073927
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reconstructing an avatar from a portrait image has many applications in
multimedia, but remains a challenging research problem. Extracting reflectance
maps and geometry from one image is ill-posed: recovering geometry is a
one-to-many mapping problem, and reflectance and light are difficult to
disentangle. Accurate geometry and reflectance can be captured under the
controlled conditions of a light stage, but it is costly to acquire large
datasets in this fashion. Moreover, training solely with this type of data
leads to poor generalization with in-the-wild images. This motivates the
introduction of MoSAR, a method for 3D avatar generation from monocular images.
We propose a semi-supervised training scheme that improves generalization by
learning from both light stage and in-the-wild datasets. This is achieved using
a novel differentiable shading formulation. We show that our approach
effectively disentangles the intrinsic face parameters, producing relightable
avatars. As a result, MoSAR estimates a richer set of skin reflectance maps,
and generates more realistic avatars than existing state-of-the-art methods. We
also introduce a new dataset, named FFHQ-UV-Intrinsics, the first public
dataset providing intrinsic face attributes at scale (diffuse, specular,
ambient occlusion and translucency maps) for a total of 10k subjects. The
project website and the dataset are available at the following link:
https://ubisoft-laforge.github.io/character/mosar/
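To make the core idea concrete: the differentiable shading formulation means the renderer is built from standard differentiable tensor operations, so a photometric loss against an unlabeled in-the-wild photo can backpropagate into the predicted intrinsic maps, while light-stage samples supervise those maps directly. The sketch below is a minimal illustration of this idea, assuming a simple Lambertian-plus-Blinn-Phong model with one directional light; it is not MoSAR's actual formulation (which also models translucency and ambient lighting), and all names, shapes, and parameters are illustrative.
```python
# Minimal differentiable shading sketch (PyTorch). Hypothetical names and
# shapes for illustration only; not the paper's actual formulation.
import torch
import torch.nn.functional as F

def shade_intrinsics(diffuse, specular, ao, normals,
                     light_dir, light_rgb, view_dir, shininess=32.0):
    """Render an image from intrinsic maps with differentiable ops.

    diffuse, specular:   (B, 3, H, W) reflectance maps in [0, 1]
    ao:                  (B, 1, H, W) ambient occlusion in [0, 1]
    normals:             (B, 3, H, W) surface normals
    light_dir, view_dir: (3,) direction vectors; light_rgb: (3,) light color
    """
    l = F.normalize(light_dir, dim=0).view(1, 3, 1, 1)
    v = F.normalize(view_dir, dim=0).view(1, 3, 1, 1)
    n = F.normalize(normals, dim=1)

    # Lambertian diffuse term: albedo * max(n . l, 0)
    n_dot_l = (n * l).sum(dim=1, keepdim=True).clamp(min=0.0)
    diffuse_term = diffuse * n_dot_l

    # Blinn-Phong specular term with half-vector h = normalize(l + v)
    h = F.normalize(l + v, dim=1)
    n_dot_h = (n * h).sum(dim=1, keepdim=True).clamp(min=0.0)
    specular_term = specular * n_dot_h.pow(shininess)

    # Ambient occlusion attenuates the result; every op above is
    # differentiable, so gradients flow back into all intrinsic maps.
    return ao * (diffuse_term + specular_term) * light_rgb.view(1, 3, 1, 1)

# Usage: an L1 photometric loss against an (unlabeled) photo backpropagates
# into the predicted intrinsic maps.
B, H, W = 1, 256, 256
diffuse = torch.rand(B, 3, H, W, requires_grad=True)
specular = torch.rand(B, 3, H, W, requires_grad=True)
ao = torch.rand(B, 1, H, W, requires_grad=True)
normals = F.normalize(torch.randn(B, 3, H, W), dim=1)
rendered = shade_intrinsics(diffuse, specular, ao, normals,
                            torch.tensor([0.0, 0.3, 1.0]), torch.ones(3),
                            torch.tensor([0.0, 0.0, 1.0]))
photo = torch.rand(B, 3, H, W)          # stand-in for a real portrait
loss = (rendered - photo).abs().mean()  # photometric loss
loss.backward()                         # gradients reach diffuse/specular/ao
```
This is the mechanism that lets a single training scheme mix both data sources: light-stage samples provide ground-truth maps, and in-the-wild images contribute only through the rendered-image loss.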
Related papers
- HAVE-FUN: Human Avatar Reconstruction from Few-Shot Unconstrained Images [33.298962236215964]
We study the reconstruction of human avatars from a few-shot unconstrained photo album.
For handling dynamic data, we integrate a skinning mechanism with deep marching tetrahedra.
Our framework, called HaveFun, can undertake avatar reconstruction, rendering, and animation.
arXiv Detail & Related papers (2023-11-27T10:01:31Z)
- Relightable and Animatable Neural Avatar from Sparse-View Video [66.77811288144156]
This paper tackles the challenge of creating relightable and animatable neural avatars from sparse-view (or even monocular) videos of dynamic humans under unknown illumination.
arXiv Detail & Related papers (2023-08-15T17:42:39Z)
- DiFaReli: Diffusion Face Relighting [13.000032155650835]
We present a novel approach to single-view face relighting in the wild.
Handling non-diffuse effects, such as global illumination or cast shadows, has long been a challenge in face relighting.
We achieve state-of-the-art performance on the standard Multi-PIE benchmark and can photorealistically relight in-the-wild images.
arXiv Detail & Related papers (2023-04-19T08:03:20Z)
- HQ3DAvatar: High Quality Controllable 3D Head Avatar [65.70885416855782]
This paper presents a novel approach to building highly photorealistic digital head avatars.
Our method learns a canonical space via an implicit function parameterized by a neural network.
At test time, our method is driven by a monocular RGB video.
arXiv Detail & Related papers (2023-03-25T13:56:33Z)
- Learning a 3D Morphable Face Reflectance Model from Low-cost Data [21.37535100469443]
Existing works build parametric models for diffuse and specular albedo using Light Stage data.
This paper proposes the first 3D morphable face reflectance model with spatially varying BRDF using only low-cost, publicly available data.
arXiv Detail & Related papers (2023-03-21T09:08:30Z)
- RANA: Relightable Articulated Neural Avatars [83.60081895984634]
We propose RANA, a relightable and articulated neural avatar for the photorealistic synthesis of humans.
We present a novel framework to model humans while disentangling their geometry, texture, and also lighting environment from monocular RGB videos.
arXiv Detail & Related papers (2022-12-06T18:59:31Z)
- One-shot Implicit Animatable Avatars with Model-based Priors [31.385051428938585]
ELICIT is a novel method for learning human-specific neural radiance fields from a single image.
ELICIT outperforms strong baseline methods for avatar creation when only a single image is available.
arXiv Detail & Related papers (2022-12-05T18:24:06Z)
- Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion [54.151979979158085]
We introduce a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available.
We leverage an unconditional 3D-aware generator and apply a hybrid inversion scheme in which a model produces a first guess of the solution.
Our framework can de-render an image in as few as 10 steps, enabling its use in practical scenarios.
arXiv Detail & Related papers (2022-11-21T17:42:42Z)
- DRaCoN -- Differentiable Rasterization Conditioned Neural Radiance Fields for Articulated Avatars [92.37436369781692]
We present DRaCoN, a framework for learning full-body volumetric avatars.
It exploits the advantages of both the 2D and 3D neural rendering techniques.
Experiments on the challenging ZJU-MoCap and Human3.6M datasets indicate that DRaCoN outperforms state-of-the-art methods.
arXiv Detail & Related papers (2022-03-29T17:59:15Z)
- ERF: Explicit Radiance Field Reconstruction From Scratch [12.254150867994163]
We propose a novel explicit dense 3D reconstruction approach that processes a set of images of a scene with sensor poses and calibrations and estimates a photo-real digital model.
One of the key innovations is that the underlying volumetric representation is completely explicit.
We show that our method is general and practical: it does not require a highly controlled lab setup for capture and can reconstruct scenes containing a wide variety of objects.
arXiv Detail & Related papers (2022-02-28T19:37:12Z)
- I M Avatar: Implicit Morphable Head Avatars from Videos [68.13409777995392]
We propose IMavatar, a novel method for learning implicit head avatars from monocular videos.
Inspired by the fine-grained control mechanisms afforded by conventional 3DMMs, we represent the expression- and pose-related deformations via learned blendshapes and skinning fields.
We show quantitatively and qualitatively that our method improves geometry and covers a more complete expression space compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-12-14T15:30:32Z)