Video-driven Neural Physically-based Facial Asset for Production
- URL: http://arxiv.org/abs/2202.05592v2
- Date: Mon, 14 Feb 2022 07:01:55 GMT
- Title: Video-driven Neural Physically-based Facial Asset for Production
- Authors: Longwen Zhang, Chuxiao Zeng, Qixuan Zhang, Hongyang Lin, Ruixiang Cao,
Wei Yang, Lan Xu, and Jingyi Yu
- Abstract summary: We present a new learning-based, video-driven approach for generating dynamic facial geometries with high-quality physically-based assets.
Our technique provides higher accuracy and visual fidelity than previous video-driven facial reconstruction and animation methods.
- Score: 33.24654834163312
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Production-level workflows for producing convincing 3D dynamic human faces
have long relied on a disarray of labor-intensive tools for geometry and
texture generation, motion capture and rigging, and expression synthesis.
Recent neural approaches automate individual components, but the corresponding
latent representations cannot provide artists with explicit controls as in
conventional tools. In this paper, we present a new learning-based,
video-driven approach for generating dynamic facial geometries with
high-quality physically-based assets. Two key components are well-structured
latent spaces learned from dense temporal sampling of videos, and explicit facial
expression controls that regulate the latent spaces. For data collection, we
construct a hybrid multiview-photometric capture stage, coupled with an
ultra-fast video camera, to obtain raw 3D facial assets. We then model the
facial expression, geometry and physically-based textures using separate VAEs
with a global MLP-based expression mapping across the latent spaces, to
preserve characteristics across respective attributes while maintaining
explicit controls over geometry and texture. We further model the delta
information as wrinkle maps for the physically-based textures, achieving
high-quality rendering of dynamic textures. We demonstrate our approach in
high-fidelity performer-specific facial capture and cross-identity facial
motion retargeting. In addition, our neural asset along with fast adaptation
schemes can also be deployed to handle in-the-wild videos. We further demonstrate
the utility of our explicit facial disentanglement strategy through promising
physically-based editing results, such as geometry and material editing and wrinkle
transfer, with high realism. Comprehensive experiments show that our technique
provides higher accuracy and visual fidelity than previous video-driven facial
reconstruction and animation methods.
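The architecture described in the abstract (separate VAEs per facial attribute, a global MLP-based expression mapping across latent spaces, and wrinkle maps as texture deltas) can be summarized in code. The following is a minimal PyTorch-style sketch, not the authors' implementation; all module names, layer sizes, and the linear wrinkle-blending scheme are illustrative assumptions.

    import torch
    import torch.nn as nn

    class AttributeVAE(nn.Module):
        # One VAE per facial attribute (expression, geometry, or texture),
        # trained separately so each keeps its own well-structured latent space.
        def __init__(self, in_dim, latent_dim, hidden=512):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(in_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * latent_dim))  # outputs mean and log-variance
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, in_dim))

        def encode(self, x):
            mu, logvar = self.encoder(x).chunk(2, dim=-1)
            z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterize
            return z, mu, logvar

        def forward(self, x):
            z, mu, logvar = self.encode(x)
            return self.decoder(z), mu, logvar

    class ExpressionMapping(nn.Module):
        # Global MLP that maps the expression latent into the geometry and
        # texture latent spaces, so explicit expression controls regulate both.
        def __init__(self, expr_dim, geom_dim, tex_dim, hidden=256):
            super().__init__()
            self.geom_dim = geom_dim
            self.net = nn.Sequential(
                nn.Linear(expr_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, geom_dim + tex_dim))

        def forward(self, z_expr):
            out = self.net(z_expr)
            return out[..., :self.geom_dim], out[..., self.geom_dim:]

    def apply_wrinkle_maps(base_tex, wrinkle_deltas, weights):
        # Dynamic texture = static base map + weighted delta (wrinkle) maps.
        # base_tex: (C, H, W); wrinkle_deltas: (K, C, H, W); weights: (K,).
        # A linear blend is one plausible reading of the paper's
        # delta-information modeling, not its confirmed formulation.
        return base_tex + torch.einsum("k,kchw->chw", weights, wrinkle_deltas)

At inference time, a tracked expression from video would be encoded into the expression latent, mapped through ExpressionMapping into the geometry and texture latent spaces, and decoded by the respective VAE decoders, keeping each attribute independently editable.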
Related papers
- DreamPolish: Domain Score Distillation With Progressive Geometry Generation [66.94803919328815]
We introduce DreamPolish, a text-to-3D generation model that excels in producing refined geometry and high-quality textures.
In the geometry construction phase, our approach leverages multiple neural representations to enhance the stability of the synthesis process.
In the texture generation phase, we introduce a novel score distillation objective, namely domain score distillation (DSD), to guide neural representations toward a photorealistic texture domain.
arXiv Detail & Related papers (2024-11-03T15:15:01Z)
- G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA), which addresses the lack of 3D geometric information in 2D-based face animation models.
Our novel approach empowers the face animation model to incorporate 3D information using only 2D images.
In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z)
- VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation [79.99551055245071]
We propose VividPose, an end-to-end pipeline that ensures superior temporal stability.
An identity-aware appearance controller integrates additional facial information without compromising other appearance details.
A geometry-aware pose controller utilizes both dense rendering maps from SMPL-X and sparse skeleton maps.
VividPose exhibits superior generalization capabilities on our proposed in-the-wild dataset.
arXiv Detail & Related papers (2024-05-28T13:18:32Z)
- FaceFolds: Meshed Radiance Manifolds for Efficient Volumetric Rendering of Dynamic Faces [21.946327323788275]
3D rendering of dynamic faces is a challenging problem.
We present a novel representation that enables high-quality rendering of an actor's dynamic facial performances.
arXiv Detail & Related papers (2024-04-22T00:44:13Z)
- StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3 [43.43545400625567]
We propose a principled framework named StyleFaceV, which produces high-fidelity identity-preserving face videos with vivid movements.
Our core insight is to decompose appearance and pose information and recompose them in the latent space of StyleGAN3 to produce stable and dynamic results.
arXiv Detail & Related papers (2022-08-16T17:47:03Z)
- Neural Actor: Neural Free-view Synthesis of Human Actors with Pose Control [80.79820002330457]
We propose a new method for high-quality synthesis of humans from arbitrary viewpoints and under arbitrary controllable poses.
Our method achieves better quality than state-of-the-art methods on playback as well as novel pose synthesis, and can even generalize well to new poses that starkly differ from the training poses.
arXiv Detail & Related papers (2021-06-03T17:40:48Z)
- Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images.
Our model is versatile for various AR/VR and entertainment applications, such as face video generation and face video prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z)
- Fast-GANFIT: Generative Adversarial Network for High Fidelity 3D Face Reconstruction [76.1612334630256]
We harness the power of Generative Adversarial Networks (GANs) and Deep Convolutional Neural Networks (DCNNs) to reconstruct the facial texture and shape from single images.
We demonstrate excellent results in photorealistic and identity-preserving 3D face reconstructions and achieve, for the first time, facial texture reconstruction with high-frequency details.
arXiv Detail & Related papers (2021-05-16T16:35:44Z)
- Real-time Deep Dynamic Characters [95.5592405831368]
We propose a deep video-realistic 3D human character model displaying highly realistic shape, motion, and dynamic appearance.
We use a novel graph convolutional network architecture to enable motion-dependent deformation learning of body and clothing.
We show that our model creates motion-dependent surface deformations, physically plausible dynamic clothing deformations, and video-realistic surface textures at a much higher level of detail than previous state-of-the-art approaches.
arXiv Detail & Related papers (2021-05-04T23:28:55Z)
- Dynamic Facial Asset and Rig Generation from a Single Scan [17.202189917030033]
We propose a framework for the automatic generation of high-quality dynamic facial assets.
Our framework takes a single scan as input to generate a set of personalized blendshapes, dynamic and physically-based textures, as well as secondary facial components.
arXiv Detail & Related papers (2020-10-01T17:25:25Z)