FED-NeRF: Achieve High 3D Consistency and Temporal Coherence for Face
Video Editing on Dynamic NeRF
- URL: http://arxiv.org/abs/2401.02616v1
- Date: Fri, 5 Jan 2024 03:23:38 GMT
- Authors: Hao Zhang, Yu-Wing Tai, Chi-Keung Tang
- Abstract summary: This paper proposes a novel face video editing architecture built upon the dynamic face GAN-NeRF structure.
Editing the latent code ensures multi-view consistent editing of the face, as validated by multi-view stereo reconstruction.
We propose a stabilizer that maintains temporal coherence by preserving smooth changes of face expressions in consecutive frames.
- Score: 77.94545888842883
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The success of the GAN-NeRF structure has enabled face editing on NeRF to
maintain 3D view consistency. However, simultaneously achieving multi-view
consistency and temporal coherence while editing video sequences remains a
formidable challenge. This paper proposes a novel face video editing
architecture built upon the dynamic face GAN-NeRF structure, which effectively
utilizes video sequences to recover the latent code and 3D face geometry. By
editing the latent code, multi-view consistent editing of the face can be
ensured, as validated by multi-view stereo reconstruction on the edited images
rendered from our dynamic NeRF. Because face geometry is estimated on a
frame-by-frame basis, jitter can arise between consecutive frames. We propose a
stabilizer that maintains temporal coherence by preserving smooth changes of
face expressions across consecutive frames. Quantitative and qualitative analyses
reveal that our method, as the pioneering 4D face video editor, achieves
state-of-the-art performance in comparison to existing 2D- and 3D-based
approaches that address identity and motion independently. Code will be
released.
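
The abstract does not detail how the latent edit or the stabilizer is implemented. Below is a minimal Python sketch of one plausible reading: a linear latent edit of the kind common in GAN editing, plus a stabilizer realized as temporal low-pass filtering of per-frame expression coefficients. All names (edit_latent, smooth_expressions, alpha, sigma) and the Gaussian-filter choice are illustrative assumptions, not the paper's API.

```python
# Hedged sketch, NOT the paper's implementation. Assumes per-frame face
# expression coefficients as an (n_frames, n_dims) array and a semantic
# latent direction of the kind used in GAN latent editing.
import numpy as np
from scipy.ndimage import gaussian_filter1d

def edit_latent(w: np.ndarray, direction: np.ndarray, alpha: float) -> np.ndarray:
    """Move the shared latent code along a semantic direction; since every
    frame and view is rendered from the same edited code, the edit itself
    is multi-view consistent by construction."""
    return w + alpha * direction

def smooth_expressions(expr: np.ndarray, sigma: float = 1.5) -> np.ndarray:
    """Low-pass filter each expression dimension over time (axis 0),
    suppressing frame-to-frame jitter from per-frame geometry estimation
    while preserving slow, genuine changes in expression."""
    return gaussian_filter1d(expr, sigma=sigma, axis=0, mode="nearest")

# Toy usage: 100 frames of 64-dim expression codes with simulated jitter.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 100)[:, None]
clean = np.sin(t + 0.1 * np.arange(64))            # smooth "true" motion
jittery = clean + 0.05 * rng.standard_normal(clean.shape)
stable = smooth_expressions(jittery)
print(f"mean frame delta: raw={np.abs(np.diff(jittery, axis=0)).mean():.4f}, "
      f"stabilized={np.abs(np.diff(stable, axis=0)).mean():.4f}")
```

A Gaussian filter is only one choice here; an exponential moving average or a Savitzky-Golay filter would fill the same low-pass role with different latency and smoothness trade-offs.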
Related papers
- SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing [58.22339174221563]
We propose SyncNoise, a novel geometry-guided multi-view consistent noise editing approach for high-fidelity 3D scene editing.
SyncNoise synchronously edits multiple views with 2D diffusion models while enforcing multi-view noise predictions to be geometrically consistent.
Our method achieves high-quality 3D editing results that respect the textual instructions, especially in scenes with complex textures. (A toy sketch of the cross-view noise-consistency idea appears after this list.)
arXiv Detail & Related papers (2024-06-25T09:17:35Z)
- FaceDNeRF: Semantics-Driven Face Reconstruction, Prompt Editing and Relighting with Diffusion Models [67.17713009917095]
We propose Face Diffusion NeRF (FaceDNeRF), a new generative method to reconstruct high-quality Face NeRFs from single images.
With carefully designed illumination and identity preserving loss, FaceDNeRF offers users unparalleled control over the editing process.
arXiv Detail & Related papers (2023-06-01T15:14:39Z)
- IDE-3D: Interactive Disentangled Editing for High-Resolution 3D-aware Portrait Synthesis [38.517819699560945]
Our system consists of three major components: (1) a 3D-semantics-aware generative model that produces view-consistent, disentangled face images and semantic masks; (2) a hybrid GAN inversion approach that initializes the latent codes from the semantic and texture encoder, then further optimizes them for faithful reconstruction; and (3) a canonical editor that enables efficient manipulation of semantic masks in the canonical view and produces high-quality editing results.
arXiv Detail & Related papers (2022-05-31T03:35:44Z)
- UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing [78.26925404508994]
We propose a unified temporally consistent facial video editing framework termed UniFaceGAN.
Our framework is designed to handle face swapping and face reenactment simultaneously.
Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.
arXiv Detail & Related papers (2021-08-12T10:35:22Z)
- Online Adaptation for Consistent Mesh Reconstruction in the Wild [147.22708151409765]
We pose video-based reconstruction as a self-supervised online adaptation problem applied to any incoming test video.
We demonstrate that our algorithm recovers temporally consistent and reliable 3D structures from videos of non-rigid objects including those of animals captured in the wild.
arXiv Detail & Related papers (2020-12-06T07:22:27Z)
- Task-agnostic Temporally Consistent Facial Video Editing [84.62351915301795]
We propose a task-agnostic, temporally consistent facial video editing framework.
Based on a 3D reconstruction model, our framework is designed to handle several editing tasks in a more unified and disentangled manner.
Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.
arXiv Detail & Related papers (2020-07-03T02:49:20Z)
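
The SyncNoise entry above mentions enforcing geometrically consistent multi-view noise; the sketch below shows one toy interpretation, in which noise sampled in a reference view is warped into a second view by depth-based reprojection, so corresponding surface points receive the same noise values. Everything here (warp_noise, the pinhole model, the omitted occlusion handling) is an illustrative assumption; SyncNoise's actual mechanism may differ.

```python
# Toy depth-based warp of a reference-view noise map into a target view.
# Illustrative only; occlusion handling and sub-pixel resampling omitted.
import numpy as np

def warp_noise(noise, depth, K, T_ref_to_tgt):
    """noise, depth: (h, w) reference-view noise and depth maps;
    K: (3, 3) shared pinhole intrinsics;
    T_ref_to_tgt: (4, 4) relative camera pose.
    Returns an (h, w) target-view noise map (NaN where unobserved)."""
    h, w = noise.shape
    v, u = np.mgrid[0:h, 0:w]
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    # Unproject reference pixels to 3D points in the reference camera frame.
    pts = (np.linalg.inv(K) @ pix.T).T * depth.reshape(-1, 1)
    pts_h = np.concatenate([pts, np.ones((pts.shape[0], 1))], axis=1)
    # Transform into the target camera frame and project back to pixels.
    tgt = (T_ref_to_tgt @ pts_h.T).T[:, :3]
    z = tgt[:, 2:3]
    uv = np.round((K @ tgt.T).T[:, :2] / np.maximum(z, 1e-9)).astype(int)
    warped = np.full((h, w), np.nan)
    ok = (z[:, 0] > 0) & (uv[:, 0] >= 0) & (uv[:, 0] < w) \
         & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    warped[uv[ok, 1], uv[ok, 0]] = noise.reshape(-1)[ok]
    return warped
```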
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.