Designing a 3D-Aware StyleNeRF Encoder for Face Editing
- URL: http://arxiv.org/abs/2302.09467v1
- Date: Sun, 19 Feb 2023 03:32:28 GMT
- Title: Designing a 3D-Aware StyleNeRF Encoder for Face Editing
- Authors: Songlin Yang, Wei Wang, Bo Peng, Jing Dong
- Abstract summary: We propose a 3D-aware encoder for GAN inversion and face editing based on the powerful StyleNeRF model.
Our proposed 3Da encoder combines a parametric 3D face model with a learnable detail representation model to generate geometry, texture, and view-direction codes.
- Score: 15.303426697795143
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: GAN inversion has been exploited in many face manipulation tasks, but 2D GANs often fail to generate multi-view 3D-consistent images, and the encoders designed for 2D GANs cannot provide sufficient 3D information for inversion and editing. 3D-aware GAN inversion has therefore been proposed to increase the 3D editing capability of GANs, but it remains under-explored. To tackle this problem, we propose a 3D-aware (3Da) encoder for GAN inversion and face editing based on the powerful StyleNeRF model. Our 3Da encoder combines a parametric 3D face model with a learnable detail representation model to generate geometry, texture, and view-direction codes. For more flexible face manipulation, we then design a dual-branch StyleFlow module to transfer the StyleNeRF codes with disentangled geometry and texture flows. Extensive experiments demonstrate that we realize 3D-consistent face manipulation in both facial attribute editing and texture transfer. Furthermore, for video editing, we make the sequence of frame codes share a common canonical manifold, which improves the temporal consistency of the edited attributes.
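To make the described pipeline concrete, here is a minimal PyTorch sketch of a two-branch 3D-aware encoder. It is illustrative only: the module names, coefficient sizes (80 shape, 64 texture, 3 view), and the fusion scheme are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ThreeDaEncoder(nn.Module):
    """Hypothetical sketch: a parametric 3D face branch plus a learnable
    detail branch jointly produce geometry, texture, and view-direction
    codes for a StyleNeRF-like generator."""
    def __init__(self, feat_dim=512, code_dim=512):
        super().__init__()
        self.backbone = nn.Sequential(          # stand-in image feature extractor
            nn.Conv2d(3, 64, 7, stride=2, padding=3), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim))
        # Parametric branch: predicts 3DMM-style coefficients (sizes assumed).
        self.param_head = nn.Linear(feat_dim, 80 + 64 + 3)
        # Learnable detail branch: residual detail representation.
        self.detail_head = nn.Linear(feat_dim, code_dim)
        self.to_geo = nn.Linear(80 + code_dim, code_dim)
        self.to_tex = nn.Linear(64 + code_dim, code_dim)

    def forward(self, img):
        f = self.backbone(img)
        params = self.param_head(f)
        shape, tex, view = params[:, :80], params[:, 80:144], params[:, 144:]
        detail = self.detail_head(f)
        geo_code = self.to_geo(torch.cat([shape, detail], dim=1))
        tex_code = self.to_tex(torch.cat([tex, detail], dim=1))
        return geo_code, tex_code, view

enc = ThreeDaEncoder()
geo, tex, view = enc(torch.randn(2, 3, 256, 256))
```

In the paper's pipeline, the dual-branch StyleFlow module would then transform these geometry and texture codes separately to perform a disentangled edit before decoding with StyleNeRF.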
Related papers
- DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation [57.406031264184584]
DragGaussian is a 3D object drag-editing framework based on 3D Gaussian Splatting.
Our contributions include the introduction of a new task, the development of DragGaussian for interactive point-based 3D editing, and comprehensive validation of its effectiveness through qualitative and quantitative experiments.
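To make "drag-style manipulation" on a Gaussian representation concrete, here is a toy sketch (my illustration, not DragGaussian's actual algorithm): Gaussian centers near a user-selected handle point are pulled toward a target point, with influence decaying with distance.

```python
import torch

def drag_gaussians(means, handle, target, radius=0.05):
    """Toy drag edit: translate Gaussian centers near a 'handle' point
    toward a 'target' point, with soft distance-based selection."""
    d = torch.norm(means - handle, dim=-1, keepdim=True)  # (N, 1) distances
    w = torch.exp(-(d / radius) ** 2)                     # soft selection weight
    return means + w * (target - handle)                  # move toward target

means = torch.rand(10_000, 3)  # Gaussian centers of a splat scene
edited = drag_gaussians(means, handle=torch.tensor([0.5, 0.5, 0.5]),
                        target=torch.tensor([0.6, 0.5, 0.5]))
```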
arXiv Detail & Related papers (2024-05-09T14:34:05Z)
- Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding [25.86312557482366]
3D GAN inversion aims to achieve high reconstruction fidelity and reasonable 3D geometry simultaneously from a single image input.
We introduce a novel encoder-based inversion framework based on EG3D, one of the most widely-used 3D GAN models.
Our method achieves impressive results comparable to optimization-based methods while operating up to 500 times faster.
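The speed gap between the two inversion families comes from their structure: an encoder inverts with one forward pass, while optimization-based inversion runs hundreds of gradient steps per image. A toy sketch with stand-in networks (the real EG3D generator and the paper's encoder are far larger):

```python
import torch
import torch.nn as nn

# Toy stand-ins for a generator G and an encoder E.
G = nn.Sequential(nn.Linear(512, 1024), nn.ReLU(), nn.Linear(1024, 3 * 64 * 64))
E = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 512))

img = torch.rand(1, 3, 64, 64)
target = img.flatten(1)

# Encoder-based inversion: a single forward pass.
w_enc = E(img)

# Optimization-based inversion: hundreds of gradient steps on w.
w_opt = torch.zeros(1, 512, requires_grad=True)
opt = torch.optim.Adam([w_opt], lr=0.05)
for _ in range(300):
    opt.zero_grad()
    loss = nn.functional.mse_loss(G(w_opt), target)
    loss.backward()
    opt.step()
```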
arXiv Detail & Related papers (2023-03-22T05:51:53Z)
- CC3D: Layout-Conditioned Generation of Compositional 3D Scenes [49.281006972028194]
We introduce CC3D, a conditional generative model that synthesizes complex 3D scenes conditioned on 2D semantic scene layouts.
Our evaluations on synthetic 3D-FRONT and real-world KITTI-360 datasets demonstrate that our model generates scenes of improved visual and geometric quality.
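As a rough illustration of layout conditioning (not CC3D's architecture), a 2D semantic layout map can be encoded and combined with a noise code to drive synthesis; all names and sizes below are assumptions.

```python
import torch
import torch.nn as nn

class LayoutConditionedGenerator(nn.Module):
    """Toy sketch: encode a one-hot semantic layout map and concatenate
    it with a latent code before decoding an image."""
    def __init__(self, n_classes=10, z_dim=128):
        super().__init__()
        self.layout_enc = nn.Sequential(
            nn.Conv2d(n_classes, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.decode = nn.Linear(32 + z_dim, 3 * 64 * 64)

    def forward(self, layout, z):
        h = torch.cat([self.layout_enc(layout), z], dim=1)
        return self.decode(h).view(-1, 3, 64, 64)

g = LayoutConditionedGenerator()
img = g(torch.rand(2, 10, 32, 32), torch.randn(2, 128))
```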
arXiv Detail & Related papers (2023-03-21T17:59:02Z)
- Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing.
A corresponding generic 3D GAN inversion framework is still missing, limiting the applications of 3D face reconstruction and semantic editing.
We study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.
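The self-supervised setup can be sketched as follows: a trainable encoder predicts a latent code from a single image, a frozen style-based 3D generator renders it back, and the input image itself supervises the reconstruction. This is a minimal sketch of the general recipe, not the paper's losses or architecture.

```python
import torch
import torch.nn as nn

# Toy stand-ins: frozen generator G, trainable encoder E.
G = nn.Sequential(nn.Linear(512, 3 * 64 * 64))
for p in G.parameters():
    p.requires_grad_(False)
E = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 512))

opt = torch.optim.Adam(E.parameters(), lr=1e-4)
for _ in range(10):  # self-supervised: the image is its own target
    img = torch.rand(4, 3, 64, 64)
    w = E(img)
    recon = G(w).view(-1, 3, 64, 64)
    loss = nn.functional.mse_loss(recon, img)  # + perceptual/ID terms in practice
    opt.zero_grad(); loss.backward(); opt.step()
```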
arXiv Detail & Related papers (2022-12-14T18:49:50Z)
- LatentSwap3D: Semantic Edits on 3D Image GANs [48.1271011449138]
In 3D GANs, latent codes control entire 3D volumes rather than only 2D images.
LatentSwap3D is a semantic edit approach based on latent space discovery.
We show results on seven 3D GANs and on five datasets.
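The summary only says "latent space discovery", so the following is a generic sketch of the swap idea rather than LatentSwap3D's exact procedure: rank latent dimensions by their relevance to an attribute, then copy the top-k dimensions from a reference code that has the attribute.

```python
import torch

def latent_swap(w_src, w_ref, importance, k=20):
    """Toy semantic edit by dimension swapping: copy the k latent
    dimensions most relevant to an attribute from a reference code
    into the source code, leaving all other dimensions untouched."""
    idx = torch.topk(importance, k).indices  # attribute-relevant dims
    w_edit = w_src.clone()
    w_edit[..., idx] = w_ref[..., idx]
    return w_edit

w_src, w_ref = torch.randn(1, 512), torch.randn(1, 512)
importance = torch.rand(512)  # e.g., from a classifier's feature ranking
w_edit = latent_swap(w_src, w_ref, importance)
```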
arXiv Detail & Related papers (2022-12-02T18:59:51Z)
- 3D GAN Inversion with Pose Optimization [26.140281977885376]
We introduce a generalizable 3D GAN inversion method that infers camera viewpoint and latent code simultaneously to enable multi-view consistent semantic image editing.
We conduct extensive experiments on image reconstruction and editing both quantitatively and qualitatively, and further compare our results with 2D GAN-based editing.
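The joint-recovery idea can be sketched in a few lines: treat both the latent code and the camera pose as optimization variables and minimize a reconstruction loss. This assumes a generator conditioned on pose; the toy networks below stand in for the real ones.

```python
import torch
import torch.nn as nn

# Toy generator taking a latent code plus a camera pose (yaw, pitch).
G = nn.Sequential(nn.Linear(512 + 2, 3 * 64 * 64))

target = torch.rand(1, 3 * 64 * 64)
w = torch.zeros(1, 512, requires_grad=True)
pose = torch.zeros(1, 2, requires_grad=True)   # optimized jointly with w
opt = torch.optim.Adam([w, pose], lr=0.01)

for _ in range(200):
    opt.zero_grad()
    img = G(torch.cat([w, pose], dim=1))
    loss = nn.functional.mse_loss(img, target)
    loss.backward()
    opt.step()
```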
arXiv Detail & Related papers (2022-10-13T19:06:58Z)
- XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space.
The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing.
We show both quantitatively and qualitatively that our method is highly effective at various tasks such as 3D shape generation, single view reconstruction and shape manipulation, while being significantly faster and more flexible compared to recent 3D generative models.
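The "quick to convert" claim rests on how geometry images work: a regular 2D grid stores vertex positions, so grid connectivity directly defines the mesh faces. The sketch below uses the classic 3-channel (x, y, z) encoding for clarity; XDGAN's 1-channel encoding is more compact but follows the same principle.

```python
import torch

def geometry_image_to_mesh(gim):
    """Convert an (H, W, 3) geometry image to a triangle mesh: each
    pixel stores a 3D vertex; neighboring pixels define the faces."""
    H, W, _ = gim.shape
    verts = gim.reshape(-1, 3)
    idx = torch.arange(H * W).reshape(H, W)
    a, b = idx[:-1, :-1].reshape(-1), idx[:-1, 1:].reshape(-1)
    c, d = idx[1:, :-1].reshape(-1), idx[1:, 1:].reshape(-1)
    faces = torch.cat([torch.stack([a, b, c], 1),    # two triangles
                       torch.stack([b, d, c], 1)], 0)  # per grid quad
    return verts, faces

verts, faces = geometry_image_to_mesh(torch.rand(64, 64, 3))
```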
arXiv Detail & Related papers (2022-10-06T15:54:01Z)
- Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator [68.0533826852601]
3D-aware image synthesis aims at learning a generative model that can render photo-realistic 2D images while capturing decent underlying 3D shapes.
Existing methods often fail to recover reasonable underlying 3D shapes.
We propose a geometry-aware discriminator to improve 3D-aware GANs.
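A toy sketch of the idea (not the paper's architecture): the discriminator judges the RGB image together with geometry evidence such as a rendered depth map, so the generator is penalized when appearance and shape disagree.

```python
import torch
import torch.nn as nn

class GeometryAwareDiscriminator(nn.Module):
    """Toy discriminator over RGB plus a depth channel, so realism is
    judged jointly on appearance and geometry."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(                 # 3 RGB + 1 depth channel
            nn.Conv2d(4, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, 1))

    def forward(self, rgb, depth):
        return self.net(torch.cat([rgb, depth], dim=1))

D = GeometryAwareDiscriminator()
score = D(torch.rand(2, 3, 64, 64), torch.rand(2, 1, 64, 64))
```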
arXiv Detail & Related papers (2022-09-30T17:59:37Z)
- 3D-FM GAN: Towards 3D-Controllable Face Manipulation [43.99393180444706]
3D-FM GAN is a novel conditional GAN framework designed specifically for 3D-controllable face manipulation.
By carefully encoding both the input face image and a physically-based rendering of 3D edits into a StyleGAN's latent spaces, our image generator provides high-quality, identity-preserved, 3D-controllable face manipulation.
We show that our method outperforms the prior arts on various tasks, with better editability, stronger identity preservation, and higher photo-realism.
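The dual-input encoding can be sketched as two encoders whose outputs are fused into one style code: identity comes from the photo, and the desired pose or expression comes from the physically-based render. The module names and fusion-by-sum below are illustrative assumptions, not 3D-FM GAN's design.

```python
import torch
import torch.nn as nn

class DualInputEncoder(nn.Module):
    """Toy sketch: map the input photo and a physically-based rendering
    of the desired 3D edit into a style-latent space, then fuse them."""
    def __init__(self, w_dim=512):
        super().__init__()
        def enc():
            return nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, w_dim))
        self.photo_enc, self.render_enc = enc(), enc()

    def forward(self, photo, edit_render):
        # Identity from the photo; pose/expression from the render.
        return self.photo_enc(photo) + self.render_enc(edit_render)

E = DualInputEncoder()
w = E(torch.rand(1, 3, 64, 64), torch.rand(1, 3, 64, 64))
```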
arXiv Detail & Related papers (2022-08-24T01:33:13Z)
- Lifting 2D StyleGAN for 3D-Aware Face Generation [52.8152883980813]
We propose a framework, called LiftedGAN, that disentangles and lifts a pre-trained StyleGAN2 for 3D-aware face generation.
Our model is "3D-aware" in the sense that it is able to (1) disentangle the latent space of StyleGAN2 into texture, shape, viewpoint, and lighting, and (2) generate 3D components for synthetic images.
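The "lifting" step can be pictured as small heads that predict 3D components from a generated 2D image; the sketch below is a toy illustration under that assumption, not LiftedGAN's actual networks.

```python
import torch
import torch.nn as nn

class Lifter(nn.Module):
    """Toy sketch: from a StyleGAN2-generated image, predict per-pixel
    depth, a viewpoint, and lighting, giving the 2D output 3D components."""
    def __init__(self):
        super().__init__()
        self.depth = nn.Conv2d(3, 1, 3, padding=1)                 # per-pixel depth
        self.pose = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                  nn.Flatten(), nn.Linear(3, 6))   # viewpoint
        self.light = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                   nn.Flatten(), nn.Linear(3, 4))  # lighting

    def forward(self, img):
        return self.depth(img), self.pose(img), self.light(img)

depth, pose, light = Lifter()(torch.rand(1, 3, 64, 64))
```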
arXiv Detail & Related papers (2020-11-26T05:02:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.