High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization
- URL: http://arxiv.org/abs/2211.15662v2
- Date: Tue, 29 Nov 2022 04:01:13 GMT
- Title: High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization
- Authors: Jiaxin Xie, Hao Ouyang, Jingtan Piao, Chenyang Lei, Qifeng Chen
- Abstract summary: We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
- Score: 51.878078860524795
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a high-fidelity 3D generative adversarial network (GAN) inversion
framework that can synthesize photo-realistic novel views while preserving
specific details of the input image. High-fidelity 3D GAN inversion is
inherently challenging due to the geometry-texture trade-off in 3D inversion,
where overfitting to a single view input image often damages the estimated
geometry during the latent optimization. To solve this challenge, we propose a
novel pipeline that builds on the pseudo-multi-view estimation with visibility
analysis. We keep the original textures for the visible parts and utilize
generative priors for the occluded parts. Extensive experiments show that our
approach achieves advantageous reconstruction and novel view synthesis quality
over state-of-the-art methods, even for images with out-of-distribution
textures. The proposed pipeline also enables image attribute editing with the
inverted latent code and 3D-aware texture modification. Our approach enables
high-fidelity 3D rendering from a single image, which is promising for various
applications of AI-generated 3D content.
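To make the visibility-based compositing concrete, here is a minimal NumPy sketch of the idea, assuming a precomputed visibility mask and an input texture already warped into the novel view; the function and argument names are illustrative, not the authors' code.
```python
import numpy as np

def composite_pseudo_view(warped_input, generated_view, visibility_mask):
    """Blend warped input texture with a GAN rendering of a novel view.

    warped_input:    (H, W, 3) input-image texture reprojected to the novel view
    generated_view:  (H, W, 3) rendering of the same view from the generative prior
    visibility_mask: (H, W) values in [0, 1]; 1 where the surface is visible
                     from the input camera, 0 where it is occluded
    """
    mask = visibility_mask[..., None]  # broadcast the mask over RGB channels
    # Keep the original texture where the input image actually observes the
    # surface; fall back to the generative prior in occluded regions.
    return mask * warped_input + (1.0 - mask) * generated_view
```
Composites of this kind can then serve as pseudo-multi-view reconstruction targets during latent optimization, so occluded regions are constrained by the prior rather than overfit to the single input view.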
Related papers
- Pandora3D: A Comprehensive Framework for High-Quality 3D Shape and Texture Generation [58.77520205498394]
This report presents a comprehensive framework for generating high-quality 3D shapes and textures from diverse input prompts.
The framework consists of 3D shape generation and texture generation.
This report details the system architecture, experimental results, and potential future directions to improve and expand the framework.
arXiv Detail & Related papers (2025-02-20T04:22:30Z)
- F3D-Gaus: Feed-forward 3D-aware Generation on ImageNet with Cycle-Consistent Gaussian Splatting [35.625593119642424]
This paper tackles the problem of generalizable 3D-aware generation from monocular datasets.
We propose a novel feed-forward pipeline based on pixel-aligned Gaussian Splatting.
We also introduce a self-supervised cycle-consistent constraint to enforce cross-view consistency in the learned 3D representation (see the sketch below).
arXiv Detail & Related papers (2025-01-12T04:44:44Z)
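A minimal PyTorch sketch of a cycle-consistent constraint of the kind F3D-Gaus describes; predict_gaussians and render_view are assumed callables standing in for the paper's feed-forward network and differentiable splatting renderer.
```python
import torch.nn.functional as F

def cycle_consistency_loss(image_a, pose_a, pose_b, predict_gaussians, render_view):
    """One round trip A -> B -> A; names are hypothetical, not the paper's API.

    predict_gaussians: feed-forward net, image -> pixel-aligned 3D Gaussians
    render_view:       differentiable renderer, (gaussians, pose) -> image
    """
    gaussians_a = predict_gaussians(image_a)           # lift the input view to 3D
    image_b = render_view(gaussians_a, pose_b)         # render a novel view
    gaussians_b = predict_gaussians(image_b)           # lift the novel view again
    image_a_cycled = render_view(gaussians_b, pose_a)  # render back to the start
    # If the learned 3D representation is cross-view consistent, the round
    # trip should reproduce the original view.
    return F.l1_loss(image_a_cycled, image_a)
```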
- Direct and Explicit 3D Generation from a Single Image [25.207277983430608]
We introduce a novel framework to directly generate explicit surface geometry and texture using multi-view 2D depth and RGB images.
We incorporate epipolar attention into the latent-to-pixel decoder for pixel-level multi-view consistency.
By back-projecting the generated depth pixels into 3D space, we create a structured 3D representation (see the back-projection sketch below).
arXiv Detail & Related papers (2024-11-17T03:14:50Z)
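The back-projection step mentioned above is standard pinhole-camera geometry; below is a small NumPy sketch (illustrative, not the paper's code) that lifts a depth map into a world-space point cloud.
```python
import numpy as np

def backproject_depth(depth, K, cam_to_world):
    """Lift a depth map to a world-space point cloud (pinhole camera model).

    depth:        (H, W) per-pixel depth along the camera z-axis
    K:            (3, 3) camera intrinsics
    cam_to_world: (4, 4) camera-to-world extrinsic matrix
    """
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Unproject each pixel: x_cam = depth * K^{-1} [u, v, 1]^T
    rays = pixels @ np.linalg.inv(K).T
    points_cam = rays * depth.reshape(-1, 1)
    # Move to world space in homogeneous coordinates.
    ones = np.ones((points_cam.shape[0], 1))
    points_h = np.concatenate([points_cam, ones], axis=1)
    return (points_h @ cam_to_world.T)[:, :3]
```
Each generated view contributes such a point set; fusing the sets across views yields the structured 3D representation the summary refers to.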
- Magic-Boost: Boost 3D Generation with Multi-View Conditioned Diffusion [101.15628083270224]
We propose a novel multi-view conditioned diffusion model to synthesize high-fidelity novel view images.
We then introduce a novel iterative-update strategy that uses this model to provide precise guidance for refining the coarse generated results.
Experiments show that Magic-Boost greatly enhances the coarse generated inputs and produces high-quality 3D assets with rich geometric and textural details (see the refinement sketch below).
arXiv Detail & Related papers (2024-04-09T16:20:03Z)
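One plausible reading of the iterative-update strategy, written as a hedged sketch: alternate between rendering the current asset and fitting it to improved images from the diffusion model. Here refine_with_diffusion is a hypothetical wrapper around the multi-view conditioned model, not the paper's API.
```python
import torch
import torch.nn.functional as F

def iterative_refinement(asset_params, render, refine_with_diffusion, poses,
                         optimizer, num_rounds=4):
    """Sketch of an iterative-update refinement loop under assumed components.

    render:                differentiable renderer, (params, pose) -> image
    refine_with_diffusion: assumed callable, (image, pose) -> improved image
    """
    for _ in range(num_rounds):
        for pose in poses:
            with torch.no_grad():
                coarse = render(asset_params, pose)
                # Treat the diffusion output as a fixed pseudo ground truth.
                target = refine_with_diffusion(coarse, pose)
            loss = F.l1_loss(render(asset_params, pose), target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```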
- 2L3: Lifting Imperfect Generated 2D Images into Accurate 3D [16.66666619143761]
Multi-view (MV) 3D reconstruction is a promising solution to fuse generated MV images into consistent 3D objects.
However, the generated images usually suffer from inconsistent lighting, misaligned geometry, and sparse views, leading to poor reconstruction quality.
We present a novel 3D reconstruction framework that leverages intrinsic decomposition guidance, transient-mono prior guidance, and view augmentation to cope with the three issues.
arXiv Detail & Related papers (2024-01-29T02:30:31Z)
- Guide3D: Create 3D Avatars from Text and Image Guidance [55.71306021041785]
Guide3D is a text-and-image-guided generative model for 3D avatar generation based on diffusion models.
Our framework produces topologically and structurally correct geometry and high-resolution textures.
arXiv Detail & Related papers (2023-08-18T17:55:47Z)
- Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing.
A corresponding generic 3D GAN inversion framework is still missing, limiting the applications of 3D face reconstruction and semantic editing.
We study the challenging problem of 3D GAN inversion, where a latent code is predicted from a single face image to faithfully recover its 3D shape and detailed textures (a generic inversion sketch follows below).
arXiv Detail & Related papers (2022-12-14T18:49:50Z)
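For orientation, a single training step of generic encoder-based GAN inversion might look like the sketch below; this illustrates the predict-render-compare loop, not the paper's geometry-aware architecture, and encoder/generator are assumed callables with the generator frozen.
```python
import torch.nn.functional as F

def inversion_step(encoder, generator, image, camera, optimizer):
    """One self-supervised step: predict a latent, re-render, compare.

    encoder:   trainable network, image -> latent code w
    generator: frozen 3D-aware GAN, (w, camera) -> rendered image
    """
    w = encoder(image)                     # predict a latent from one face image
    reconstruction = generator(w, camera)  # re-render from the same viewpoint
    # Real systems add perceptual and identity terms; L1 keeps the sketch short.
    loss = F.l1_loss(reconstruction, image)
    optimizer.zero_grad()  # the optimizer holds only the encoder's parameters
    loss.backward()
    optimizer.step()
    return loss.item()
```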
- 3D GAN Inversion with Facial Symmetry Prior [42.22071135018402]
It is natural to associate 3D GANs with GAN inversion methods to project a real image into the generator's latent space.
We propose a novel method to promote 3D GAN inversion by introducing a facial symmetry prior (see the sketch below).
arXiv Detail & Related papers (2022-11-30T11:57:45Z)
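One plausible way to turn a facial symmetry prior into a loss, as an assumption about the general idea rather than the paper's exact formulation: treat the horizontally flipped input as a pseudo observation from the camera pose mirrored across the face's symmetry plane.
```python
import torch
import torch.nn.functional as F

def symmetry_prior_loss(w, render, pose, mirror_pose, target_image):
    """Sketch of a symmetry-regularized reconstruction objective.

    render:      differentiable renderer, (w, pose) -> image (assumed API)
    mirror_pose: camera pose reflected across the face's symmetry plane
    """
    # Standard reconstruction term from the observed view.
    recon = F.l1_loss(render(w, pose), target_image)
    # Symmetry term: the render from the mirrored pose should match the
    # horizontally flipped input, used as a pseudo second view.
    flipped = torch.flip(target_image, dims=[-1])  # flip along the width axis
    sym = F.l1_loss(render(w, mirror_pose), flipped)
    return recon + 0.5 * sym  # the weighting here is an arbitrary choice
```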
- Multi-View Consistent Generative Adversarial Networks for 3D-aware Image Synthesis [48.33860286920389]
3D-aware image synthesis aims to generate images of objects from multiple views by learning a 3D representation.
Existing approaches lack geometry constraints, hence usually fail to generate multi-view consistent images.
We propose Multi-View Consistent Generative Adversarial Networks (MVCGAN) for high-quality 3D-aware image synthesis with geometry constraints.
arXiv Detail & Related papers (2022-04-13T11:23:09Z)
- Fast-GANFIT: Generative Adversarial Network for High Fidelity 3D Face Reconstruction [76.1612334630256]
We harness the power of Generative Adversarial Networks (GANs) and Deep Convolutional Neural Networks (DCNNs) to reconstruct the facial texture and shape from single images.
We demonstrate excellent results in photorealistic and identity-preserving 3D face reconstruction and achieve, for the first time, facial texture reconstruction with high-frequency details.
arXiv Detail & Related papers (2021-05-16T16:35:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.