Related papers: WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting

WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting

URL: http://arxiv.org/abs/2511.08178v1
Date: Wed, 12 Nov 2025 01:44:59 GMT
Title: WarpGAN: Warping-Guided 3D GAN Inversion with Style-Based Novel View Inpainting
Authors: Kaitao Huang, Yan Yan, Jing-Hao Xue, Hanzi Wang,
Abstract summary: 3D GAN inversion projects a single image into the latent space of a pre-trained 3D GAN to achieve single-shot novel view synthesis.<n>We introduce the warping-and-inpainting strategy to incorporate image inpainting into 3D GAN inversion and propose a novel 3D GAN inversion method, WarpGAN.
Score: 68.77882703764142
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: 3D GAN inversion projects a single image into the latent space of a pre-trained 3D GAN to achieve single-shot novel view synthesis, which requires visible regions with high fidelity and occluded regions with realism and multi-view consistency. However, existing methods focus on the reconstruction of visible regions, while the generation of occluded regions relies only on the generative prior of 3D GAN. As a result, the generated occluded regions often exhibit poor quality due to the information loss caused by the low bit-rate latent code. To address this, we introduce the warping-and-inpainting strategy to incorporate image inpainting into 3D GAN inversion and propose a novel 3D GAN inversion method, WarpGAN. Specifically, we first employ a 3D GAN inversion encoder to project the single-view image into a latent code that serves as the input to 3D GAN. Then, we perform warping to a novel view using the depth map generated by 3D GAN. Finally, we develop a novel SVINet, which leverages the symmetry prior and multi-view image correspondence w.r.t. the same latent code to perform inpainting of occluded regions in the warped image. Quantitative and qualitative experiments demonstrate that our method consistently outperforms several state-of-the-art methods.

Related papers

Wonder3D++: Cross-domain Diffusion for High-fidelity 3D Generation from a Single Image [68.55613894952177]
We introduce textbfWonder3D++, a novel method for efficiently generating high-fidelity textured meshes from single-view images.<n>We propose a cross-domain diffusion model that generates multi-view normal maps and the corresponding color images.<n> Lastly, we introduce a cascaded 3D mesh extraction algorithm that drives high-quality surfaces from the multi-view 2D representations in only about $3$ minute in a coarse-to-fine manner.
arXiv Detail & Related papers (2025-11-03T17:24:18Z)
DiGA3D: Coarse-to-Fine Diffusional Propagation of Geometry and Appearance for Versatile 3D Inpainting [10.515239541326737]
Single reference inpainting methods lack robustness when dealing with views far from the reference view.<n>Appearance inconsistency arises when independently inpainting multi-view images with 2D diffusion priors.<n>DiGA3D uses diffusion models to propagate consistent appearance and geometry in a coarse-to-fine manner.
arXiv Detail & Related papers (2025-07-01T04:57:08Z)
Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing. A corresponding generic 3D GAN inversion framework is still missing, limiting the applications of 3D face reconstruction and semantic editing. We study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.
arXiv Detail & Related papers (2022-12-14T18:49:50Z)
3D GAN Inversion with Facial Symmetry Prior [42.22071135018402]
It is natural to associate 3D GANs with GAN inversion methods to project a real image into the generator's latent space. We propose a novel method to promote 3D GAN inversion by introducing facial symmetry prior.
arXiv Detail & Related papers (2022-11-30T11:57:45Z)
High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views. Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
3D GAN Inversion with Pose Optimization [26.140281977885376]
We introduce a generalizable 3D GAN inversion method that infers camera viewpoint and latent code simultaneously to enable multi-view consistent semantic image editing. We conduct extensive experiments on image reconstruction and editing both quantitatively and qualitatively, and further compare our results with 2D GAN-based editing.
arXiv Detail & Related papers (2022-10-13T19:06:58Z)
2D GANs Meet Unsupervised Single-view 3D Reconstruction [21.93671761497348]
controllable image generation based on pre-trained GANs can benefit a wide range of computer vision tasks. We propose a novel image-conditioned neural implicit field, which can leverage 2D supervisions from GAN-generated multi-view images. The effectiveness of our approach is demonstrated through superior single-view 3D reconstruction results of generic objects.
arXiv Detail & Related papers (2022-07-20T20:24:07Z)
Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation. To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering. Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
3D Photography using Context-aware Layered Depth Inpainting [50.66235795163143]
We propose a method for converting a single RGB-D input image into a 3D photo. A learning-based inpainting model synthesizes new local color-and-depth content into the occluded region. The resulting 3D photos can be efficiently rendered with motion parallax.
arXiv Detail & Related papers (2020-04-09T17:59:06Z)

This list is automatically generated from the titles and abstracts of the papers in this site.