StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis
- URL: http://arxiv.org/abs/2110.08985v1
- Date: Mon, 18 Oct 2021 02:37:01 GMT
- Title: StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis
- Authors: Jiatao Gu, Lingjie Liu, Peng Wang and Christian Theobalt
- Abstract summary: StyleNeRF is a 3D-aware generative model for high-resolution image synthesis with high multi-view consistency.
It integrates the neural radiance field (NeRF) into a style-based generator.
It can synthesize high-resolution images at interactive rates while preserving 3D consistency at high quality.
- Score: 92.25145204543904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose StyleNeRF, a 3D-aware generative model for photo-realistic
high-resolution image synthesis with high multi-view consistency, which can be
trained on unstructured 2D images. Existing approaches either cannot synthesize
high-resolution images with fine details or yield noticeable 3D-inconsistent
artifacts. In addition, many of them lack control over style attributes and
explicit 3D camera poses. StyleNeRF integrates the neural radiance field (NeRF)
into a style-based generator to tackle the aforementioned challenges, i.e.,
improving rendering efficiency and 3D consistency for high-resolution image
generation. We perform volume rendering only to produce a low-resolution
feature map and progressively apply upsampling in 2D to address the first
issue. To mitigate the inconsistencies caused by 2D upsampling, we propose
multiple designs, including a better upsampler and a new regularization loss.
With these designs, StyleNeRF can synthesize high-resolution images at
interactive rates while preserving 3D consistency at high quality. StyleNeRF
also enables control of camera poses and different levels of styles, which can
generalize to unseen views. It also supports challenging tasks, including
zoom-in and zoom-out, style mixing, inversion, and semantic editing.
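
The two-stage design described in the abstract (volume rendering at low resolution, then style-modulated 2D upsampling) can be made concrete with a minimal sketch. The module names, layer shapes, and the plain bilinear-plus-conv upsampler below are illustrative assumptions, not the authors' released implementation; the paper's "better upsampler" and its consistency regularization loss are omitted.

```python
# Minimal sketch: volume rendering produces only a low-resolution feature map,
# and progressive 2D upsampling brings it to the target resolution.
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class StyleNeRFSketch(nn.Module):
    def __init__(self, feat_dim=256, style_dim=512, low_res=32, high_res=256):
        super().__init__()
        self.low_res = low_res
        # Mapping network: latent z -> style vector w, as in style-based GANs.
        self.mapping = nn.Sequential(
            nn.Linear(style_dim, style_dim), nn.LeakyReLU(0.2),
            nn.Linear(style_dim, style_dim),
        )
        # Stand-in for the NeRF MLP: a 3D point plus the style vector maps to
        # a feature and a density (a real model would add positional encoding).
        self.nerf_mlp = nn.Sequential(
            nn.Linear(3 + style_dim, feat_dim), nn.LeakyReLU(0.2),
            nn.Linear(feat_dim, feat_dim + 1),
        )
        # 2D head: each stage doubles resolution until high_res is reached.
        n_stages = int(math.log2(high_res // low_res))
        self.up_blocks = nn.ModuleList(
            nn.Conv2d(feat_dim, feat_dim, 3, padding=1) for _ in range(n_stages)
        )
        self.to_rgb = nn.Conv2d(feat_dim, 3, 1)

    def volume_render(self, rays_o, rays_d, w, n_samples=24):
        # Composite per-point features along each ray with the standard
        # alpha-compositing weights (uniform depth samples, no jitter).
        B, N, _ = rays_o.shape  # N = low_res * low_res rays per image
        t = torch.linspace(0.1, 4.0, n_samples, device=rays_o.device)
        pts = rays_o[:, :, None, :] + rays_d[:, :, None, :] * t[None, None, :, None]
        w_exp = w[:, None, None, :].expand(B, N, n_samples, -1)
        out = self.nerf_mlp(torch.cat([pts, w_exp], dim=-1))
        feat, sigma = out[..., :-1], F.relu(out[..., -1])
        alpha = 1.0 - torch.exp(-sigma * (t[1] - t[0]))
        trans = torch.cumprod(
            torch.cat([torch.ones_like(alpha[..., :1]), 1 - alpha[..., :-1]],
                      dim=-1), dim=-1)
        weights = alpha * trans                      # (B, N, n_samples)
        return (weights[..., None] * feat).sum(-2)   # (B, N, feat_dim)

    def forward(self, z, rays_o, rays_d):
        w = self.mapping(z)
        feat = self.volume_render(rays_o, rays_d, w)
        x = feat.transpose(1, 2).reshape(z.shape[0], -1, self.low_res, self.low_res)
        for conv in self.up_blocks:  # progressive 2D upsampling
            x = F.interpolate(x, scale_factor=2, mode="bilinear",
                              align_corners=False)
            x = F.leaky_relu(conv(x), 0.2)
        return self.to_rgb(x)  # (B, 3, high_res, high_res)
```

With the default shapes, z is (batch, 512) and rays_o, rays_d are (batch, 1024, 3) camera rays for the 32x32 feature grid; rendering 1,024 rays per image instead of 65,536 is what makes interactive-rate synthesis at 256x256 plausible.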
Related papers
- SuperNeRF-GAN: A Universal 3D-Consistent Super-Resolution Framework for Efficient and Enhanced 3D-Aware Image Synthesis [59.73403876485574]
We propose SuperNeRF-GAN, a universal framework for 3D-consistent super-resolution.
A key highlight of SuperNeRF-GAN is its seamless integration with NeRF-based 3D-aware image synthesis methods.
Experimental results demonstrate the superior efficiency, 3D consistency, and quality of our approach.
arXiv Detail & Related papers (2025-01-12T10:31:33Z)
- LiftImage3D: Lifting Any Single Image to 3D Gaussians with Video Generation Priors [107.83398512719981]
Single-image 3D reconstruction remains a fundamental challenge in computer vision.
Recent advances in Latent Video Diffusion Models offer promising 3D priors learned from large-scale video data.
We propose LiftImage3D, a framework that effectively releases LVDMs' generative priors while ensuring 3D consistency.
arXiv Detail & Related papers (2024-12-12T18:58:42Z)
- Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild.
Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture.
We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
- SyncNoise: Geometrically Consistent Noise Prediction for Text-based 3D Scene Editing [58.22339174221563]
We propose SyncNoise, a novel geometry-guided multi-view consistent noise editing approach for high-fidelity 3D scene editing.
SyncNoise synchronously edits multiple views with 2D diffusion models while enforcing multi-view noise predictions to be geometrically consistent.
Our method achieves high-quality 3D editing results that respect the textual instructions, especially in scenes with complex textures.
arXiv Detail & Related papers (2024-06-25T09:17:35Z)
- Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields [29.573344213110172]
We propose a framework called Omni-Recon, which is capable of (1) generalizable 3D reconstruction and zero-shot multitask scene understanding, and (2) adaptability to diverse downstream 3D applications such as real-time rendering and scene editing.
Specifically, our Omni-Recon features a general-purpose NeRF model using image-based rendering with two decoupled branches.
This design achieves state-of-the-art (SOTA) generalizable 3D surface reconstruction quality with blending weights reusable across diverse tasks for zero-shot multitask scene understanding.
arXiv Detail & Related papers (2024-03-17T07:47:26Z)
- What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs [82.3936309001633]
3D-aware Generative Adversarial Networks (GANs) have shown remarkable progress in learning to generate multi-view-consistent images and 3D geometries.
Yet, the significant memory and computational costs of dense sampling in volume rendering have forced 3D GANs to adopt patch-based training or employ low-resolution rendering with post-processing 2D super-resolution.
We propose techniques to scale neural volume rendering to the much higher resolution of native 2D images, thereby resolving fine-grained 3D geometry with unprecedented detail.
arXiv Detail & Related papers (2024-01-04T18:50:38Z)
- 3D-aware Image Synthesis via Learning Structural and Textural Representations [39.681030539374994]
We propose VolumeGAN for high-fidelity 3D-aware image synthesis, which explicitly learns a structural representation and a textural representation.
Our approach achieves substantially higher image quality and better 3D control than previous methods.
arXiv Detail & Related papers (2021-12-20T18:59:40Z)
- Efficient Geometry-aware 3D Generative Adversarial Networks [50.68436093869381]
Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent.
In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations.
We introduce an expressive hybrid explicit-implicit network architecture that synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry.
arXiv Detail & Related papers (2021-12-15T08:01:43Z)
- GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis [43.4859484191223]
We propose a generative model for radiance fields, which have recently proven successful for novel view synthesis of a single scene.
By introducing a multi-scale patch-based discriminator, we demonstrate synthesis of high-resolution images while training our model from unposed 2D images alone (see the patch-sampling sketch after this list).
arXiv Detail & Related papers (2020-07-05T20:37:39Z)
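
As a concrete illustration of the multi-scale patch-based discriminator idea from the GRAF entry above, the sketch below samples K x K patches at random scales and offsets, so per-step training cost is independent of output image resolution. The function names and the scale distribution are assumptions for illustration, not GRAF's released code.

```python
# Illustrative sketch of multi-scale patch sampling: the discriminator only
# ever sees K x K patches, drawn at a random scale (large s: coarse,
# whole-image context) and offset (small s: zoomed-in, fine detail).
import torch
import torch.nn.functional as F


def sample_patch_coords(batch, k=32, min_scale=0.25, device="cpu"):
    """Normalized (x, y) coordinates of one K x K patch per image."""
    # Scale s in [min_scale, 1]: s = 1 spans the full image coarsely.
    s = min_scale + torch.rand(batch, 1, 1, 1, device=device) * (1 - min_scale)
    # Random center, constrained so the scaled patch stays inside [-1, 1].
    center = (torch.rand(batch, 1, 1, 2, device=device) * 2 - 1) * (1 - s)
    lin = torch.linspace(-1, 1, k, device=device)
    gy, gx = torch.meshgrid(lin, lin, indexing="ij")
    base = torch.stack([gx, gy], dim=-1)[None]   # (1, K, K, 2), (x, y) order
    return base * s + center                     # (batch, K, K, 2)


def real_patches(images, coords):
    """Bilinearly sample the same patches from real images for the D step."""
    return F.grid_sample(images, coords, align_corners=False)

# In a GRAF-style step, the generator queries its radiance field only along
# the rays corresponding to `coords`, and the discriminator compares the
# rendered patch against real_patches(real_images, coords); cost depends on
# K, not on the target image resolution.
```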