Related papers: What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs

URL: http://arxiv.org/abs/2401.02411v1
Date: Thu, 4 Jan 2024 18:50:38 GMT
Title: What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs
Authors: Alex Trevithick, Matthew Chan, Towaki Takikawa, Umar Iqbal, Shalini De Mello, Manmohan Chandraker, Ravi Ramamoorthi, Koki Nagano
Abstract summary: 3D-aware Generative Adversarial Networks (GANs) have shown remarkable progress in learning to generate multi-view-consistent images and 3D geometries. Yet, the significant memory and computational costs of dense sampling in volume rendering have forced 3D GANs to adopt patch-based training or employ low-resolution rendering with post-processing 2D super resolution. We propose techniques to scale neural volume rendering to the much higher resolution of native 2D images, thereby resolving fine-grained 3D geometry with unprecedented detail.
Score: 82.3936309001633
License: http://creativecommons.org/licenses/by/4.0/
Abstract: 3D-aware Generative Adversarial Networks (GANs) have shown remarkable progress in learning to generate multi-view-consistent images and 3D geometries of scenes from collections of 2D images via neural volume rendering. Yet, the significant memory and computational costs of dense sampling in volume rendering have forced 3D GANs to adopt patch-based training or employ low-resolution rendering with post-processing 2D super resolution, which sacrifices multiview consistency and the quality of resolved geometry. Consequently, 3D GANs have not yet been able to fully resolve the rich 3D geometry present in 2D images. In this work, we propose techniques to scale neural volume rendering to the much higher resolution of native 2D images, thereby resolving fine-grained 3D geometry with unprecedented detail. Our approach employs learning-based samplers for accelerating neural rendering for 3D GAN training using up to 5 times fewer depth samples. This enables us to explicitly "render every pixel" of the full-resolution image during training and inference without post-processing superresolution in 2D. Together with our strategy to learn high-quality surface geometry, our method synthesizes high-resolution 3D geometry and strictly view-consistent images while maintaining image quality on par with baselines relying on post-processing super resolution. We demonstrate state-of-the-art 3D geometric quality on FFHQ and AFHQ, setting a new standard for unsupervised learning of 3D shapes in 3D GANs.

Related papers

Bridging Diffusion Models and 3D Representations: A 3D Consistent Super-Resolution Framework [53.251525710625096]
3D Super Resolution (3DSR) is a novel 3D Gaussian-splatting-based super-resolution framework. We evaluate 3DSR on MipNeRF360 and LLFF data.
arXiv Detail & Related papers (2025-08-06T05:12:02Z)
Constructing a 3D Town from a Single Image [23.231661811526955]
3DTown is a training-free framework designed to synthesize realistic and coherent 3D scenes from a single top-down view. We decompose the input image into overlapping regions and generate each using a pretrained 3D object generator. Our results demonstrate that high-quality 3D town generation is achievable from a single image using a principled, training-free approach.
arXiv Detail & Related papers (2025-05-21T17:10:47Z)
Edify 3D: Scalable High-Quality 3D Asset Generation [53.86838858460809]
Edify 3D is an advanced solution designed for high-quality 3D asset generation. Our method can generate high-quality 3D assets with detailed geometry, clean shape topologies, high-resolution textures, and materials within 2 minutes of runtime.
arXiv Detail & Related papers (2024-11-11T17:07:43Z)
Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild. Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture. We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models [112.2625368640425]
High-resolution Image-to-3D model (Hi3D) is a new video diffusion based paradigm that redefines a single image to multi-view images as 3D-aware sequential image generation. Hi3D first empowers the pre-trained video diffusion model with 3D-aware prior, yielding multi-view images with low-resolution texture details.
arXiv Detail & Related papers (2024-09-11T17:58:57Z)
Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation [29.959223778769513]
We propose a novel learning strategy, namely 3D-to-2D imitation, which enables a 3D-aware GAN to generate high-quality images. We also introduce 3D-aware convolutions into the generator for better 3D representation learning. With the above strategies, our method reaches FID scores of 5.4 and 4.3 on FFHQ and AFHQ-v2 Cats, respectively, at 512x512 resolution.
arXiv Detail & Related papers (2023-03-16T02:18:41Z)
Self-Supervised Geometry-Aware Encoder for Style-Based 3D GAN Inversion [115.82306502822412]
StyleGAN has achieved great progress in 2D face reconstruction and semantic editing via image inversion and latent editing. A corresponding generic 3D GAN inversion framework is still missing, limiting the applications of 3D face reconstruction and semantic editing. We study the challenging problem of 3D GAN inversion where a latent code is predicted given a single face image to faithfully recover its 3D shapes and detailed textures.
arXiv Detail & Related papers (2022-12-14T18:49:50Z)
XDGAN: Multi-Modal 3D Shape Generation in 2D Space [60.46777591995821]
We propose a novel method to convert 3D shapes into compact 1-channel geometry images and leverage StyleGAN3 and image-to-image translation networks to generate 3D objects in 2D space. The generated geometry images are quick to convert to 3D meshes, enabling real-time 3D object synthesis, visualization and interactive editing. We show both quantitatively and qualitatively that our method is highly effective at various tasks such as 3D shape generation, single view reconstruction and shape manipulation, while being significantly faster and more flexible compared to recent 3D generative models.
arXiv Detail & Related papers (2022-10-06T15:54:01Z)
GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds [28.660893916203747]
This paper proposes a novel 3D-aware GAN that can generate high resolution images (up to 1024x1024) while keeping strict 3D consistency as in volume rendering. Our motivation is to achieve super-resolution directly in the 3D space to preserve 3D consistency. Experiments on FFHQ and AFHQv2 datasets show that our method can produce high-quality 3D-consistent results.
arXiv Detail & Related papers (2022-06-15T02:35:51Z)
StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation [34.01352591390208]
We introduce a high resolution, 3D-consistent image and shape generation technique which we call StyleSDF. Our method is trained on single-view RGB data only, and stands on the shoulders of StyleGAN2 for image generation.
arXiv Detail & Related papers (2021-12-21T18:45:45Z)
Efficient Geometry-aware 3D Generative Adversarial Networks [50.68436093869381]
Existing 3D GANs are either compute-intensive or make approximations that are not 3D-consistent. In this work, we improve the computational efficiency and image quality of 3D GANs without overly relying on these approximations. We introduce an expressive hybrid explicit-implicit network architecture that synthesizes not only high-resolution multi-view-consistent images in real time but also produces high-quality 3D geometry.
arXiv Detail & Related papers (2021-12-15T08:01:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.