EpiGRAF: Rethinking training of 3D GANs
- URL: http://arxiv.org/abs/2206.10535v1
- Date: Tue, 21 Jun 2022 17:08:23 GMT
- Title: EpiGRAF: Rethinking training of 3D GANs
- Authors: Ivan Skorokhodov, Sergey Tulyakov, Yiqun Wang, Peter Wonka
- Abstract summary: We show that it is possible to obtain a high-resolution 3D generator with SotA image quality by following a completely different route of simply training the model patch-wise.
The resulting model, named EpiGRAF, is an efficient, high-resolution, pure 3D generator.
- Score: 60.38818140637367
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A very recent trend in generative modeling is building 3D-aware generators
from 2D image collections. To induce the 3D bias, such models typically rely on
volumetric rendering, which is expensive to employ at high resolutions. In
recent months, more than ten works have appeared that address this scaling
issue by training a separate 2D decoder to upsample a low-resolution image (or
a feature tensor) produced by a pure 3D generator. But this solution comes at
a cost: not only does it break multi-view consistency (i.e. shape and texture
change when the camera moves), but it also learns the geometry only at low
fidelity. In this work, we show that it is possible to obtain a high-resolution
3D generator with SotA image quality by following a completely different route
of simply training the model patch-wise. We revisit and improve this
optimization scheme in two ways. First, we design a location- and scale-aware
discriminator to work on patches of different proportions and spatial
positions. Second, we modify the patch sampling strategy based on an annealed
beta distribution to stabilize training and accelerate convergence. The
resulting model, named EpiGRAF, is an efficient, high-resolution, pure 3D
generator, and we test it on four datasets (two introduced in this work) at
$256^2$ and $512^2$ resolutions. It obtains state-of-the-art image quality,
high-fidelity geometry and trains ${\approx} 2.5 \times$ faster than the
upsampler-based counterparts. Project website:
https://universome.github.io/epigraf.
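The patch-wise training idea above can be illustrated with a small sketch. The snippet below is a hypothetical NumPy implementation of annealed beta-distribution patch sampling, not the paper's exact schedule: the function name, the `min_scale` floor, and the specific annealing curve (concentration decaying from 10 to 1) are illustrative assumptions. The intent it captures is the one described in the abstract: early in training, sampled patches tend to cover the whole image at coarse scale; as training progresses, the distribution spreads toward small, high-frequency patches.

```python
import numpy as np

def sample_patch_params(step, total_steps, min_scale=0.125, rng=None):
    """Sample a patch (scale, offset) from an annealed beta distribution.

    Illustrative sketch only: early in training the scale concentrates
    near 1.0 (patches spanning the full image); later the distribution
    flattens, so small high-detail patches are sampled as well.
    """
    rng = rng if rng is not None else np.random.default_rng()
    # Anneal the beta concentration from 10 down to 1; at 1 the
    # distribution over [min_scale, 1] becomes uniform.
    t = min(step / total_steps, 1.0)
    beta = 1.0 + 9.0 * (1.0 - t)
    s = rng.beta(1.0, beta)               # skewed toward 0 early on
    scale = 1.0 - s * (1.0 - min_scale)   # hence scale skews toward 1.0
    # Uniform offset keeping the patch inside the unit image square.
    offset = rng.uniform(0.0, 1.0 - scale, size=2)
    return scale, offset
```

A location- and scale-aware discriminator would then receive each patch together with its `(scale, offset)` parameters, so that patches of different proportions and positions can be judged by a single network.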
Related papers
- SuperGaussian: Repurposing Video Models for 3D Super Resolution [67.19266415499139]
We present a simple, modular, and generic method that upsamples coarse 3D models by adding geometric and appearance details.
We demonstrate that it is possible to directly repurpose existing (pretrained) video models for 3D super-resolution.
arXiv Detail & Related papers (2024-06-02T03:44:50Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoints supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- 3D generation on ImageNet [76.0440752186121]
We develop a 3D generator with Generic Priors (3DGP): a 3D synthesis framework with more general assumptions about the training data.
Our model is based on three new ideas.
We explore our model on four datasets: SDIP Dogs 256x256, SDIP Elephants 256x256, LSUN Horses 256x256, and ImageNet 256x256.
arXiv Detail & Related papers (2023-03-02T17:06:57Z)
- Improving 3D-aware Image Synthesis with A Geometry-aware Discriminator [68.0533826852601]
3D-aware image synthesis aims at learning a generative model that can render photo-realistic 2D images while capturing decent underlying 3D shapes.
Existing methods struggle to recover even moderately accurate 3D shapes.
We propose a geometry-aware discriminator to improve 3D-aware GANs.
arXiv Detail & Related papers (2022-09-30T17:59:37Z)
- GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds [28.660893916203747]
This paper proposes a novel 3D-aware GAN that can generate high-resolution images (up to 1024×1024) while keeping strict 3D consistency as in volume rendering.
Our motivation is to achieve super-resolution directly in the 3D space to preserve 3D consistency.
Experiments on FFHQ and AFHQv2 datasets show that our method can produce high-quality 3D-consistent results.
arXiv Detail & Related papers (2022-06-15T02:35:51Z)
- CIPS-3D: A 3D-Aware Generator of GANs Based on Conditionally-Independent Pixel Synthesis [148.4104739574094]
This paper presents CIPS-3D, a style-based, 3D-aware generator that is composed of a shallow NeRF network and a deep implicit neural representation network.
The generator synthesizes each pixel value independently without any spatial convolution or upsampling operation.
It sets new records for 3D-aware image synthesis with an impressive FID of 6.97 for images at the $256\times256$ resolution on FFHQ.
arXiv Detail & Related papers (2021-10-19T08:02:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.