HoloFusion: Towards Photo-realistic 3D Generative Modeling
- URL: http://arxiv.org/abs/2308.14244v1
- Date: Mon, 28 Aug 2023 01:19:33 GMT
- Title: HoloFusion: Towards Photo-realistic 3D Generative Modeling
- Authors: Animesh Karnewar and Niloy J. Mitra and Andrea Vedaldi and David Novotny
- Abstract summary: Diffusion-based image generators can now produce high-quality and diverse samples, but their success has yet to fully translate to 3D generation.
We present HoloFusion, a method that combines the best of these approaches to produce high-fidelity, plausible, and diverse 3D samples.
- Score: 77.03830223281787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based image generators can now produce high-quality and diverse
samples, but their success has yet to fully translate to 3D generation:
existing diffusion methods can either generate low-resolution but 3D consistent
outputs, or detailed 2D views of 3D objects but with potential structural
defects and lacking view consistency or realism. We present HoloFusion, a
method that combines the best of these approaches to produce high-fidelity,
plausible, and diverse 3D samples while learning from a collection of
multi-view 2D images only. The method first generates coarse 3D samples using a
variant of the recently proposed HoloDiffusion generator. Then, it
independently renders and upsamples a large number of views of the coarse 3D
model, super-resolves them to add detail, and distills those into a single,
high-fidelity implicit 3D representation, which also ensures view consistency
of the final renders. The super-resolution network is trained as an integral
part of HoloFusion, end-to-end, and the final distillation uses a new sampling
scheme to capture the space of super-resolved signals. We compare our method
against existing baselines, including DreamFusion, Get3D, EG3D, and
HoloDiffusion, and achieve, to the best of our knowledge, the most realistic
results on the challenging CO3Dv2 dataset.
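The abstract outlines a three-stage pipeline: generate a coarse 3D sample, independently render and super-resolve many views of it, and then distill those views into a single implicit 3D representation. The paper does not give code, so the sketch below is a toy, hypothetical illustration of that data flow only: the "generator" is a random density grid, the "renderer" is an orthographic mean projection, the "super-resolution network" is plain nearest-neighbour upsampling, and "distillation" is simple view averaging. None of these stand-ins reflect the actual HoloFusion models.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_coarse_volume(res=16):
    # Stand-in for the HoloDiffusion-style coarse generator:
    # a low-resolution density grid (hypothetical placeholder).
    return rng.random((res, res, res))

def render_view(volume, axis):
    # Toy orthographic renderer: integrate density along one axis.
    return volume.mean(axis=axis)

def super_resolve(image, factor=4):
    # Stand-in for the learned super-resolution network:
    # nearest-neighbour upsampling; adds no real detail.
    return np.kron(image, np.ones((factor, factor)))

def distill(views):
    # Stand-in for distillation into one implicit representation:
    # here we just aggregate the super-resolved views.
    return np.mean(np.stack(views), axis=0)

coarse = generate_coarse_volume()                 # stage 1: coarse 3D sample
views = [super_resolve(render_view(coarse, axis=a))
         for a in range(3)]                       # stage 2: render + upsample views
fused = distill(views)                            # stage 3: fuse into one output
print(fused.shape)  # → (64, 64)
```

In the real method, stage 3 is where view consistency is enforced: because many independently super-resolved views disagree slightly, fitting a single 3D representation to all of them (with the paper's new sampling scheme) reconciles them into consistent final renders.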
Related papers
- Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion Models [29.73743772971411]
We propose Human 3Diffusion: Realistic Avatar Creation via Explicit 3D Consistent Diffusion.
Our key insight is that 2D multi-view diffusion and 3D reconstruction models provide complementary information for each other.
Our proposed framework outperforms state-of-the-art methods and enables the creation of realistic avatars from a single RGB image.
arXiv Detail & Related papers (2024-06-12T17:57:25Z)
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach can yield more accurate synthesis compared to recent state-of-the-art, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance [76.7746870349809]
We present ComboVerse, a 3D generation framework that produces high-quality 3D assets with complex compositions by learning to combine multiple models.
Our proposed framework emphasizes spatial alignment of objects, compared with standard score distillation sampling.
arXiv Detail & Related papers (2024-03-19T03:39:43Z)
- 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network that synthesizes new content at higher quality by exploiting the natural-image prior of the 2D diffusion model together with the global 3D information of the current scene.
Our approach supports a wide variety of scene generation and arbitrary camera trajectories with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present a new framework Sculpt3D that equips the current pipeline with explicit injection of 3D priors from retrieved reference objects without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoint supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- GVP: Generative Volumetric Primitives [76.95231302205235]
We present Generative Volumetric Primitives (GVP), the first pure 3D generative model that can sample and render 512-resolution images in real-time.
GVP jointly models a number of primitives and their spatial information, both of which can be efficiently generated via a 2D convolutional network.
Experiments on several datasets demonstrate superior efficiency and 3D consistency of GVP over the state-of-the-art.
arXiv Detail & Related papers (2023-03-31T16:50:23Z)
- GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds [28.660893916203747]
This paper proposes a novel 3D-aware GAN that can generate high-resolution images (up to 1024×1024) while keeping strict 3D consistency as in volume rendering.
Our motivation is to achieve super-resolution directly in the 3D space to preserve 3D consistency.
Experiments on FFHQ and AFHQv2 datasets show that our method can produce high-quality 3D-consistent results.
arXiv Detail & Related papers (2022-06-15T02:35:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.