HoloFusion: Towards Photo-realistic 3D Generative Modeling
- URL: http://arxiv.org/abs/2308.14244v1
- Date: Mon, 28 Aug 2023 01:19:33 GMT
- Title: HoloFusion: Towards Photo-realistic 3D Generative Modeling
- Authors: Animesh Karnewar, Niloy J. Mitra, Andrea Vedaldi, and David Novotny
- Abstract summary: Diffusion-based image generators can now produce high-quality and diverse samples, but their success has yet to fully translate to 3D generation.
We present HoloFusion, a method that combines the best of these approaches to produce high-fidelity, plausible, and diverse 3D samples.
- Score: 77.03830223281787
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based image generators can now produce high-quality and diverse
samples, but their success has yet to fully translate to 3D generation:
existing diffusion methods can either generate low-resolution but 3D-consistent
outputs, or detailed 2D views of 3D objects that may have structural defects and
lack view consistency or realism. We present HoloFusion, a
method that combines the best of these approaches to produce high-fidelity,
plausible, and diverse 3D samples while learning from a collection of
multi-view 2D images only. The method first generates coarse 3D samples using a
variant of the recently proposed HoloDiffusion generator. Then, it
independently renders and upsamples a large number of views of the coarse 3D
model, super-resolves them to add detail, and distills those into a single,
high-fidelity implicit 3D representation, which also ensures view consistency
of the final renders. The super-resolution network is trained as an integral
part of HoloFusion, end-to-end, and the final distillation uses a new sampling
scheme to capture the space of super-resolved signals. We compare our method
against existing baselines, including DreamFusion, Get3D, EG3D, and
HoloDiffusion, and achieve, to the best of our knowledge, the most realistic
results on the challenging CO3Dv2 dataset.
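The abstract outlines a three-stage pipeline: coarse 3D generation with a HoloDiffusion-style generator, independent rendering and super-resolution of many views, and distillation of those views into a single implicit 3D representation. The sketch below illustrates that flow in PyTorch as a minimal, hypothetical mock-up: every module name (CoarseGenerator, SuperResolver, ImplicitField), the placeholder renderer, the tensor shapes, and the plain photometric distillation loss are assumptions made for illustration, not the authors' implementation (which trains the super-resolver end-to-end and uses a dedicated sampling scheme for distillation).

```python
# Minimal sketch of the coarse-generate / super-resolve / distill pipeline
# described in the abstract. All components are illustrative stand-ins.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CoarseGenerator(nn.Module):
    """Stand-in for the HoloDiffusion-style coarse 3D generator (a feature voxel grid)."""

    def __init__(self, res: int = 32, feat: int = 8):
        super().__init__()
        self.res, self.feat = res, feat

    def forward(self, batch: int) -> torch.Tensor:
        # The real method runs a 3D diffusion sampler; here we simply draw noise.
        return torch.randn(batch, self.feat, self.res, self.res, self.res)


class SuperResolver(nn.Module):
    """2D network that adds detail to low-resolution renders (trained end-to-end in the paper)."""

    def __init__(self, feat: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(feat, 3, kernel_size=3, padding=1),
        )

    def forward(self, lowres: torch.Tensor) -> torch.Tensor:
        return self.net(lowres)


class ImplicitField(nn.Module):
    """Tiny MLP standing in for the final high-fidelity implicit 3D representation."""

    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3))

    def forward(self, xyz: torch.Tensor) -> torch.Tensor:
        return self.mlp(xyz)


def render_view(voxels: torch.Tensor, view_id: int, size: int = 32) -> torch.Tensor:
    # Placeholder "renderer": ignores the camera and just averages over depth.
    img = voxels.mean(dim=2)  # (B, feat, res, res)
    return F.interpolate(img, size=size)


def holofusion_sample(n_views: int = 8, distill_steps: int = 100) -> ImplicitField:
    coarse, sr, field = CoarseGenerator(), SuperResolver(), ImplicitField()

    # Stage 1: sample a coarse 3D representation.
    voxels = coarse(batch=1)

    # Stage 2: independently render and super-resolve many views of the coarse sample.
    views = [sr(render_view(voxels, v)).detach() for v in range(n_views)]

    # Stage 3: distill the super-resolved views into one consistent implicit field.
    # (A plain photometric loss stands in for the paper's distillation sampling scheme.)
    opt = torch.optim.Adam(field.parameters(), lr=1e-3)
    for _ in range(distill_steps):
        xyz = torch.rand(1024, 3)
        target = torch.stack(views).mean(dim=0).mean(dim=(2, 3)).expand(1024, 3)
        loss = F.mse_loss(field(xyz), target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return field


if __name__ == "__main__":
    holofusion_sample(n_views=4, distill_steps=10)
```

The point of the sketch is the data flow: detail is added per view in 2D, and view consistency is then recovered by fitting one 3D representation to all super-resolved views at once.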
Related papers
- Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild.
Recent advances reconstruct an object's 3D shape and texture by leveraging 2D or 3D diffusion priors, but each prior has its own limitations.
We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
- Hi3D: Pursuing High-Resolution Image-to-3D Generation with Video Diffusion Models [112.2625368640425]
The High-resolution Image-to-3D model (Hi3D) is a new video-diffusion-based paradigm that reframes single-image-to-multi-view synthesis as 3D-aware sequential image generation.
Hi3D first empowers the pre-trained video diffusion model with a 3D-aware prior, yielding multi-view images with low-resolution texture details.
arXiv Detail & Related papers (2024-09-11T17:58:57Z)
- GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion-model approach, based on the Gaussian Splatting representation, for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered from arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z)
- MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation [54.27399121779011]
We present MVD-Fusion: a method for single-view 3D inference via generative modeling of multi-view-consistent RGB-D images.
We show that our approach yields more accurate synthesis than recent state-of-the-art approaches, including distillation-based 3D inference and prior multi-view generation methods.
arXiv Detail & Related papers (2024-04-04T17:59:57Z)
- 3D-SceneDreamer: Text-Driven 3D-Consistent Scene Generation [51.64796781728106]
We propose a generative refinement network that synthesizes new content at higher quality by exploiting the natural-image prior of a 2D diffusion model together with the global 3D information of the current scene.
Our approach supports a wide variety of scene generation and arbitrary camera trajectories, with improved visual quality and 3D consistency.
arXiv Detail & Related papers (2024-03-14T14:31:22Z)
- Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior [57.986512832738704]
We present Sculpt3D, a new framework that equips the current text-to-3D pipeline with explicit injection of 3D priors from retrieved reference objects, without re-training the 2D diffusion model.
Specifically, we demonstrate that high-quality and diverse 3D geometry can be guaranteed by keypoint supervision through a sparse ray sampling approach.
These two decoupled designs effectively harness 3D information from reference objects to generate 3D objects while preserving the generation quality of the 2D diffusion model.
arXiv Detail & Related papers (2024-03-14T07:39:59Z)
- GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds [28.660893916203747]
This paper proposes a novel 3D-aware GAN that can generate high-resolution images (up to 1024×1024) while maintaining strict 3D consistency, as in volume rendering.
Our motivation is to achieve super-resolution directly in the 3D space to preserve 3D consistency.
Experiments on FFHQ and AFHQv2 datasets show that our method can produce high-quality 3D-consistent results.
arXiv Detail & Related papers (2022-06-15T02:35:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.