SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction
- URL: http://arxiv.org/abs/2212.00792v2
- Date: Sun, 4 Dec 2022 17:23:15 GMT
- Title: SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction
- Authors: Zhizhuo Zhou, Shubham Tulsiani
- Abstract summary: We propose SparseFusion, a sparse view 3D reconstruction approach that unifies recent advances in neural rendering and probabilistic image generation.
Existing approaches typically build on neural rendering with re-projected features but fail to generate unseen regions or handle uncertainty under large viewpoint changes.
- Score: 26.165314261806603
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We propose SparseFusion, a sparse view 3D reconstruction approach that
unifies recent advances in neural rendering and probabilistic image generation.
Existing approaches typically build on neural rendering with re-projected
features but fail to generate unseen regions or handle uncertainty under large
viewpoint changes. Alternate methods treat this as a (probabilistic) 2D
synthesis task, and while they can generate plausible 2D images, they do not
infer a consistent underlying 3D. However, we find that this trade-off between
3D consistency and probabilistic image generation does not need to exist. In
fact, we show that geometric consistency and generative inference can be
complementary in a mode-seeking behavior. By distilling a 3D consistent scene
representation from a view-conditioned latent diffusion model, we are able to
recover a plausible 3D representation whose renderings are both accurate and
realistic. We evaluate our approach across 51 categories in the CO3D dataset
and show that it outperforms existing methods, in both distortion and
perception metrics, for sparse-view novel view synthesis.
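The core idea of the abstract, distilling a 3D-consistent scene representation from a view-conditioned diffusion model, can be illustrated with a score-distillation-style optimization loop. The sketch below is a minimal illustration, not the paper's implementation: the class names (TinyRadianceField, ViewConditionedDenoiser), the toy renderer and noise schedule, and the single-step SDS-style gradient are assumptions made for exposition, and SparseFusion's actual distillation objective may differ in its weighting and mode-seeking details.

```python
# Minimal sketch of distilling a 3D representation from a view-conditioned
# diffusion model via a score-distillation-style update. All classes and names
# here are hypothetical placeholders, not the SparseFusion code.
import torch
import torch.nn as nn


class TinyRadianceField(nn.Module):
    """Stand-in for a neural scene representation (e.g. an instance-specific NeRF)."""

    def __init__(self, res: int = 32):
        super().__init__()
        # A directly optimized image grid; a real model would volume-render an MLP/voxel field.
        self.grid = nn.Parameter(torch.rand(3, res, res))

    def render(self, pose: torch.Tensor) -> torch.Tensor:
        # A real renderer would ray-march the field from `pose`; here we just return the grid.
        return self.grid.unsqueeze(0)  # (1, 3, H, W)


class ViewConditionedDenoiser(nn.Module):
    """Stand-in for a view-conditioned (latent) diffusion model; a real one is a
    U-Net conditioned on the target pose and features from the sparse input views."""

    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x_t, t, pose, context):
        return self.net(x_t)  # predicted noise


def distill_step(scene, denoiser, pose, context, alphas_cumprod, opt):
    """One mode-seeking distillation step: nudge the rendering toward images the
    view-conditioned denoiser considers likely for this target pose."""
    x = scene.render(pose)                              # rendered target view
    t = torch.randint(20, len(alphas_cumprod), (1,))    # random diffusion timestep
    a_t = alphas_cumprod[t].view(1, 1, 1, 1)
    noise = torch.randn_like(x)
    x_t = a_t.sqrt() * x + (1 - a_t).sqrt() * noise     # forward-noised rendering
    with torch.no_grad():                               # diffusion model stays frozen
        eps_hat = denoiser(x_t, t, pose, context)
    # SDS-style gradient: (eps_hat - noise) flows into the scene parameters only,
    # bypassing the U-Net Jacobian.
    loss = ((eps_hat - noise).detach() * x).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


scene = TinyRadianceField()
denoiser = ViewConditionedDenoiser()
alphas_cumprod = torch.linspace(0.999, 0.01, 1000)      # toy noise schedule
opt = torch.optim.Adam(scene.parameters(), lr=1e-2)
for step in range(200):
    pose = torch.randn(1, 12)   # placeholder camera pose for a sampled target view
    distill_step(scene, denoiser, pose, context=None,
                 alphas_cumprod=alphas_cumprod, opt=opt)
```

The design choice mirrored here is that gradients reach only the rendered image x, so the diffusion model acts purely as a frozen prior over what each target view should look like while the 3D representation enforces consistency across views.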
Related papers
- GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z)
- DiffHuman: Probabilistic Photorealistic 3D Reconstruction of Humans [38.8751809679184]
We present DiffHuman, a probabilistic method for 3D human reconstruction from a single RGB image.
Our experiments show that DiffHuman can produce diverse and detailed reconstructions for the parts of the person that are unseen or uncertain in the input image.
arXiv Detail & Related papers (2024-03-30T22:28:29Z)
- Denoising Diffusion via Image-Based Rendering [54.20828696348574]
We introduce the first diffusion model able to perform fast, detailed reconstruction and generation of real-world 3D scenes.
First, we introduce a new neural scene representation, IB-planes, that can efficiently and accurately represent large 3D scenes.
Second, we propose a denoising-diffusion framework to learn a prior over this novel 3D scene representation, using only 2D images.
arXiv Detail & Related papers (2024-02-05T19:00:45Z)
- WildFusion: Learning 3D-Aware Latent Diffusion Models in View Space [77.92350895927922]
We propose WildFusion, a new approach to 3D-aware image synthesis based on latent diffusion models (LDMs).
Our 3D-aware LDM is trained without any direct supervision from multiview images or 3D geometry.
This opens up promising research avenues for scalable 3D-aware image synthesis and 3D content creation from in-the-wild image data.
arXiv Detail & Related papers (2023-11-22T18:25:51Z) - Generative Novel View Synthesis with 3D-Aware Diffusion Models [96.78397108732233]
We present a diffusion-based model for 3D-aware generative novel view synthesis from as few as a single input image.
Our method makes use of existing 2D diffusion backbones but, crucially, incorporates geometry priors in the form of a 3D feature volume.
In addition to generating novel views, our method has the ability to autoregressively synthesize 3D-consistent sequences.
arXiv Detail & Related papers (2023-04-05T17:15:47Z) - 3inGAN: Learning a 3D Generative Model from Images of a Self-similar
Scene [34.2144933185175]
3inGAN is an unconditional 3D generative model trained from 2D images of a single self-similar 3D scene.
We show results on semi-stochastic scenes of varying scale and complexity, obtained from real and synthetic sources.
arXiv Detail & Related papers (2022-11-27T18:03:21Z)
- RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation [68.06991943974195]
We present RenderDiffusion, the first diffusion model for 3D generation and inference, trained using only monocular 2D supervision.
We evaluate RenderDiffusion on FFHQ, AFHQ, ShapeNet and CLEVR datasets, showing competitive performance for generation of 3D scenes and inference of 3D scenes from 2D images.
arXiv Detail & Related papers (2022-11-17T20:17:04Z)
- DreamFusion: Text-to-3D using 2D Diffusion [52.52529213936283]
Recent breakthroughs in text-to-image synthesis have been driven by diffusion models trained on billions of image-text pairs.
Adapting this approach to 3D would require large-scale labeled 3D data, which does not yet exist; in this work, we circumvent this limitation by using a pretrained 2D text-to-image diffusion model to perform text-to-3D synthesis.
Our approach requires no 3D training data and no modifications to the image diffusion model, demonstrating the effectiveness of pretrained image diffusion models as priors.
arXiv Detail & Related papers (2022-09-29T17:50:40Z)
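For reference, the way DreamFusion turns a frozen 2D diffusion model into a 3D prior is commonly written as the score distillation sampling (SDS) gradient below; the notation (g for the differentiable renderer, \hat{\epsilon}_\phi for the pretrained denoiser, w(t) for a timestep weighting, y for the text prompt) follows the DreamFusion paper rather than anything defined in this list.

```latex
% Score distillation sampling (SDS) gradient, as introduced in DreamFusion:
% x = g(\theta) is an image rendered from the 3D parameters \theta.
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(x_t;\, y,\, t) - \epsilon\bigr)
      \frac{\partial x}{\partial \theta}
    \right],
  \qquad x_t = \sqrt{\bar{\alpha}_t}\, x + \sqrt{1-\bar{\alpha}_t}\,\epsilon .
```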
This list is automatically generated from the titles and abstracts of the papers on this site.