GaussianDiffusion: 3D Gaussian Splatting for Denoising Diffusion
Probabilistic Models with Structured Noise
- URL: http://arxiv.org/abs/2311.11221v1
- Date: Sun, 19 Nov 2023 04:26:16 GMT
- Title: GaussianDiffusion: 3D Gaussian Splatting for Denoising Diffusion
Probabilistic Models with Structured Noise
- Authors: Xinhai Li and Huaibin Wang and Kuo-Kun Tseng
- Abstract summary: This paper introduces a novel text-to-3D content generation framework based on Gaussian splatting.
We employ multi-view noise distributions to perturb images generated by 3D Gaussian splatting.
To our knowledge, our approach represents the first comprehensive utilization of Gaussian splatting across the entire spectrum of 3D content generation processes.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-3D, known for its efficient generation methods and expansive creative
potential, has garnered significant attention in the AIGC domain. However, the
amalgamation of NeRF and 2D diffusion models frequently yields oversaturated
images, posing severe limitations on downstream industrial applications due to
the constraints of the pixelwise rendering method. Gaussian splatting has recently
superseded the traditional pointwise sampling technique prevalent in NeRF-based
methodologies, revolutionizing various aspects of 3D reconstruction. This paper
introduces a novel text-to-3D content generation framework based on Gaussian
splatting, enabling fine control over image saturation through individual
Gaussian sphere transparencies, thereby producing more realistic images. The
challenge of achieving multi-view consistency in 3D generation significantly
impedes modeling complexity and accuracy. Taking inspiration from SJC, we
explore employing multi-view noise distributions to perturb images generated by
3D Gaussian splatting, aiming to rectify inconsistencies in multi-view
geometry. We ingeniously devise an efficient method to generate noise that
produces Gaussian noise from diverse viewpoints, all originating from a shared
noise source. Furthermore, vanilla 3D Gaussian-based generation tends to trap
models in local minima, causing artifacts like floaters, burrs, or
proliferative elements. To mitigate these issues, we propose the variational
Gaussian splatting technique to enhance the quality and stability of 3D
appearance. To our knowledge, our approach represents the first comprehensive
utilization of Gaussian splatting across the entire spectrum of 3D content
generation processes.
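
The abstract does not spell out how noise for different viewpoints is derived from a shared source, so the following is only an illustrative sketch and not the authors' method. It assumes a hypothetical shared source made of random 3D points that each carry one standard normal value. Each view rotates the points, projects them orthographically onto a pixel grid, sums the values landing in each pixel, and divides by the square root of the count so that every pixel remains a unit-variance Gaussian. The function names, the point-cloud source, and the orthographic camera are all assumptions made for illustration.

```python
import numpy as np


def make_shared_source(num_points=20000, seed=0):
    """Hypothetical shared noise source: points in the unit ball, one N(0, 1) value each."""
    rng = np.random.default_rng(seed)
    dirs = rng.standard_normal((num_points, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    radii = rng.uniform(0.0, 1.0, num_points) ** (1.0 / 3.0)
    points = dirs * radii[:, None]
    values = rng.standard_normal(num_points)
    return points, values


def view_noise(points, values, yaw, size=32, seed=0):
    """Project the shared source into one view (rotation about the y-axis by `yaw`)."""
    c, s = np.cos(yaw), np.sin(yaw)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    cam = points @ rot.T

    # Orthographic projection (assumption): drop depth, map [-1, 1] to pixel indices.
    pix = np.floor((cam[:, :2] + 1.0) * 0.5 * size).astype(int)
    pix = np.clip(pix, 0, size - 1)

    sums = np.zeros((size, size))
    counts = np.zeros((size, size))
    np.add.at(sums, (pix[:, 1], pix[:, 0]), values)
    np.add.at(counts, (pix[:, 1], pix[:, 0]), 1.0)

    # A sum of k independent N(0, 1) values divided by sqrt(k) is again N(0, 1).
    # Pixels that no point reaches fall back to independent noise (same seed in every view).
    fallback = np.random.default_rng(seed).standard_normal((size, size))
    return np.where(counts > 0, sums / np.sqrt(np.maximum(counts, 1.0)), fallback)


points, values = make_shared_source()
noise_front = view_noise(points, values, yaw=0.0)
noise_side = view_noise(points, values, yaw=0.2)
```

Both noise maps are unit-variance Gaussian per pixel, yet they are tied to the same underlying 3D values, which is the property the abstract attributes to its multi-view noise. A perspective camera or a different source representation would change the projection step but not the normalization idea.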
Related papers
- GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z)
- Effective Rank Analysis and Regularization for Enhanced 3D Gaussian Splatting [33.01987451251659]
3D Gaussian Splatting (3DGS) has emerged as a promising technique capable of real-time rendering with high-quality 3D reconstruction.
Despite its potential, 3DGS encounters challenges, including needle-like artifacts, suboptimal geometries, and inaccurate normals.
We introduce effective rank as a regularization, which constrains the structure of the Gaussians.
arXiv Detail & Related papers (2024-06-17T15:51:59Z)
- Adversarial Generation of Hierarchical Gaussians for 3D Generative Model [20.833116566243408]
In this paper, we exploit Gaussian as a 3D representation for 3D GANs by leveraging its efficient and explicit characteristics.
We introduce a generator architecture with a hierarchical multi-scale Gaussian representation that effectively regularizes the position and scale of generated Gaussians.
Experimental results demonstrate that our method achieves a significantly faster rendering speed (×100) compared to state-of-the-art 3D-consistent GANs.
arXiv Detail & Related papers (2024-06-05T05:52:20Z)
- A Refined 3D Gaussian Representation for High-Quality Dynamic Scene Reconstruction [2.022451212187598]
In recent years, Neural Radiance Fields (NeRF) has revolutionized three-dimensional (3D) reconstruction with its implicit representation.
3D Gaussian Splatting (3D-GS) has departed from the implicit representation of neural networks and instead directly represents scenes as point clouds with Gaussian-shaped distributions.
This paper proposes a refined 3D Gaussian representation for high-quality dynamic scene reconstruction.
Experimental results demonstrate that our method surpasses existing approaches in rendering quality and speed, while significantly reducing the memory usage associated with 3D-GS.
arXiv Detail & Related papers (2024-05-28T07:12:22Z)
- HO-Gaussian: Hybrid Optimization of 3D Gaussian Splatting for Urban Scenes [24.227745405760697]
We propose a hybrid optimization method named HO-Gaussian, which combines a grid-based volume with the 3DGS pipeline.
Results on widely used autonomous driving datasets demonstrate that HO-Gaussian achieves photo-realistic rendering in real-time on multi-camera urban datasets.
arXiv Detail & Related papers (2024-03-29T07:58:21Z)
- GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling [55.05713977022407]
We introduce a radiance representation that is both structured and fully explicit and thus greatly facilitates 3D generative modeling.
We derive GaussianCube by first using a novel densification-constrained Gaussian fitting algorithm, which yields high-accuracy fitting.
Experiments conducted on unconditional and class-conditioned object generation, digital avatar creation, and text-to-3D all show that our model synthesis achieves state-of-the-art generation results.
arXiv Detail & Related papers (2024-03-28T17:59:50Z)
- FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model [81.03553265684184]
We introduce FDGaussian, a novel two-stage framework for single-image 3D reconstruction.
Recent methods typically utilize pre-trained 2D diffusion models to generate plausible novel views from the input image.
We demonstrate that FDGaussian generates images with high consistency across different views and reconstructs high-quality 3D objects.
arXiv Detail & Related papers (2024-03-15T12:24:36Z)
- Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting [9.383423119196408]
We introduce Multi-view ControlNet (MVControl), a novel neural network architecture designed to enhance existing multi-view diffusion models.
MVControl is able to offer 3D diffusion guidance for optimization-based 3D generation.
In pursuit of efficiency, we adopt 3D Gaussians as our representation instead of the commonly used implicit representations.
arXiv Detail & Related papers (2024-03-15T02:57:20Z)
- Spec-Gaussian: Anisotropic View-Dependent Appearance for 3D Gaussian Splatting [57.80942520483354]
3D-GS frequently encounters difficulties in accurately modeling specular and anisotropic components.
We introduce Spec-Gaussian, an approach that utilizes an anisotropic spherical Gaussian appearance field instead of spherical harmonics.
Our experimental results demonstrate that our method surpasses existing approaches in terms of rendering quality.
arXiv Detail & Related papers (2024-02-24T17:22:15Z)
- GIR: 3D Gaussian Inverse Rendering for Relightable Scene Factorization [76.52007427483396]
GIR is a 3D Gaussian Inverse Rendering method for relightable scene factorization.
Our method utilizes 3D Gaussians to estimate the material properties, illumination, and geometry of an object from multi-view images.
arXiv Detail & Related papers (2023-12-08T16:05:15Z)
- NeRF-GAN Distillation for Efficient 3D-Aware Generation with Convolutions [97.27105725738016]
The integration of Neural Radiance Fields (NeRFs) and generative models, such as Generative Adversarial Networks (GANs), has transformed 3D-aware generation from single-view images.
We propose a simple and effective method, based on re-using the well-disentangled latent space of a pre-trained NeRF-GAN in a pose-conditioned convolutional network to directly generate 3D-consistent images corresponding to the underlying 3D representations.
arXiv Detail & Related papers (2023-03-22T18:59:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.