SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
- URL: http://arxiv.org/abs/2006.14660v2
- Date: Wed, 28 Apr 2021 15:15:45 GMT
- Title: SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
- Authors: Angela Dai, Yawar Siddiqui, Justus Thies, Julien Valentin, Matthias Nießner
- Abstract summary: SPSG is a novel approach to generate high-quality, colored 3D models of scenes from RGB-D scan observations.
Our self-supervised approach learns to jointly inpaint geometry and color by correlating an incomplete RGB-D scan with a more complete version of that scan.
- Score: 34.397726189729994
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present SPSG, a novel approach to generate high-quality, colored 3D models
of scenes from RGB-D scan observations by learning to infer unobserved scene
geometry and color in a self-supervised fashion. Our self-supervised approach
learns to jointly inpaint geometry and color by correlating an incomplete RGB-D
scan with a more complete version of that scan. Notably, rather than relying on
3D reconstruction losses to inform our 3D geometry and color reconstruction, we
propose adversarial and perceptual losses operating on 2D renderings in order
to achieve high-resolution, high-quality colored reconstructions of scenes.
This exploits the high-resolution, self-consistent signal from individual raw
RGB-D frames, in contrast to fused 3D reconstructions of the frames which
exhibit inconsistencies from view-dependent effects, such as color balancing or
pose inconsistencies. Thus, by informing our 3D scene generation directly
through 2D signal, we produce high-quality colored reconstructions of 3D
scenes, outperforming state of the art on both synthetic and real data.
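As a rough illustration of the training setup described above (not the authors' implementation), the following PyTorch-style sketch completes a TSDF+color volume from an incomplete scan, renders it to 2D with a toy differentiable projection, and supervises it with adversarial and perceptual losses against raw RGB frames. All module names, layer sizes, the soft projection, and the loss weights are assumptions made for illustration.
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Completion3D(nn.Module):
    """Toy 3D network: incomplete TSDF+RGB grid -> completed TSDF+RGB grid."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(4, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv3d(16, 4, 3, padding=1),
        )

    def forward(self, x):  # x: (B, 4, D, H, W) = 1 TSDF channel + 3 color channels
        return self.net(x)

def render_to_image(volume):
    """Stand-in for SPSG's differentiable rendering: soft projection along depth.

    Each depth slice is weighted by a soft surface indicator derived from the TSDF
    channel, keeping the color image differentiable w.r.t. the predicted volume.
    """
    tsdf, rgb = volume[:, :1], volume[:, 1:]
    surface_weight = torch.softmax(-tsdf.abs().squeeze(1), dim=1)  # (B, D, H, W)
    return (rgb * surface_weight.unsqueeze(1)).sum(dim=2)          # (B, 3, H, W)

class PatchDiscriminator(nn.Module):
    """Small 2D discriminator applied to rendered color images."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, img):
        return self.net(img)

# Frozen random conv as a cheap stand-in for VGG-style perceptual features.
perceptual = nn.Conv2d(3, 8, 3, padding=1).requires_grad_(False)

def training_step(gen, disc, incomplete_volume, target_frame, g_opt, d_opt):
    completed = gen(incomplete_volume)
    fake = render_to_image(completed)

    # Discriminator: raw RGB frame (real) vs. rendering of the completed volume (fake).
    real_logits, fake_logits = disc(target_frame), disc(fake.detach())
    d_loss = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits)) \
           + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator: fool the discriminator and match perceptual features of the raw frame,
    # so supervision comes from 2D renderings rather than a 3D reconstruction loss.
    fake_logits = disc(fake)
    adv = F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))
    perc = F.l1_loss(perceptual(fake), perceptual(target_frame))
    g_loss = adv + 0.1 * perc
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

# Example with dummy data (grid and image sizes are arbitrary for the sketch):
# gen, disc = Completion3D(), PatchDiscriminator()
# g_opt = torch.optim.Adam(gen.parameters(), 1e-4); d_opt = torch.optim.Adam(disc.parameters(), 1e-4)
# training_step(gen, disc, torch.randn(1, 4, 32, 64, 64), torch.rand(1, 3, 64, 64), g_opt, d_opt)
```
The design choice mirrored here is that gradients reach the 3D completion network only through its 2D renderings, so the high-resolution, self-consistent signal of the raw frames supervises both geometry and color.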
Related papers
- Enhancing Single Image to 3D Generation using Gaussian Splatting and Hybrid Diffusion Priors [17.544733016978928]
3D object generation from a single image involves estimating the full 3D geometry and texture of unseen views from an unposed RGB image captured in the wild.
Recent advancements in 3D object generation have introduced techniques that reconstruct an object's 3D shape and texture.
We propose bridging the gap between 2D and 3D diffusion models to address this limitation.
arXiv Detail & Related papers (2024-10-12T10:14:11Z)
- SpecGaussian with Latent Features: A High-quality Modeling of the View-dependent Appearance for 3D Gaussian Splatting [11.978842116007563]
Latent-SpecGS is an approach that utilizes a universal latent neural descriptor within each 3D Gaussian.
Two parallel CNNs are designed to decode the splatting feature maps into diffuse color and specular color separately.
A mask that depends on the viewpoint is learned to merge these two colors, resulting in the final rendered image (a hedged sketch of this two-branch decoding follows below).
arXiv Detail & Related papers (2024-08-23T15:25:08Z)
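The following is a minimal sketch of the two-branch decoding described above, assuming the rasterized latent descriptors arrive as a per-pixel feature map and that a per-pixel view-direction map is available; the layer sizes, the names (`TwoBranchDecoder`, `feat_dim`), and the exact blending rule are assumptions of this sketch, not the Latent-SpecGS implementation.
```python
import torch
import torch.nn as nn

def small_cnn(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, out_ch, 3, padding=1), nn.Sigmoid(),
    )

class TwoBranchDecoder(nn.Module):
    """Two parallel CNNs decode splatted latent features into diffuse and specular color;
    a learned view-dependent mask merges them into the final image."""
    def __init__(self, feat_dim=16):
        super().__init__()
        self.diffuse_cnn = small_cnn(feat_dim, 3)       # features only -> diffuse RGB
        self.specular_cnn = small_cnn(feat_dim + 3, 3)  # features + view dirs -> specular RGB
        self.mask_cnn = small_cnn(feat_dim + 3, 1)      # viewpoint-dependent blending mask

    def forward(self, feat_map, view_dirs):
        # feat_map: (B, feat_dim, H, W) splatted latent descriptors
        # view_dirs: (B, 3, H, W) per-pixel viewing directions
        diffuse = self.diffuse_cnn(feat_map)
        cond = torch.cat([feat_map, view_dirs], dim=1)
        specular = self.specular_cnn(cond)
        mask = self.mask_cnn(cond)
        # One plausible merge; the exact blending rule is an assumption of this sketch.
        return mask * specular + (1.0 - mask) * diffuse

# Example: TwoBranchDecoder()(torch.randn(1, 16, 64, 64), torch.randn(1, 3, 64, 64)).shape
```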
- 2D Gaussian Splatting for Geometrically Accurate Radiance Fields [50.056790168812114]
3D Gaussian Splatting (3DGS) has recently revolutionized radiance field reconstruction, achieving high quality novel view synthesis and fast rendering speed without baking.
We present 2D Gaussian Splatting (2DGS), a novel approach to model and reconstruct geometrically accurate radiance fields from multi-view images.
We demonstrate that our differentiable terms allow for noise-free and detailed geometry reconstruction while maintaining competitive appearance quality, fast training speed, and real-time rendering.
arXiv Detail & Related papers (2024-03-26T17:21:24Z)
- UNeR3D: Versatile and Scalable 3D RGB Point Cloud Generation from 2D Images in Unsupervised Reconstruction [2.7848140839111903]
UNeR3D sets a new standard for generating detailed 3D reconstructions solely from 2D views.
Our model significantly cuts down the training costs tied to supervised approaches.
UNeR3D ensures seamless color transitions, enhancing visual fidelity.
arXiv Detail & Related papers (2023-12-10T15:18:55Z)
- SSR-2D: Semantic 3D Scene Reconstruction from 2D Images [54.46126685716471]
In this work, we explore a central 3D scene modeling task, namely, semantic scene reconstruction without using any 3D annotations.
The key idea of our approach is to design a trainable model that employs both incomplete 3D reconstructions and their corresponding source RGB-D images.
Our method achieves state-of-the-art performance in semantic scene completion on two large-scale benchmark datasets, Matterport3D and ScanNet.
arXiv Detail & Related papers (2023-02-07T17:47:52Z)
- High-fidelity 3D GAN Inversion by Pseudo-multi-view Optimization [51.878078860524795]
We present a high-fidelity 3D generative adversarial network (GAN) inversion framework that can synthesize photo-realistic novel views.
Our approach enables high-fidelity 3D rendering from a single image, which is promising for various applications of AI-generated 3D content.
arXiv Detail & Related papers (2022-11-28T18:59:52Z)
- PhotoScene: Photorealistic Material and Lighting Transfer for Indoor Scenes [84.66946637534089]
PhotoScene is a framework that takes input image(s) of a scene and builds a photorealistic digital twin with high-quality materials and similar lighting.
We model scene materials using procedural material graphs; such graphs represent photorealistic and resolution-independent materials.
We evaluate our technique on objects and layout reconstructions from ScanNet, SUN RGB-D and stock photographs, and demonstrate that our method reconstructs high-quality, fully relightable 3D scenes.
arXiv Detail & Related papers (2022-07-02T06:52:44Z)
- Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing [41.34640834483265]
We present PHORHUM, a novel, end-to-end trainable, deep neural network methodology for photorealistic 3D human reconstruction given just a monocular RGB image.
Our pixel-aligned method estimates detailed 3D geometry and, for the first time, the unshaded surface color together with the scene illumination.
arXiv Detail & Related papers (2022-04-19T14:06:16Z)
- 3D-GIF: 3D-Controllable Object Generation via Implicit Factorized Representations [31.095503715696722]
We propose factorized representations that are view-independent and light-disentangled, together with training schemes that use randomly sampled light conditions.
We demonstrate the superiority of our method by visualizing factorized representations, re-lighted images, and albedo-textured meshes.
This is the first work that extracts albedo-textured meshes from unposed 2D images without any additional labels or assumptions.
arXiv Detail & Related papers (2022-03-12T15:23:17Z)
- Urban Radiance Fields [77.43604458481637]
We perform 3D reconstruction and novel view synthesis from data captured by scanning platforms commonly deployed for world mapping in urban outdoor environments.
Our approach extends Neural Radiance Fields, which has been demonstrated to synthesize realistic novel images for small scenes in controlled settings.
The approach adds three extensions to NeRF, each of which provides significant performance improvements in experiments on Street View data.
arXiv Detail & Related papers (2021-11-29T15:58:16Z)
- 3D Photography using Context-aware Layered Depth Inpainting [50.66235795163143]
We propose a method for converting a single RGB-D input image into a 3D photo.
A learning-based inpainting model synthesizes new local color-and-depth content into the occluded region.
The resulting 3D photos can be efficiently rendered with motion parallax (a toy forward-warping sketch of the parallax idea follows after this list).
arXiv Detail & Related papers (2020-04-09T17:59:06Z)
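To make the motion-parallax idea from the last entry concrete, here is a minimal NumPy sketch, assuming pinhole intrinsics and a small lateral camera shift (all numbers made up): it back-projects the RGB-D input to 3D and forward-warps it into the shifted view. It deliberately omits the paper's layered depth image and learned color-and-depth inpainting, so disoccluded regions stay black.
```python
import numpy as np

def novel_view(rgb, depth, fx=500.0, fy=500.0, cx=None, cy=None, baseline=0.03):
    """Back-project an RGB-D image and forward-warp it into a laterally shifted camera."""
    h, w, _ = rgb.shape
    cx = w / 2.0 if cx is None else cx
    cy = h / 2.0 if cy is None else cy
    depth = np.maximum(depth, 1e-6)                  # avoid division by zero
    u, v = np.meshgrid(np.arange(w), np.arange(h))   # pixel coordinates, shape (h, w)
    # Back-project every pixel into camera space.
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    # Reproject into a camera shifted sideways by `baseline` metres.
    u2 = np.round((x - baseline) * fx / depth + cx).astype(int)
    v2 = np.round(y * fy / depth + cy).astype(int)
    valid = (u2 >= 0) & (u2 < w) & (v2 >= 0) & (v2 < h)
    # Splat far-to-near so nearer points overwrite farther ones; holes stay black
    # (these are the disocclusions the learned inpainting model fills in the paper).
    order = np.argsort(-depth[valid])
    out = np.zeros_like(rgb)
    out[v2[valid][order], u2[valid][order]] = rgb[v[valid][order], u[valid][order]]
    return out

# Example: novel_view(np.random.rand(240, 320, 3), np.full((240, 320), 2.0)).shape
```
The holes left by this warp are exactly the occluded regions that the context-aware inpainting model is trained to fill with new local color-and-depth content.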