Learning 3D Scene Priors with 2D Supervision
- URL: http://arxiv.org/abs/2211.14157v1
- Date: Fri, 25 Nov 2022 15:03:32 GMT
- Title: Learning 3D Scene Priors with 2D Supervision
- Authors: Yinyu Nie, Angela Dai, Xiaoguang Han, Matthias Nießner
- Abstract summary: We propose a new method to learn 3D scene priors of layout and shape without requiring any 3D ground truth.
Our method represents a 3D scene as a latent vector, from which we can progressively decode to a sequence of objects characterized by their class categories.
Experiments on 3D-FRONT and ScanNet show that our method outperforms state of the art in single-view reconstruction.
- Score: 37.79852635415233
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Holistic 3D scene understanding entails estimation of both layout
configuration and object geometry in a 3D environment. Recent works have shown
advances in 3D scene estimation from various input modalities (e.g., images, 3D
scans), by leveraging 3D supervision (e.g., 3D bounding boxes or CAD models),
for which collection at scale is expensive and often intractable. To address
this shortcoming, we propose a new method to learn 3D scene priors of layout
and shape without requiring any 3D ground truth. Instead, we rely on 2D
supervision from multi-view RGB images. Our method represents a 3D scene as a
latent vector, from which we can progressively decode to a sequence of objects
characterized by their class categories, 3D bounding boxes, and meshes. With
our trained autoregressive decoder representing the scene prior, our method
facilitates many downstream applications, including scene synthesis,
interpolation, and single-view reconstruction. Experiments on 3D-FRONT and
ScanNet show that our method outperforms state of the art in single-view
reconstruction, and achieves state-of-the-art results in scene synthesis
against baselines that require 3D supervision.
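The decoder described in the abstract can be pictured as a small autoregressive model: a scene latent initializes a recurrent state, and each step emits one object as class logits, 3D bounding-box parameters, and a shape latent for a separate mesh decoder. The sketch below is an illustrative PyTorch approximation of that idea only; the GRU cell, layer sizes, 7-parameter box encoding, and end-of-scene token are assumptions, not the authors' exact architecture.

```python
# Minimal sketch of an autoregressive scene-prior decoder (illustrative, not the paper's model):
# a scene latent is decoded step by step into a sequence of objects
# (class logits, 3D bounding box, shape latent).
import torch
import torch.nn as nn

class ScenePriorDecoder(nn.Module):
    def __init__(self, latent_dim=512, hidden_dim=512, num_classes=20, shape_dim=128):
        super().__init__()
        self.init_hidden = nn.Linear(latent_dim, hidden_dim)     # scene latent -> initial state
        self.cell = nn.GRUCell(num_classes + 7 + shape_dim, hidden_dim)
        self.cls_head = nn.Linear(hidden_dim, num_classes + 1)   # +1 for an end-of-scene token
        self.box_head = nn.Linear(hidden_dim, 7)                 # center (3), size (3), yaw (1)
        self.shape_head = nn.Linear(hidden_dim, shape_dim)       # latent consumed by a mesh decoder

    def forward(self, scene_latent, max_objects=16):
        h = self.init_hidden(scene_latent)                       # (B, hidden_dim)
        prev = torch.zeros(scene_latent.size(0), self.cell.input_size,
                           device=scene_latent.device)
        objects = []
        for _ in range(max_objects):
            h = self.cell(prev, h)
            cls_logits = self.cls_head(h)
            box = self.box_head(h)
            shape_z = self.shape_head(h)
            objects.append((cls_logits, box, shape_z))
            # feed the current prediction back in as the next step's input
            prev = torch.cat([cls_logits[:, :-1], box, shape_z], dim=-1)
        return objects
```

In such a setup, a separate shape decoder would turn each shape latent into a mesh, and 2D supervision would come from rendering the predicted boxes and meshes and comparing against the multi-view RGB images, consistent with the training signal described in the abstract.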
Related papers
- General Geometry-aware Weakly Supervised 3D Object Detection [62.26729317523975]
A unified framework is developed for learning 3D object detectors from RGB images and associated 2D boxes.
Experiments on KITTI and SUN-RGBD datasets demonstrate that our method yields surprisingly high-quality 3D bounding boxes with only 2D annotation.
arXiv Detail & Related papers (2024-07-18T17:52:08Z)
- SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
- Model2Scene: Learning 3D Scene Representation via Contrastive Language-CAD Models Pre-training [105.3421541518582]
Current successful methods for 3D scene perception rely on large-scale annotated point clouds.
We propose Model2Scene, a novel paradigm that learns free 3D scene representation from Computer-Aided Design (CAD) models and languages.
Model2Scene yields impressive label-free 3D salient object detection, with an average mAP of 46.08% and 55.49% on the ScanNet and S3DIS datasets, respectively.
arXiv Detail & Related papers (2023-09-29T03:51:26Z)
- Neural 3D Scene Reconstruction from Multiple 2D Images without 3D Supervision [41.20504333318276]
We propose a novel neural reconstruction method that reconstructs scenes using sparse depth under plane constraints, without 3D supervision.
We introduce a signed distance function field, a color field, and a probability field to represent a scene.
We optimize these fields to reconstruct the scene by using differentiable ray marching with accessible 2D images as supervision.
arXiv Detail & Related papers (2023-06-30T13:30:48Z)
- NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization [80.3424839706698]
We present NeurOCS, a framework that uses instance masks and 3D boxes as input to learn 3D object shapes by means of differentiable rendering.
Our approach rests on insights in learning a category-level shape prior directly from real driving scenes.
We make critical design choices to learn object coordinates more effectively from an object-centric view.
arXiv Detail & Related papers (2023-05-28T16:18:41Z)
- CC3D: Layout-Conditioned Generation of Compositional 3D Scenes [49.281006972028194]
We introduce CC3D, a conditional generative model that synthesizes complex 3D scenes conditioned on 2D semantic scene layouts.
Our evaluations on synthetic 3D-FRONT and real-world KITTI-360 datasets demonstrate that our model generates scenes of improved visual and geometric quality.
arXiv Detail & Related papers (2023-03-21T17:59:02Z)
- 3inGAN: Learning a 3D Generative Model from Images of a Self-similar Scene [34.2144933185175]
3inGAN is an unconditional 3D generative model trained from 2D images of a single self-similar 3D scene.
We show results on semi-stochastic scenes of varying scale and complexity, obtained from real and synthetic sources.
arXiv Detail & Related papers (2022-11-27T18:03:21Z)
- Learning 3D Object Shape and Layout without 3D Supervision [26.575177430506667]
A 3D scene consists of a set of objects, each with a shape and a layout giving their position in space.
We propose a method that learns to predict 3D shape and layout for objects without any ground truth shape or layout information.
Our approach outperforms supervised approaches trained on smaller and less diverse datasets.
arXiv Detail & Related papers (2022-06-14T17:49:44Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of synthetic datasets, which consist of CAD object models, to boost learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.