SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
- URL: http://arxiv.org/abs/2003.12622v1
- Date: Fri, 27 Mar 2020 20:17:00 GMT
- Title: SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
- Authors: Armen Avetisyan, Tatiana Khanova, Christopher Choy, Denver Dash,
Angela Dai, Matthias Nießner
- Abstract summary: We present a novel approach to reconstructing lightweight, CAD-based representations of scanned 3D environments from commodity RGB-D sensors.
Our key idea is to jointly optimize for both CAD model alignments as well as layout estimations of the scanned scene.
- Score: 24.06640371472068
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a novel approach to reconstructing lightweight, CAD-based
representations of scanned 3D environments from commodity RGB-D sensors. Our
key idea is to jointly optimize for both CAD model alignments as well as layout
estimations of the scanned scene, explicitly modeling inter-relationships
between objects and between objects and the layout. Since object arrangement and
scene layout are intrinsically coupled, we show that treating the problem
jointly significantly helps to produce globally-consistent representations of a
scene. Object CAD models are aligned to the scene by establishing dense
correspondences between geometry, and we introduce a hierarchical layout
prediction approach to estimate layout planes from corners and edges of the
scene. To this end, we propose a message-passing graph neural network to model
the inter-relationships between objects and layout, guiding generation of a
globally consistent object alignment in a scene. By considering the global scene layout,
we achieve significantly improved CAD alignments compared to state-of-the-art
methods, improving from 41.83% to 58.41% alignment accuracy on SUNCG and from
50.05% to 61.24% on ScanNet, respectively. The resulting CAD-based
representations make our method well-suited for applications in content
creation such as augmented or virtual reality.
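The abstract describes a message-passing graph neural network that models inter-relationships between objects and layout elements. The snippet below is a minimal sketch of one such message-passing round, not the authors' implementation: the class name, feature dimensions, fully connected graph construction, and use of PyTorch are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class RelationMessagePassing(nn.Module):
    """One round of message passing over a fully connected graph whose nodes
    are candidate object alignments and layout elements (hypothetical setup)."""

    def __init__(self, node_dim: int = 128, msg_dim: int = 128):
        super().__init__()
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim, msg_dim), nn.ReLU(),
            nn.Linear(msg_dim, msg_dim))
        self.node_mlp = nn.Sequential(
            nn.Linear(node_dim + msg_dim, node_dim), nn.ReLU(),
            nn.Linear(node_dim, node_dim))

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (N, node_dim) features, one row per object or layout element
        n = nodes.shape[0]
        sender = nodes.unsqueeze(1).expand(n, n, -1)    # (N, N, node_dim)
        receiver = nodes.unsqueeze(0).expand(n, n, -1)  # (N, N, node_dim)
        messages = self.edge_mlp(torch.cat([sender, receiver], dim=-1))
        # Sum messages arriving at each receiver, excluding self-messages.
        mask = 1.0 - torch.eye(n, device=nodes.device).unsqueeze(-1)
        aggregated = (messages * mask).sum(dim=0)       # (N, msg_dim)
        return self.node_mlp(torch.cat([nodes, aggregated], dim=-1))


# Toy usage: 5 object nodes and 3 layout-plane nodes with 128-d features.
features = torch.randn(8, 128)
refined = RelationMessagePassing()(features)
print(refined.shape)  # torch.Size([8, 128])
```

In the paper, relationship reasoning of this kind guides which candidate CAD alignments and layout planes are retained for a globally consistent scene; the graph construction and feature design here are deliberately simplified.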
Related papers
- SceneWiz3D: Towards Text-guided 3D Scene Composition [134.71933134180782]
Existing approaches either leverage large text-to-image models to optimize a 3D representation or train 3D generators on object-centric datasets.
We introduce SceneWiz3D, a novel approach to synthesize high-fidelity 3D scenes from text.
arXiv Detail & Related papers (2023-12-13T18:59:30Z)
- Sparse Multi-Object Render-and-Compare [33.97243145891282]
Reconstructing 3D shape and pose of static objects from a single image is an essential task for various industries.
Directly predicting 3D shapes produces unrealistic, overly smoothed or tessellated shapes.
Retrieving CAD models ensures realistic shapes but requires robust and accurate alignment.
arXiv Detail & Related papers (2023-10-17T12:01:32Z) - CommonScenes: Generating Commonsense 3D Indoor Scenes with Scene Graph
Diffusion [83.30168660888913]
We present CommonScenes, a fully generative model that converts scene graphs into corresponding controllable 3D scenes.
Our pipeline consists of two branches, one predicting the overall scene layout via a variational auto-encoder and the other generating compatible shapes.
The generated scenes can be manipulated by editing the input scene graph and sampling the noise in the diffusion model.
arXiv Detail & Related papers (2023-05-25T17:39:13Z) - Joint stereo 3D object detection and implicit surface reconstruction [39.30458073540617]
We present a new learning-based framework S-3D-RCNN that can recover accurate object orientation in SO(3) and simultaneously predict implicit rigid shapes from stereo RGB images.
For orientation estimation, in contrast to previous studies that map local appearance to observation angles, we propose a progressive approach by extracting meaningful Intermediate Geometrical Representations (IGRs).
This approach features a deep model that transforms perceived intensities from one or two views to object part coordinates to achieve direct egocentric object orientation estimation in the camera coordinate system.
To further achieve a finer description inside 3D bounding boxes, we investigate the implicit shape estimation problem from stereo images.
arXiv Detail & Related papers (2021-11-25T05:52:30Z)
- Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image [58.953160501596805]
We propose a novel approach towards constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion.
Our approach is more robust than the state of the art in real-world scenarios without any exact CAD matches.
arXiv Detail & Related papers (2021-08-20T20:58:52Z)
- Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments [81.38641691636847]
We rethink the problem of scene reconstruction from an embodied agent's perspective.
We reconstruct an interactive scene from an RGB-D data stream.
The reconstruction replaces the object meshes in the dense panoptic map with part-based, articulated CAD models.
arXiv Detail & Related papers (2021-03-30T05:56:58Z)
- Vid2CAD: CAD Model Alignment using Multi-View Constraints from Videos [48.69114433364771]
We address the task of aligning CAD models to a video sequence of a complex scene containing multiple objects.
Our method is able to process arbitrary videos and fully automatically recover the 9-DoF pose of each object appearing in them, thus aligning the objects in a common 3D coordinate frame.
arXiv Detail & Related papers (2020-12-08T18:57:45Z)
- Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve [54.054575408582565]
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image.
We present Mask2CAD, which jointly detects objects in real-world images and, for each detected object, optimizes for the most similar CAD model and its pose.
This produces a clean, lightweight representation of the objects in an image.
arXiv Detail & Related papers (2020-07-26T00:08:37Z)