Automatically Annotating Indoor Images with CAD Models via RGB-D Scans
- URL: http://arxiv.org/abs/2212.11796v1
- Date: Thu, 22 Dec 2022 15:27:25 GMT
- Title: Automatically Annotating Indoor Images with CAD Models via RGB-D Scans
- Authors: Stefan Ainetter, Sinisa Stekovic, Friedrich Fraundorfer, Vincent
Lepetit
- Abstract summary: We present an automatic method for annotating images of indoor scenes with the CAD models of the objects by relying on RGB-D scans.
We show that our method retrieves annotations that are at least as accurate as manual annotations, and can thus be used as ground truth without the burden of manually annotating 3D data.
- Score: 36.52980906432878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an automatic method for annotating images of indoor scenes with
the CAD models of the objects by relying on RGB-D scans. Through a visual
evaluation by 3D experts, we show that our method retrieves annotations that
are at least as accurate as manual annotations, and can thus be used as ground
truth without the burden of manually annotating 3D data. We do this using an
analysis-by-synthesis approach, which compares renderings of the CAD models
with the captured scene. We introduce a 'cloning procedure' that identifies
objects that have the same geometry, to annotate these objects with the same
CAD models. This allows us to obtain complete annotations for the ScanNet
dataset and the recent ARKitScenes dataset.
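The analysis-by-synthesis idea in the abstract can be sketched as a render-and-compare loop: render a candidate CAD model at a candidate pose, and score how well the rendered depth agrees with the captured RGB-D depth. The sketch below is illustrative only, not the authors' implementation; `fitting_score` and the toy depth maps are hypothetical, and a real pipeline would use an actual depth renderer.

```python
# Illustrative sketch (not the paper's implementation): score a candidate
# CAD model pose by comparing its rendered depth map against the captured
# RGB-D depth map. A lower score means a better fit.

def fitting_score(rendered_depth, captured_depth, missing=0.0):
    """Mean absolute depth difference over pixels where both the rendering
    and the capture have valid depth (background/holes are skipped)."""
    diffs = []
    for r_row, c_row in zip(rendered_depth, captured_depth):
        for r, c in zip(r_row, c_row):
            if r != missing and c != missing:  # ignore background and holes
                diffs.append(abs(r - c))
    return sum(diffs) / len(diffs) if diffs else float("inf")

# Toy example: a 2x3 rendered silhouette vs. a noisy captured depth (metres).
rendered = [[1.0, 1.0, 0.0],
            [1.0, 1.0, 0.0]]
captured = [[1.1, 0.9, 2.0],
            [1.0, 1.2, 2.0]]
score = fitting_score(rendered, captured)
```

In a full system this score would be evaluated for many CAD candidates and poses, with the best-scoring combination kept as the annotation; the cloning procedure would then propagate that CAD model to other objects with matching geometry.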
Related papers
- Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding [29.147693306652414]
We show that data generated by automatic retrieval of synthetic CAD models can be used as high-quality ground truth for training supervised deep learning models.
Our results underscore the potential of automatic 3D annotations to enhance model performance while significantly reducing annotation costs.
To support future research in 3D scene understanding, we will release our annotations, which we call SCANnotate++, along with our trained models.
arXiv Detail & Related papers (2025-04-18T09:33:45Z)
- Sparse Multi-Object Render-and-Compare [33.97243145891282]
Reconstructing 3D shape and pose of static objects from a single image is an essential task for various industries.
Directly predicting 3D shapes produces unrealistic, overly smoothed or tessellated shapes.
Retrieving CAD models ensures realistic shapes but requires robust and accurate alignment.
arXiv Detail & Related papers (2023-10-17T12:01:32Z)
- Visual Localization using Imperfect 3D Models from the Internet [54.731309449883284]
This paper studies how imperfections in 3D models affect localization accuracy.
We show that 3D models from the Internet show promise as an easy-to-obtain scene representation.
arXiv Detail & Related papers (2023-04-12T16:15:05Z)
- Templates for 3D Object Pose Estimation Revisited: Generalization to New Objects and Robustness to Occlusions [79.34847067293649]
We present a method that can recognize new objects and estimate their 3D pose in RGB images even under partial occlusions.
It relies on a small set of training objects to learn local object representations.
We are the first to show generalization without retraining on the LINEMOD and Occlusion-LINEMOD datasets.
arXiv Detail & Related papers (2022-03-31T17:50:35Z)
- Weakly-Supervised End-to-End CAD Retrieval to Scan Objects [25.41908065938424]
We propose a new weakly-supervised approach to retrieve semantically and structurally similar CAD models to a query 3D scanned scene.
Our approach leverages a fully-differentiable top-$k$ retrieval layer, enabling end-to-end training guided by geometric and perceptual similarity of the top retrieved CAD models to the scan queries.
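A fully-differentiable top-k retrieval layer, as described above, can be approximated in many ways. One simple relaxation, sketched below purely for illustration (this is an assumption, not the paper's method), runs k successive low-temperature softmax rounds and softly suppresses each round's winner, yielding a near k-hot selection mask that remains differentiable in the similarity scores.

```python
import math

def soft_topk(scores, k, temperature=0.01, mask_penalty=100.0):
    """Soft relaxation of top-k selection: k softmax rounds, each softly
    masking the entries already selected. Differentiable in `scores`."""
    n = len(scores)
    selected = [0.0] * n      # accumulated soft selection mass per entry
    s = list(scores)
    for _ in range(k):
        m = max(s)            # subtract max for numerical stability
        exps = [math.exp((x - m) / temperature) for x in s]
        z = sum(exps)
        weights = [e / z for e in exps]
        for i in range(n):
            selected[i] += weights[i]
            s[i] -= mask_penalty * weights[i]  # softly mask this winner
    return selected

# Toy example: similarities of four CAD candidates to a scan query.
weights = soft_topk([0.9, 0.1, 0.8, 0.2], k=2)
```

With a low temperature the mask concentrates on the two highest-scoring candidates; raising the temperature spreads the mass and trades selectivity for smoother gradients.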
arXiv Detail & Related papers (2022-03-24T06:30:47Z)
- Patch2CAD: Patchwise Embedding Learning for In-the-Wild Shape Retrieval from a Single Image [58.953160501596805]
We propose a novel approach towards constructing a joint embedding space between 2D images and 3D CAD models in a patch-wise fashion.
Our approach is more robust than the state of the art in real-world scenarios without any exact CAD matches.
arXiv Detail & Related papers (2021-08-20T20:58:52Z)
- 'CADSketchNet' -- An Annotated Sketch dataset for 3D CAD Model Retrieval with Deep Neural Networks [0.8155575318208631]
The research work presented in this paper aims at developing a dataset suitable for building a retrieval system for 3D CAD models based on deep learning.
The paper also aims at evaluating the performance of various retrieval systems and search engines for 3D CAD models that accept a sketch image as the input query.
arXiv Detail & Related papers (2021-07-13T16:10:16Z)
- 3D Object Detection and Pose Estimation of Unseen Objects in Color Images with Local Surface Embeddings [35.769234123059086]
We present an approach for detecting and estimating the 3D poses of objects in images that requires only an untextured CAD model.
Our approach combines Deep Learning and 3D geometry: It relies on an embedding of local 3D geometry to match the CAD models to the input images.
We show that we can use Mask-RCNN in a class-agnostic way to detect the new objects without retraining and thus drastically limit the number of possible correspondences.
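Matching an embedding of local 3D geometry against CAD models, as the abstract describes, amounts at its core to nearest-neighbour search in embedding space. The sketch below is a hypothetical illustration of that core step (the function names and toy 2-D embeddings are invented for the example, and real embeddings would be high-dimensional and learned):

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def match_embeddings(image_embs, cad_embs, threshold=0.8):
    """Nearest-neighbour matching of local embeddings: for each image
    embedding, keep its best CAD match if similarity clears `threshold`."""
    matches = []
    for i, e in enumerate(image_embs):
        best_j = max(range(len(cad_embs)), key=lambda j: cosine(e, cad_embs[j]))
        if cosine(e, cad_embs[best_j]) >= threshold:
            matches.append((i, best_j))
    return matches

# Toy 2-D embeddings: two image patches matched against two CAD patches.
matches = match_embeddings([[1.0, 0.0], [0.0, 1.0]],
                           [[0.9, 0.1], [0.1, 0.9]])
```

The resulting correspondences would then feed a pose estimator (e.g. a PnP/RANSAC stage); restricting detection to class-agnostic region proposals, as the abstract notes, keeps the number of candidate correspondences small.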
arXiv Detail & Related papers (2020-10-08T15:57:06Z)
- Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve [54.054575408582565]
We propose to leverage existing large-scale datasets of 3D models to understand the underlying 3D structure of objects seen in an image.
We present Mask2CAD, which jointly detects objects in real-world images and, for each detected object, optimizes for the most similar CAD model and its pose.
This produces a clean, lightweight representation of the objects in an image.
arXiv Detail & Related papers (2020-07-26T00:08:37Z)
- DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method achieves state-of-the-art results, outperforming prior methods by 5% on object detection in ScanNet scenes and by 3.4% on the Open dataset.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.