LASA: Instance Reconstruction from Real Scans using A Large-scale
Aligned Shape Annotation Dataset
- URL: http://arxiv.org/abs/2312.12418v1
- Date: Tue, 19 Dec 2023 18:50:10 GMT
- Title: LASA: Instance Reconstruction from Real Scans using A Large-scale
Aligned Shape Annotation Dataset
- Authors: Haolin Liu, Chongjie Ye, Yinyu Nie, Yingfan He, Xiaoguang Han
- Abstract summary: We present a novel Diffusion-based Cross-Modal Shape Reconstruction (DisCo) method and an Occupancy-Guided 3D Object Detection (OccGOD) method.
Our methods achieve state-of-the-art performance in both instance-level scene reconstruction and 3D object detection tasks.
- Score: 17.530432165466507
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instance shape reconstruction from a 3D scene involves recovering the full
geometries of multiple objects at the semantic instance level. Many methods
leverage data-driven learning due to the intricacies of scene complexity and
significant indoor occlusions. Training these methods often requires a
large-scale, high-quality dataset with aligned and paired shape annotations
with real-world scans. Existing datasets are either synthetic or misaligned,
restricting the performance of data-driven methods on real data. To this end,
we introduce LASA, a Large-scale Aligned Shape Annotation Dataset comprising
10,412 high-quality CAD annotations aligned with 920 real-world scene scans
from ArkitScenes, created manually by professional artists. On top of this
dataset, we propose a novel Diffusion-based Cross-Modal Shape Reconstruction
(DisCo) method. It is empowered by a hybrid feature aggregation design that
fuses multi-modal inputs to recover high-fidelity object geometries. In addition, we
present an Occupancy-Guided 3D Object Detection (OccGOD) method and demonstrate
that our shape annotations provide scene occupancy clues that can further
improve 3D object detection. Supported by LASA, extensive experiments show that
our methods achieve state-of-the-art performance in both instance-level scene
reconstruction and 3D object detection tasks.
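To make the hybrid feature aggregation idea concrete, here is a minimal sketch of fusing a partial point-cloud feature with an RGB-crop feature to condition per-query occupancy prediction. All module names, layer sizes, and the fusion scheme are illustrative assumptions; this is not the DisCo architecture from the paper, which additionally relies on a diffusion-based generative decoder.

```python
# Minimal sketch of cross-modal (point cloud + image) feature aggregation for
# object shape reconstruction. Module names, shapes, and the fusion scheme are
# illustrative assumptions, not the DisCo architecture from the paper.
import torch
import torch.nn as nn

class PointEncoder(nn.Module):
    """PointNet-style encoder: per-point MLP followed by max pooling."""
    def __init__(self, dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, dim))

    def forward(self, pts):                      # pts: (B, N, 3)
        return self.mlp(pts).max(dim=1).values   # (B, dim)

class ImageEncoder(nn.Module):
    """Tiny CNN encoder for an RGB crop of the object."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim))

    def forward(self, img):                      # img: (B, 3, H, W)
        return self.net(img)                     # (B, dim)

class HybridShapeDecoder(nn.Module):
    """Fuses both modalities, then predicts occupancy at 3D query points."""
    def __init__(self, dim=256):
        super().__init__()
        self.pts_enc, self.img_enc = PointEncoder(dim), ImageEncoder(dim)
        self.fuse = nn.Linear(2 * dim, dim)
        self.occ_head = nn.Sequential(
            nn.Linear(dim + 3, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, pts, img, queries):        # queries: (B, Q, 3)
        cond = torch.relu(self.fuse(
            torch.cat([self.pts_enc(pts), self.img_enc(img)], dim=-1)))
        cond = cond.unsqueeze(1).expand(-1, queries.shape[1], -1)
        return self.occ_head(torch.cat([cond, queries], dim=-1)).squeeze(-1)

# Usage: a noisy partial scan plus an RGB crop conditions occupancy prediction.
model = HybridShapeDecoder()
occ_logits = model(torch.randn(2, 1024, 3),    # partial point cloud
                   torch.randn(2, 3, 64, 64),  # RGB crop
                   torch.randn(2, 2048, 3))    # query points
print(occ_logits.shape)                        # torch.Size([2, 2048])
```

The point of fusing the two branches is that the image supplies appearance cues for regions the partial scan misses, which mirrors the abstract's motivation of significant indoor occlusions.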
Related papers
- Category-level Object Detection, Pose Estimation and Reconstruction from Stereo Images [15.921719523588996]
Existing monocular and RGB-D methods suffer from scale ambiguity due to missing or noisy depth measurements.
We present CODERS, a one-stage approach for Category-level Object Detection, Pose Estimation, and Reconstruction from Stereo images.
Our dataset, code, and demos will be available on our project page.
arXiv Detail & Related papers (2024-07-09T15:59:03Z) - Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z) - Shape Anchor Guided Holistic Indoor Scene Understanding [9.463220988312218]
We propose a shape anchor guided learning strategy (AncLearn) for robust holistic indoor scene understanding.
AncLearn generates anchors that dynamically fit instance surfaces, separating noise from target-related features to provide reliable proposals at the detection stage.
We embed AncLearn into a reconstruction-from-detection learning system (AncRec) to generate high-quality semantic scene models.
arXiv Detail & Related papers (2023-09-20T08:30:20Z) - Weakly Supervised 3D Object Detection with Multi-Stage Generalization [62.96670547848691]
We introduce BA$2$-Det, encompassing pseudo label generation and multi-stage generalization.
We develop three stages of generalization: progressing from complete to partial, static to dynamic, and close to distant.
BA$2$-Det can achieve a 20% relative improvement on the KITTI dataset.
arXiv Detail & Related papers (2023-06-08T17:58:57Z) - RandomRooms: Unsupervised Pre-training from Synthetic Shapes and
Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of synthetic datasets, which consist of CAD object models, to boost learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z) - Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z) - From Points to Multi-Object 3D Reconstruction [71.17445805257196]
We propose a method to detect and reconstruct multiple 3D objects from a single RGB image.
A keypoint detector localizes objects as center points and directly predicts all object properties, including 9-DoF bounding boxes and 3D shapes (a minimal sketch of such a head appears after this list).
The presented approach performs lightweight reconstruction in a single-stage, it is real-time capable, fully differentiable and end-to-end trainable.
arXiv Detail & Related papers (2020-12-21T18:52:21Z) - DEF: Deep Estimation of Sharp Geometric Features in 3D Shapes [43.853000396885626]
We propose a learning-based framework for predicting sharp geometric features in sampled 3D shapes.
By fusing the results of individual patches, we can process large 3D models that existing data-driven methods cannot handle.
arXiv Detail & Related papers (2020-11-30T18:21:00Z) - DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes [54.239416488865565]
We propose a fast single-stage 3D object detection method for LIDAR data.
The core novelty of our method is a fast, single-pass architecture that both detects objects in 3D and estimates their shapes.
We find that our proposed method achieves state-of-the-art results on object detection in ScanNet scenes by a 5% margin, and top results on the Waymo Open Dataset by a 3.4% margin.
arXiv Detail & Related papers (2020-04-02T17:48:50Z)
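The center-point formulation in "From Points to Multi-Object 3D Reconstruction" above can be illustrated with a small CenterNet-style head that regresses, at every feature-map location, a per-class center heatmap, 9-DoF box parameters, and a shape latent. Channel counts, the backbone, and the exact 9-DoF parameterization are assumptions for illustration, not the paper's implementation.

```python
# Sketch of a center-point head for single-image multi-object 3D reconstruction:
# objects are localized as 2D center points, and all properties (a 9-DoF box and
# a shape latent) are regressed densely and read out at heatmap peaks.
# Layer sizes and the parameterization are assumptions, not the paper's design.
import torch
import torch.nn as nn

class CenterPointHead(nn.Module):
    def __init__(self, in_ch=64, num_classes=8, shape_dim=64):
        super().__init__()
        def head(out_ch):
            return nn.Sequential(nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(64, out_ch, 1))
        self.heatmap = head(num_classes)  # per-class object-center heatmap
        self.box9dof = head(9)            # 3 translation + 3 size + 3 rotation params
        self.shape = head(shape_dim)      # latent code decoded into a 3D shape elsewhere

    def forward(self, feat):              # feat: (B, in_ch, H, W) from any 2D backbone
        return {"heatmap": torch.sigmoid(self.heatmap(feat)),
                "box": self.box9dof(feat),
                "shape": self.shape(feat)}

# Usage: peaks in the heatmap give object centers; the box and shape channels at
# those locations carry each detected object's 9-DoF box and shape code.
out = CenterPointHead()(torch.randn(1, 64, 96, 96))
print({k: v.shape for k, v in out.items()})
```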