RandomRooms: Unsupervised Pre-training from Synthetic Shapes and
Randomized Layouts for 3D Object Detection
- URL: http://arxiv.org/abs/2108.07794v1
- Date: Tue, 17 Aug 2021 17:56:12 GMT
- Title: RandomRooms: Unsupervised Pre-training from Synthetic Shapes and
Randomized Layouts for 3D Object Detection
- Authors: Yongming Rao, Benlin Liu, Yi Wei, Jiwen Lu, Cho-Jui Hsieh, Jie Zhou
- Abstract summary: A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
- Score: 138.2892824662943
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D point cloud understanding has made great progress in recent years.
However, one major bottleneck is the scarcity of annotated real datasets,
especially compared to 2D object detection tasks, since a large amount of labor
is involved in annotating the real scans of a scene. A promising solution to
this problem is to make better use of the synthetic dataset, which consists of
CAD object models, to boost the learning on real datasets. This can be achieved
by the pre-training and fine-tuning procedure. However, recent work on 3D
pre-training exhibits failure when transferring features learned on synthetic
objects to other real-world applications. In this work, we put forward a new
method called RandomRooms to accomplish this objective. In particular, we
propose to generate random layouts of a scene by making use of the objects in
the synthetic CAD dataset and learn the 3D scene representation by applying
object-level contrastive learning on two random scenes generated from the same
set of synthetic objects. The model pre-trained in this way can serve as a
better initialization when later fine-tuning on the 3D object detection task.
Empirically, we show consistent improvement in downstream 3D detection tasks on
several base models, especially when less training data is used, which
strongly demonstrates the effectiveness and generalization of our method.
Benefiting from the rich semantic knowledge and diverse objects from synthetic
data, our method establishes the new state-of-the-art on widely-used 3D
detection benchmarks ScanNetV2 and SUN RGB-D. We expect our attempt to provide
a new perspective for bridging object- and scene-level 3D understanding.
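To make the idea concrete, here is a minimal, hedged sketch of the pre-training recipe described in the abstract: compose two random room layouts from the same set of synthetic object point clouds, pool a feature per object instance, and pull matching objects together with an InfoNCE-style object-level contrastive loss. The toy placement scheme, the stand-in MLP backbone, and all hyper-parameters below are illustrative assumptions, not the authors' implementation, which would use a real point-cloud backbone and CAD models from a synthetic dataset.

```python
# Illustrative sketch only: object-level contrastive pre-training on two random
# layouts composed from the same synthetic objects. The placement scheme, the
# stand-in backbone, and the hyper-parameters are assumptions, not the paper's code.
import numpy as np
import torch
import torch.nn.functional as F


def random_layout(objects, room_size=8.0, rng=None):
    """Drop each object point cloud (N_i x 3) into the room at a random pose."""
    rng = rng if rng is not None else np.random.default_rng()
    points, labels = [], []
    for obj_id, pts in enumerate(objects):
        theta = rng.uniform(0.0, 2.0 * np.pi)                      # random yaw
        rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                        [np.sin(theta),  np.cos(theta), 0.0],
                        [0.0,            0.0,           1.0]])
        shift = np.array([*rng.uniform(0.0, room_size, 2), 0.0])   # random x/y offset
        points.append(pts @ rot.T + shift)
        labels.append(np.full(len(pts), obj_id))                   # remember object identity
    return np.concatenate(points), np.concatenate(labels)


def object_embeddings(backbone, scene, labels):
    """Run the backbone on the scene and average-pool features per object instance."""
    feats = backbone(torch.from_numpy(scene).float())              # (N, d) per-point features
    embs = [feats[torch.from_numpy(labels == i)].mean(0) for i in np.unique(labels)]
    return F.normalize(torch.stack(embs), dim=1)


def object_nce_loss(emb_a, emb_b, temperature=0.1):
    """InfoNCE over objects: the same object in the two layouts is the positive pair."""
    logits = emb_a @ emb_b.t() / temperature
    targets = torch.arange(len(emb_a))
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins: random blobs for CAD objects and a per-point MLP for the backbone.
    objects = [rng.standard_normal((256, 3)).astype(np.float32) for _ in range(8)]
    backbone = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.ReLU(),
                                   torch.nn.Linear(64, 64))
    scene_a, lab_a = random_layout(objects, rng=rng)
    scene_b, lab_b = random_layout(objects, rng=rng)
    loss = object_nce_loss(object_embeddings(backbone, scene_a, lab_a),
                           object_embeddings(backbone, scene_b, lab_b))
    loss.backward()  # gradients flow into the backbone, as in regular pre-training
    print(float(loss))
```

In actual pre-training, the backbone weights learned this way would then initialize a 3D detector before fine-tuning on benchmarks such as ScanNetV2 or SUN RGB-D.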
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments consistently demonstrates our method's superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- SUGAR: Pre-training 3D Visual Representations for Robotics [85.55534363501131]
We introduce a novel 3D pre-training framework for robotics named SUGAR.
SUGAR captures semantic, geometric and affordance properties of objects through 3D point clouds.
We show that SUGAR's 3D representation outperforms state-of-the-art 2D and 3D representations.
arXiv Detail & Related papers (2024-04-01T21:23:03Z)
- Randomized 3D Scene Generation for Generalizable Self-Supervised Pre-Training [0.0]
We propose a new method to generate 3D scenes with spherical harmonics.
It surpasses the previous formula-driven method by a clear margin and achieves on-par results with methods using real-world scans and CAD models (a generic sketch of formula-driven shape synthesis follows this entry).
arXiv Detail & Related papers (2023-06-07T08:28:38Z)
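The entry above only names the ingredient; as a rough, generic illustration of formula-driven shape synthesis with spherical harmonics (not the cited paper's actual procedure), the sketch below samples a point cloud from a surface whose radius is a random combination of real spherical harmonics. The degree range, coefficient scale, and sampling density are assumptions.

```python
# Rough, generic illustration of formula-driven shape synthesis with spherical
# harmonics; the degree range, coefficients, and sampling are assumptions, not
# the cited paper's procedure.
import numpy as np
from scipy.special import sph_harm


def sph_harm_shape(max_degree=4, n_points=2048, rng=None):
    """Sample a point cloud from a star-shaped surface whose radius is a random
    combination of real spherical harmonics."""
    rng = rng if rng is not None else np.random.default_rng()
    theta = rng.uniform(0.0, 2.0 * np.pi, n_points)      # azimuth
    phi = np.arccos(rng.uniform(-1.0, 1.0, n_points))    # polar angle, uniform on the sphere
    radius = np.ones(n_points)                           # start from a unit sphere
    for degree in range(1, max_degree + 1):
        for order in range(-degree, degree + 1):
            coeff = rng.normal(0.0, 1.0 / (degree + 1))  # damp higher frequencies
            radius += coeff * np.real(sph_harm(order, degree, theta, phi))
    radius = np.clip(radius, 0.2, None)                  # keep the surface non-degenerate
    return np.stack([radius * np.sin(phi) * np.cos(theta),
                     radius * np.sin(phi) * np.sin(theta),
                     radius * np.cos(phi)], axis=1)


if __name__ == "__main__":
    cloud = sph_harm_shape()
    print(cloud.shape)  # (2048, 3)
```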
- Prompt-guided Scene Generation for 3D Zero-Shot Learning [8.658191774247944]
We propose a prompt-guided 3D scene generation and supervision method that augments 3D data so the network learns better.
First, we merge point clouds of two 3D models in ways described by a prompt. The prompt acts as the annotation describing each 3D scene.
We have achieved state-of-the-art ZSL and generalized ZSL performance on synthetic (ModelNet40, ModelNet10) and real-scanned (ScanObjectNN) 3D object datasets.
arXiv Detail & Related papers (2022-09-29T11:24:33Z)
- Few-shot Class-incremental Learning for 3D Point Cloud Objects [11.267975876074706]
Few-shot class-incremental learning (FSCIL) aims to incrementally fine-tune a model trained on base classes for a novel set of classes.
Recent efforts on FSCIL address this problem primarily on 2D image data.
Due to the advancement of camera technology, 3D point cloud data has become more available than ever.
arXiv Detail & Related papers (2022-05-30T16:33:53Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
- Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
However, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)
- PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding [107.02479689909164]
In this work, we aim to facilitate research on 3D representation learning.
We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
arXiv Detail & Related papers (2020-07-21T17:59:22Z)