PointContrast: Unsupervised Pre-training for 3D Point Cloud
Understanding
- URL: http://arxiv.org/abs/2007.10985v3
- Date: Sat, 21 Nov 2020 00:42:46 GMT
- Title: PointContrast: Unsupervised Pre-training for 3D Point Cloud
Understanding
- Authors: Saining Xie, Jiatao Gu, Demi Guo, Charles R. Qi, Leonidas J. Guibas,
Or Litany
- Abstract summary: In this work, we aim at facilitating research on 3D representation learning.
We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
- Score: 107.02479689909164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Arguably one of the top success stories of deep learning is transfer
learning. The finding that pre-training a network on a rich source set (e.g.,
ImageNet) can help boost performance once fine-tuned on a usually much smaller
target set, has been instrumental to many applications in language and vision.
Yet, very little is known about its usefulness in 3D point cloud understanding.
We see this as an opportunity considering the effort required for annotating
data in 3D. In this work, we aim at facilitating research on 3D representation
learning. Different from previous works, we focus on high-level scene
understanding tasks. To this end, we select a suite of diverse datasets and
tasks to measure the effect of unsupervised pre-training on a large source set
of 3D scenes. Our findings are extremely encouraging: using a unified triplet
of architecture, source dataset, and contrastive loss for pre-training, we
achieve improvement over recent best results in segmentation and detection
across 6 different benchmarks for indoor and outdoor, real and synthetic
datasets -- demonstrating that the learned representation can generalize across
domains. Furthermore, the improvement was similar to supervised pre-training,
suggesting that future efforts should favor scaling data collection over more
detailed annotation. We hope these findings will encourage more research on
unsupervised pretext task design for 3D deep learning.
Related papers
- Bayesian Self-Training for Semi-Supervised 3D Segmentation [59.544558398992386]
3D segmentation is a core problem in computer vision.
Densely labeling 3D point clouds to employ fully-supervised training remains too labor-intensive and expensive.
Semi-supervised training provides a more practical alternative, where only a small set of labeled data is given, accompanied by a larger unlabeled set.
arXiv Detail & Related papers (2024-09-12T14:54:31Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly demonstrates our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation [67.07112533415116]
We present a novel framework that adapts various foundational models for the 3D point cloud segmentation task.
Our approach involves making initial predictions of 2D semantic masks using different large vision models.
To generate robust 3D semantic pseudo labels, we introduce a semantic label fusion strategy that effectively combines all the results via voting.
arXiv Detail & Related papers (2023-11-03T15:41:15Z)
- SPOT: Scalable 3D Pre-training via Occupancy Prediction for Learning Transferable 3D Representations [76.45009891152178]
The pretraining-finetuning approach can alleviate the labeling burden by fine-tuning a pre-trained backbone across various downstream datasets as well as tasks.
We show, for the first time, that general representation learning can be achieved through the task of occupancy prediction.
Our findings will facilitate the understanding of LiDAR points and pave the way for future advancements in LiDAR pre-training.
arXiv Detail & Related papers (2023-09-19T11:13:01Z)
- Self-Supervised Learning with Multi-View Rendering for 3D Point Cloud Analysis [33.31864436614945]
We propose a novel pre-training method for 3D point cloud models.
Our pre-training is self-supervised by a local pixel/point level correspondence loss and a global image/point cloud level loss.
These improved models outperform existing state-of-the-art methods on various datasets and downstream tasks.
arXiv Detail & Related papers (2022-10-28T05:23:03Z)
- A Closer Look at Invariances in Self-supervised Pre-training for 3D Vision [0.0]
Self-supervised pre-training for 3D vision has drawn increasing research interest in recent years.
We present a unified framework under which various pre-training methods can be investigated.
We propose a simple but effective method that jointly pre-trains a 3D encoder and a depth map encoder using contrastive learning.
arXiv Detail & Related papers (2022-07-11T16:44:15Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Self-Supervised Pretraining of 3D Features on any Point-Cloud [40.26575888582241]
We present a simple self-supervised pretraining method that can work with any 3D data without 3D registration.
We evaluate our models on 9 benchmarks for object detection, semantic segmentation, and object classification, where they achieve state-of-the-art results and can outperform supervised pretraining.
arXiv Detail & Related papers (2021-01-07T18:55:21Z)
- Deep Learning for 3D Point Cloud Understanding: A Survey [16.35767262996978]
The development of practical applications, such as autonomous driving and robotics, has brought increasing attention to 3D point cloud understanding.
Deep learning has achieved remarkable success on image-based tasks, but there are many unique challenges faced by deep neural networks in processing massive, unstructured and noisy 3D points.
This paper summarizes recent remarkable research contributions in this area from several different directions.
arXiv Detail & Related papers (2020-09-18T16:34:12Z)