Contrastive Learning for Self-Supervised Pre-Training of Point Cloud
Segmentation Networks With Image Data
- URL: http://arxiv.org/abs/2301.07283v3
- Date: Mon, 4 Sep 2023 18:38:51 GMT
- Title: Contrastive Learning for Self-Supervised Pre-Training of Point Cloud
Segmentation Networks With Image Data
- Authors: Andrej Janda, Brandon Wagstaff, Edwin G. Ng, and Jonathan Kelly
- Abstract summary: Self-supervised pre-training on unlabelled data is one way to reduce the amount of manual annotations needed.
We combine image and point cloud modalities by first learning self-supervised image features and then using these features to train a 3D model.
Our pre-training method only requires a single scan of a scene and can be applied to cases where localization information is unavailable.
- Score: 7.145862669763328
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reducing the quantity of annotations required for supervised training is
vital when labels are scarce and costly. This reduction is particularly
important for semantic segmentation tasks involving 3D datasets, which are
often significantly smaller and more challenging to annotate than their
image-based counterparts. Self-supervised pre-training on unlabelled data is
one way to reduce the amount of manual annotations needed. Previous work has
focused on pre-training with point clouds exclusively. While useful, this
approach often requires two or more registered views. In the present work, we
combine image and point cloud modalities by first learning self-supervised
image features and then using these features to train a 3D model. By
incorporating image data, which is often included in many 3D datasets, our
pre-training method only requires a single scan of a scene and can be applied
to cases where localization information is unavailable. We demonstrate that our
pre-training approach, despite using single scans, achieves comparable
performance to other multi-scan, point cloud-only methods.
Related papers
- Shelf-Supervised Cross-Modal Pre-Training for 3D Object Detection [52.66283064389691]
State-of-the-art 3D object detectors are often trained on massive labeled datasets.
Recent works demonstrate that self-supervised pre-training with unlabeled data can improve detection accuracy with limited labels.
We propose a shelf-supervised approach for generating zero-shot 3D bounding boxes from paired RGB and LiDAR data.
arXiv Detail & Related papers (2024-06-14T15:21:57Z) - PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds [18.840000859663153]
We propose PRED, a novel image-assisted pre-training framework for outdoor point clouds.
The main ingredient of our framework is semantic rendering conditioned on a Bird's-Eye-View (BEV) feature map.
We further enhance our model's performance by incorporating point-wise masking with a high mask ratio.
arXiv Detail & Related papers (2023-11-08T07:26:09Z) - You Only Need One Thing One Click: Self-Training for Weakly Supervised
3D Scene Understanding [107.06117227661204]
We propose "One Thing One Click," meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our model is also compatible with 3D instance segmentation when equipped with a point-clustering strategy.
arXiv Detail & Related papers (2023-03-26T13:57:00Z) - Self-Supervised Pre-training of 3D Point Cloud Networks with Image Data [6.121574833847397]
Self-supervised pre-training on large unlabelled datasets is one way to reduce the amount of manual annotations needed.
In the present work, we combine image and point cloud modalities by first learning self-supervised image features and then using these features to train a 3D model.
By incorporating image data, which is often included in many 3D datasets, our pre-training method only requires a single scan of a scene.
arXiv Detail & Related papers (2022-11-21T19:09:52Z) - Image Understands Point Cloud: Weakly Supervised 3D Semantic
Segmentation via Association Learning [59.64695628433855]
We propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images.
We design a dual-branch network equipped with an active labeling strategy to make the most of a tiny fraction of the labels.
Our method even outperforms the state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
arXiv Detail & Related papers (2022-09-16T07:59:04Z) - Active Self-Training for Weakly Supervised 3D Scene Semantic
Segmentation [17.27850877649498]
We introduce a method for weakly supervised segmentation of 3D scenes that combines self-training and active learning.
We demonstrate that our approach leads to an effective method that provides improvements in scene segmentation over previous works and baselines.
arXiv Detail & Related papers (2022-09-15T06:00:25Z) - Self-Supervised Pretraining for 2D Medical Image Segmentation [0.0]
Self-supervised learning offers a way to lower the need for manually annotated data by pretraining models for a specific domain on unlabelled data.
We find that self-supervised pretraining on natural images and target-domain-specific images leads to the fastest and most stable downstream convergence.
In low-data scenarios, supervised ImageNet pretraining achieves the best accuracy, requiring fewer than 100 annotated samples to approach minimal error.
arXiv Detail & Related papers (2022-09-01T09:25:22Z) - P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with
Point-to-Pixel Prompting [94.11915008006483]
We propose a novel Point-to-Pixel prompting for point cloud analysis.
Our method attains 89.3% accuracy on the hardest setting of ScanObjectNN.
Our framework also exhibits very competitive performance on ModelNet classification and ShapeNet part segmentation.
arXiv Detail & Related papers (2022-08-04T17:59:03Z) - MaskSplit: Self-supervised Meta-learning for Few-shot Semantic
Segmentation [10.809349710149533]
We propose a self-supervised training approach for learning few-shot segmentation models.
We first use unsupervised saliency estimation to obtain pseudo-masks on images.
We then train a simple prototype-based model over different splits of the pseudo-masks and augmentations of the images.
arXiv Detail & Related papers (2021-10-23T12:30:05Z) - One Thing One Click: A Self-Training Approach for Weakly Supervised 3D
Semantic Segmentation [78.36781565047656]
We propose "One Thing One Click," meaning that the annotator only needs to label one point per object.
We iteratively conduct the training and label propagation, facilitated by a graph propagation module.
Our results are also comparable to those of the fully supervised counterparts.
arXiv Detail & Related papers (2021-04-06T02:27:25Z) - PointContrast: Unsupervised Pre-training for 3D Point Cloud
Understanding [107.02479689909164]
In this work, we aim to facilitate research on 3D representation learning.
We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
arXiv Detail & Related papers (2020-07-21T17:59:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.