Self-Supervised Pre-training of 3D Point Cloud Networks with Image Data
- URL: http://arxiv.org/abs/2211.11801v1
- Date: Mon, 21 Nov 2022 19:09:52 GMT
- Title: Self-Supervised Pre-training of 3D Point Cloud Networks with Image Data
- Authors: Andrej Janda, Brandon Wagstaff, Edwin G. Ng, Jonathan Kelly
- Abstract summary: Self-supervised pre-training on large unlabelled datasets is one way to reduce the amount of manual annotations needed.
In the present work, we combine image and point cloud modalities by first learning self-supervised image features and then using these features to train a 3D model.
By incorporating image data, which is often included in many 3D datasets, our pre-training method only requires a single scan of a scene.
- Score: 6.121574833847397
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reducing the quantity of annotations required for supervised training is
vital when labels are scarce and costly. This reduction is especially important
for semantic segmentation tasks involving 3D datasets that are often
significantly smaller and more challenging to annotate than their image-based
counterparts. Self-supervised pre-training on large unlabelled datasets is one
way to reduce the amount of manual annotations needed. Previous work has
focused on pre-training with point cloud data exclusively; this approach often
requires two or more registered views. In the present work, we combine image
and point cloud modalities by first learning self-supervised image features
and then using these features to train a 3D model. By incorporating image data,
which is often included in many 3D datasets, our pre-training method only
requires a single scan of a scene. We demonstrate that our pre-training
approach, despite using single scans, achieves comparable performance to other
multi-scan, point cloud-only methods.
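A minimal sketch of the distillation step the abstract describes, assuming a posed camera and a frozen self-supervised 2D encoder: project each point into the image, sample the 2D feature at that pixel, and train the 3D network to reproduce it. All names (image_encoder, point_encoder, K, T) are hypothetical, and the cosine objective is one plausible choice rather than the paper's confirmed loss.

```python
# Hypothetical sketch of image-to-point feature distillation; the encoders,
# calibration inputs, and cosine loss are assumptions, not the authors' code.
import torch
import torch.nn.functional as F

def project_points(points, K, T):
    """Project N x 3 world points to pixels using 3x3 intrinsics K and a
    3x4 world-to-camera extrinsic T; returns N x 2 pixels and a depth mask."""
    cam = (T[:, :3] @ points.T + T[:, 3:]).T     # world -> camera frame
    valid = cam[:, 2] > 0.1                      # keep points in front of camera
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3], valid         # perspective divide

def distillation_loss(points, image, K, T, image_encoder, point_encoder):
    H, W = image.shape[-2:]
    with torch.no_grad():                        # 2D teacher stays frozen
        feat_2d = image_encoder(image)           # 1 x C x Hf x Wf
    uv, valid = project_points(points, K, T)
    valid &= (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    grid = torch.stack([2 * uv[:, 0] / (W - 1) - 1,   # pixels -> [-1, 1]
                        2 * uv[:, 1] / (H - 1) - 1], dim=1)
    target = F.grid_sample(feat_2d, grid.view(1, -1, 1, 2), align_corners=True)
    target = target.squeeze(0).squeeze(-1).T     # N x C sampled 2D features
    pred = point_encoder(points)                 # N x C 3D student features
    return (1 - F.cosine_similarity(pred[valid], target[valid], dim=1)).mean()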
Related papers
- Shelf-Supervised Cross-Modal Pre-Training for 3D Object Detection [52.66283064389691]
State-of-the-art 3D object detectors are often trained on massive labeled datasets.
Recent works demonstrate that self-supervised pre-training with unlabeled data can improve detection accuracy with limited labels.
We propose a shelf-supervised approach for generating zero-shot 3D bounding boxes from paired RGB and LiDAR data.
arXiv Detail & Related papers (2024-06-14T15:21:57Z)
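As a rough picture of how paired RGB and LiDAR can yield 3D boxes without 3D labels, the sketch below lifts one 2D detection into a frustum of LiDAR points and fits an axis-aligned box. It is a generic stand-in under assumed camera-frame points and intrinsics, not the paper's shelf-supervised pipeline.

```python
# Illustrative frustum lifting: keep LiDAR points whose projection lands in a
# 2D box, gate by depth, and fit an axis-aligned 3D box. Thresholds are guesses.
import numpy as np

def lift_box_2d_to_3d(points, K, box_2d, min_pts=10):
    """points: N x 3 in the camera frame; K: 3x3 intrinsics;
    box_2d: (u0, v0, u1, v1). Returns center (3) + size (3), or None."""
    front = points[points[:, 2] > 0.1]           # points in front of the camera
    uvw = (K @ front.T).T
    uv = uvw[:, :2] / uvw[:, 2:3]
    u0, v0, u1, v1 = box_2d
    inside = ((uv[:, 0] >= u0) & (uv[:, 0] <= u1) &
              (uv[:, 1] >= v0) & (uv[:, 1] <= v1))
    frustum = front[inside]
    if len(frustum) < min_pts:
        return None                              # too little support to fit a box
    z = frustum[:, 2]
    obj = frustum[np.abs(z - np.median(z)) < 2.0]  # crude 2 m background gate
    lo, hi = obj.min(axis=0), obj.max(axis=0)
    return np.concatenate([(lo + hi) / 2, hi - lo])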
- PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds [18.840000859663153]
We propose PRED, a novel image-assisted pre-training framework for outdoor point clouds.
The main ingredient of our framework is semantic rendering conditioned on a Bird's-Eye-View (BEV) feature map.
We further enhance our model's performance by incorporating point-wise masking with a high mask ratio.
arXiv Detail & Related papers (2023-11-08T07:26:09Z)
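The point-wise masking ingredient is straightforward to picture; below is a minimal sketch in which the 0.9 ratio and all names are assumptions rather than PRED's actual configuration.

```python
# Minimal sketch of random point-wise masking with a high mask ratio.
import torch

def mask_points(points, mask_ratio=0.9):
    """points: N x 3. Returns the small visible subset and a boolean mask
    marking which of the N inputs were kept."""
    n = points.shape[0]
    n_keep = max(1, int(n * (1 - mask_ratio)))   # e.g. keep 10% at ratio 0.9
    keep = torch.randperm(n)[:n_keep]            # uniform random subset
    mask = torch.zeros(n, dtype=torch.bool)
    mask[keep] = True
    return points[mask], mask
```

A pre-training objective would then ask the network to recover information about the masked majority from the small visible subset.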
- Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation [67.07112533415116]
We present a novel framework that adapts various foundation models for the 3D point cloud segmentation task.
Our approach involves making initial predictions of 2D semantic masks using different large vision models.
To generate robust 3D semantic pseudo labels, we introduce a semantic label fusion strategy that effectively combines all the results via voting.
arXiv Detail & Related papers (2023-11-03T15:41:15Z)
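The fusion step can be pictured as a per-point majority vote over the projected predictions of several 2D models; the sketch below uses assumed shapes and an assumed ignore convention, and is not the paper's implementation.

```python
# Hypothetical per-point majority vote over pseudo labels from several 2D models.
import numpy as np

def fuse_labels_by_voting(label_sets, num_classes, ignore=-1):
    """label_sets: list of length-N label arrays, one per 2D model, with
    ignore marking points invisible to that model. Returns N fused labels."""
    votes = np.zeros((len(label_sets[0]), num_classes), dtype=np.int64)
    for labels in label_sets:
        seen = labels != ignore
        # one vote per model for each point it labelled
        np.add.at(votes, (np.nonzero(seen)[0], labels[seen]), 1)
    fused = votes.argmax(axis=1)
    fused[votes.sum(axis=1) == 0] = ignore       # no model saw this point
    return fused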
- You Only Need One Thing One Click: Self-Training for Weakly Supervised 3D Scene Understanding [107.06117227661204]
We propose "One Thing One Click", meaning that the annotator only needs to label one point per object.
We iteratively alternate between training and label propagation, facilitated by a graph propagation module.
Our model is compatible with 3D instance segmentation when equipped with a point-clustering strategy.
arXiv Detail & Related papers (2023-03-26T13:57:00Z)
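One simple way to picture the propagation step is iterative label spreading over a k-NN graph, as below; the uniform neighbour averaging, k, and iteration count are assumptions standing in for the paper's graph propagation module.

```python
# Illustrative label propagation over a k-NN graph from one click per object.
import numpy as np
from scipy.spatial import cKDTree

def propagate_labels(points, seed_labels, num_classes, k=8, iters=10):
    """seed_labels: length-N int array; one labelled point per object, -1 elsewhere."""
    _, nbrs = cKDTree(points).query(points, k=k + 1)
    nbrs = nbrs[:, 1:]                           # drop each point's self-match
    probs = np.zeros((len(points), num_classes))
    seeded = seed_labels >= 0
    probs[seeded, seed_labels[seeded]] = 1.0     # clamp the clicked points
    for _ in range(iters):
        probs = probs[nbrs].mean(axis=1)         # average neighbour beliefs
        probs[seeded] = 0.0                      # re-clamp seeds every pass
        probs[seeded, seed_labels[seeded]] = 1.0
    labels = probs.argmax(axis=1)
    labels[probs.sum(axis=1) == 0] = -1          # never reached by any seed
    return labels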
- Contrastive Learning for Self-Supervised Pre-Training of Point Cloud Segmentation Networks With Image Data [7.145862669763328]
Self-supervised pre-training on unlabelled data is one way to reduce the amount of manual annotations needed.
We combine image and point cloud modalities by first learning self-supervised image features and then using these features to train a 3D model.
Our pre-training method only requires a single scan of a scene and can be applied to cases where localization information is unavailable.
arXiv Detail & Related papers (2023-01-18T03:14:14Z)
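A contrastive objective between point features and the image features they project onto might look like the InfoNCE-style sketch below; the temperature and symmetric form are assumptions, not necessarily the paper's exact loss.

```python
# Sketch of an InfoNCE-style loss over N point-pixel feature pairs, where
# pair i comes from projecting point i into the image; values are assumptions.
import torch
import torch.nn.functional as F

def info_nce_2d_3d(point_feats, pixel_feats, temperature=0.07):
    """point_feats, pixel_feats: N x C tensors of paired features."""
    p = F.normalize(point_feats, dim=1)
    q = F.normalize(pixel_feats, dim=1)
    logits = p @ q.T / temperature               # N x N cosine similarities
    targets = torch.arange(len(p), device=p.device)  # positives on the diagonal
    # symmetric: each point should retrieve its pixel, and vice versa
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))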
- Image Understands Point Cloud: Weakly Supervised 3D Semantic Segmentation via Association Learning [59.64695628433855]
We propose a novel cross-modality weakly supervised method for 3D segmentation, incorporating complementary information from unlabeled images.
We design a dual-branch network equipped with an active labeling strategy to extract the most value from a tiny fraction of labels.
Our method even outperforms the state-of-the-art fully supervised competitors with less than 1% actively selected annotations.
arXiv Detail & Related papers (2022-09-16T07:59:04Z)
- P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting [94.11915008006483]
We propose a novel Point-to-Pixel prompting for point cloud analysis.
Our method attains 89.3% accuracy on the hardest setting of ScanObjectNN.
Our framework also exhibits very competitive performance on ModelNet classification and ShapeNet part segmentation.
arXiv Detail & Related papers (2022-08-04T17:59:03Z)
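Point-to-Pixel prompting hinges on rendering a point cloud into an image-like tensor that a frozen 2D model can consume. The sketch below is one crude version using an orthographic projection with depth as intensity; the paper's geometry-preserving projection and colouring are learned, so treat this purely as an illustration.

```python
# Crude orthographic point-to-pixel rendering: assumes points normalised to
# [-1, 1]; keeps the frontmost depth per pixel. Not the paper's learned module.
import torch

def points_to_image(points, resolution=224):
    """points: N x 3. Returns a 3 x R x R tensor with depth as intensity."""
    u = ((points[:, 0] + 1) / 2 * (resolution - 1)).long().clamp(0, resolution - 1)
    v = ((points[:, 1] + 1) / 2 * (resolution - 1)).long().clamp(0, resolution - 1)
    depth = (points[:, 2] + 1) / 2               # map z into [0, 1]
    img = torch.zeros(resolution * resolution)
    # scatter-max keeps the frontmost point that lands on each pixel
    img.scatter_reduce_(0, v * resolution + u, depth, reduce="amax")
    return img.view(resolution, resolution).expand(3, -1, -1)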
- Self-Supervised Pretraining of 3D Features on any Point-Cloud [40.26575888582241]
We present a simple self-supervised pretraining method that can work with any 3D data without 3D registration.
We evaluate our models on 9 benchmarks for object detection, semantic segmentation, and object classification, where they achieve state-of-the-art results and can outperform supervised pretraining.
arXiv Detail & Related papers (2021-01-07T18:55:21Z)
- PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding [107.02479689909164]
In this work, we aim at facilitating research on 3D representation learning.
We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
arXiv Detail & Related papers (2020-07-21T17:59:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site. The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.