Point Cloud Pre-training by Mixing and Disentangling
- URL: http://arxiv.org/abs/2109.00452v1
- Date: Wed, 1 Sep 2021 15:52:18 GMT
- Title: Point Cloud Pre-training by Mixing and Disentangling
- Authors: Chao Sun, Zhedong Zheng and Yi Yang
- Abstract summary: Mixing and Disentangling (MD) is a self-supervised learning approach for point cloud pre-training.
We show that the encoder pre-trained with our MD approach significantly surpasses the encoder trained from scratch and converges quickly.
We hope this self-supervised learning attempt on point clouds can pave the way for reducing the dependence of deeply-learned models on large-scale labeled data.
- Score: 35.18101910728478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The annotation for large-scale point clouds is still time-consuming and
unavailable for many real-world tasks. Point cloud pre-training is one
potential solution for obtaining a scalable model for fast adaptation.
Therefore, in this paper, we investigate a new self-supervised learning
approach, called Mixing and Disentangling (MD), for point cloud pre-training.
As the name implies, we explore how to separate the original point cloud from
the mixed point cloud, and leverage this challenging task as a pretext
optimization objective for model training. Since the original dataset contains
far less training data than the prevailing ImageNet, the mixing process can
efficiently generate more high-quality samples. To verify our intuition, we
build a baseline network that simply contains two modules: an encoder and a
decoder. Given a mixed point cloud, the encoder is first
pre-trained to extract the semantic embedding. Then an instance-adaptive
decoder is harnessed to disentangle the point clouds according to the
embedding. Albeit simple, the encoder inherently learns to capture point cloud
keypoints after training and can be quickly adapted to downstream tasks,
including classification and segmentation, via the pre-training and fine-tuning
paradigm. Extensive experiments on two datasets show that the encoder
pre-trained with our MD approach significantly surpasses the encoder trained
from scratch and
converges quickly. In ablation studies, we further study the effect of each
component and discuss the advantages of the proposed self-supervised learning
strategy. We hope this self-supervised learning attempt on point clouds can
pave the way for reducing the dependence of deeply-learned models on
large-scale labeled data and substantially cutting annotation costs in the future.
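As a concrete illustration of the mixing step, the following is a minimal sketch (not the authors' implementation) that mixes two point clouds by randomly keeping half of the points from each; the per-point source labels are exactly what a disentangling decoder would have to recover:

```python
import numpy as np

def mix_point_clouds(pc_a, pc_b, rng=None):
    """Mix two point clouds by randomly keeping half the points of each.

    pc_a, pc_b: (N, 3) arrays with the same number of points N.
    Returns the mixed (N, 3) cloud and per-point source labels
    (0 = from pc_a, 1 = from pc_b).
    """
    rng = np.random.default_rng() if rng is None else rng
    n = pc_a.shape[0]
    idx_a = rng.choice(n, n // 2, replace=False)       # half from cloud A
    idx_b = rng.choice(n, n - n // 2, replace=False)   # half from cloud B
    mixed = np.concatenate([pc_a[idx_a], pc_b[idx_b]], axis=0)
    labels = np.concatenate([np.zeros(n // 2, dtype=int),
                             np.ones(n - n // 2, dtype=int)])
    perm = rng.permutation(n)   # shuffle so ordering leaks no source info
    return mixed[perm], labels[perm]

# toy example: two clouds of 1024 points each
a = np.random.rand(1024, 3)
b = np.random.rand(1024, 3) + 2.0   # offset so the two halves are separable
mixed, labels = mix_point_clouds(a, b)
```

In the pretext task, the encoder would embed `mixed` and an instance-adaptive decoder would reconstruct the two original clouds; a Chamfer-style reconstruction loss is a natural training signal for such a setup.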
Related papers
- Point Cloud Pre-training with Diffusion Models [62.12279263217138]
We propose a novel pre-training method called Point cloud Diffusion pre-training (PointDif)
PointDif achieves substantial improvement across various real-world datasets for diverse downstream tasks such as classification, segmentation and detection.
arXiv Detail & Related papers (2023-11-25T08:10:05Z)
- PRED: Pre-training via Semantic Rendering on LiDAR Point Clouds [18.840000859663153]
We propose PRED, a novel image-assisted pre-training framework for outdoor point clouds.
The main ingredient of our framework is a Birds-Eye-View (BEV) feature map conditioned semantic rendering.
We further enhance our model's performance by incorporating point-wise masking with a high mask ratio.
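Point-wise masking with a high mask ratio can be sketched as follows; the 0.9 ratio below is illustrative, not necessarily the paper's exact setting:

```python
import numpy as np

def mask_points(points, mask_ratio=0.9, rng=None):
    """Randomly hide a high fraction of input points.

    points: (N, 3) array. Returns the visible subset and a boolean
    mask (True = hidden); a model trains on the visible points while
    the hidden ones serve as reconstruction targets.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(points)
    hidden = np.zeros(n, dtype=bool)
    hidden[rng.choice(n, int(n * mask_ratio), replace=False)] = True
    return points[~hidden], hidden

pts = np.random.rand(1000, 3)
visible, hidden = mask_points(pts)   # 100 visible, 900 hidden
```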
arXiv Detail & Related papers (2023-11-08T07:26:09Z)
- GeoMAE: Masked Geometric Target Prediction for Self-supervised Point Cloud Pre-Training [16.825524577372473]
We introduce a point cloud representation learning framework, based on geometric feature reconstruction.
We identify three self-supervised learning objectives peculiar to point clouds, namely centroid prediction, normal estimation, and curvature prediction.
Our pipeline is conceptually simple and it consists of two major steps: first, it randomly masks out groups of points, followed by a Transformer-based point cloud encoder.
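The three geometric targets named above can each be derived from a patch's local covariance; below is a minimal PCA-based sketch (a common recipe, possibly differing in detail from the paper):

```python
import numpy as np

def geometric_targets(group):
    """Compute centroid, normal, and curvature targets for one point group.

    group: (K, 3) array of points in a masked patch (K >= 3).
    The normal is the eigenvector of the smallest covariance eigenvalue;
    curvature is the standard 'surface variation' ratio in [0, 1/3].
    """
    centroid = group.mean(axis=0)
    cov = np.cov((group - centroid).T)       # 3x3 covariance of the patch
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    normal = eigvecs[:, 0]                   # direction of least variance
    curvature = eigvals[0] / eigvals.sum()   # 0 for a perfect plane
    return centroid, normal, curvature

# example: a nearly planar patch should have near-zero curvature
patch = np.random.rand(64, 3)
patch[:, 2] = 0.01 * np.random.rand(64)      # nearly flat in z
c, nrm, kappa = geometric_targets(patch)
```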
arXiv Detail & Related papers (2023-05-15T17:14:55Z)
- Weakly Supervised Semantic Segmentation for Large-Scale Point Cloud [69.36717778451667]
Existing methods for large-scale point cloud semantic segmentation require expensive, tedious and error-prone manual point-wise annotations.
We propose an effective weakly supervised method containing two components to solve the problem.
The experimental results show a large gain over existing weakly supervised methods and comparable results to fully supervised methods.
arXiv Detail & Related papers (2022-12-09T09:42:26Z)
- EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder [60.52613206271329]
This paper introduces Efficient Point Cloud Learning (EPCL) for training high-quality point cloud models with a frozen CLIP transformer.
Our EPCL connects the 2D and 3D modalities by semantically aligning the image features and point cloud features without paired 2D-3D data.
arXiv Detail & Related papers (2022-12-08T06:27:11Z)
- PointCaM: Cut-and-Mix for Open-Set Point Cloud Learning [72.07350827773442]
We propose to solve open-set point cloud learning using a novel Point Cut-and-Mix mechanism.
We use the Unknown-Point Simulator to simulate out-of-distribution data in the training stage.
The Unknown-Point Estimator module learns to exploit the point cloud's feature context for discriminating the known and unknown data.
arXiv Detail & Related papers (2022-12-05T03:53:51Z)
- Self-Supervised Arbitrary-Scale Point Clouds Upsampling via Implicit Neural Representation [79.60988242843437]
We propose a novel approach that achieves self-supervised, magnification-flexible point cloud upsampling.
Experimental results demonstrate that our self-supervised scheme achieves performance competitive with or even better than supervised state-of-the-art methods.
arXiv Detail & Related papers (2022-04-18T07:18:25Z)
- Upsampling Autoencoder for Self-Supervised Point Cloud Learning [11.19408173558718]
We propose a self-supervised pretraining model for point cloud learning without human annotations.
The upsampling operation encourages the network to capture both high-level semantic information and low-level geometric information of the point cloud.
We find that our UAE outperforms previous state-of-the-art methods in shape classification, part segmentation and point cloud upsampling tasks.
arXiv Detail & Related papers (2022-03-21T07:20:37Z)
- Unsupervised Representation Learning for 3D Point Cloud Data [66.92077180228634]
We propose a simple yet effective approach for unsupervised point cloud learning.
In particular, we identify a very useful transformation which generates a good contrastive version of an original point cloud.
We conduct experiments on three downstream tasks which are 3D object classification, shape part segmentation and scene segmentation.
arXiv Detail & Related papers (2021-10-13T10:52:45Z)
- Unsupervised Point Cloud Pre-Training via Occlusion Completion [18.42664414305454]
We describe a simple pre-training approach for point clouds.
It works in three steps: 1. mask all points occluded in a camera view; 2. learn an encoder-decoder model to reconstruct the occluded points; 3. use the encoder weights as initialisation for downstream point cloud tasks.
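The first step (masking points occluded from a camera view) can be approximated with a coarse z-buffer; the sketch below fixes the camera along the +z axis and keeps only the nearest point per raster cell, a simplification of the paper's procedure:

```python
import numpy as np

def occlusion_mask(points, grid=32):
    """Mark points occluded from a camera looking down the +z axis.

    points: (N, 3) array. Projects points onto the x/y plane, buckets
    them into a grid x grid raster, and keeps only the nearest point
    per cell; returns a boolean array (True = occluded).
    """
    depth = points[:, 2]
    proj = points[:, :2]
    lo, hi = proj.min(axis=0), proj.max(axis=0)
    cells = np.floor((proj - lo) / (hi - lo + 1e-9) * (grid - 1)).astype(int)
    cell_id = cells[:, 0] * grid + cells[:, 1]
    visible = np.zeros(len(points), dtype=bool)
    for cid in np.unique(cell_id):
        members = np.where(cell_id == cid)[0]
        visible[members[np.argmin(depth[members])]] = True   # nearest wins
    return ~visible

# demo: two layers at the same x/y positions; the back layer is occluded
g = np.linspace(0.0, 1.0, 10)
xy = np.stack(np.meshgrid(g, g), axis=-1).reshape(-1, 2)     # 100 positions
pts = np.vstack([np.c_[xy, np.zeros(100)],                   # front, z = 0
                 np.c_[xy, np.ones(100)]])                   # back,  z = 1
mask = occlusion_mask(pts)
```

Steps 2 and 3 would then train an encoder-decoder to reconstruct `pts[mask]` from `pts[~mask]` and reuse the encoder weights for downstream tasks.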
arXiv Detail & Related papers (2020-10-02T16:43:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.