Masked Autoencoders in 3D Point Cloud Representation Learning
- URL: http://arxiv.org/abs/2207.01545v2
- Date: Mon, 11 Sep 2023 11:33:58 GMT
- Title: Masked Autoencoders in 3D Point Cloud Representation Learning
- Authors: Jincen Jiang, Xuequan Lu, Lizhi Zhao, Richard Dazeley, Meili Wang
- Abstract summary: We propose Masked Autoencoders in 3D Point Cloud Representation Learning (abbreviated as MAE3D).
We first split the input point cloud into patches and mask a portion of them, then use our Patch Embedding Module to extract the features of unmasked patches.
Comprehensive experiments demonstrate that the local features extracted by our MAE3D from point cloud patches are beneficial for downstream classification tasks.
- Score: 7.617783375837524
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer-based self-supervised representation learning methods
learn generic features from unlabeled datasets, providing useful network
initialization parameters for downstream tasks. However, self-supervised
learning based upon masking local surface patches of 3D point cloud data
remains under-explored. In this paper, we propose Masked Autoencoders in 3D
Point Cloud Representation Learning (abbreviated as MAE3D), a novel
autoencoding paradigm for self-supervised learning. We first split the input
point cloud into patches and mask a portion of them, then use our Patch
Embedding Module to extract the features of unmasked patches. Second, we
employ patch-wise MAE3D Transformers to learn both local features of point
cloud patches and high-level contextual relationships between patches, and to
complete the latent representations of masked patches. Finally, our Point
Cloud Reconstruction Module with a multi-task loss completes the incomplete
point cloud. We conduct self-supervised pre-training on ShapeNet55 with a
point cloud completion pre-text task and fine-tune the pre-trained model on
ModelNet40 and ScanObjectNN (PB_T50_RS, the hardest variant). Comprehensive
experiments demonstrate that the local features extracted by our MAE3D from
point cloud patches are beneficial for downstream classification tasks,
soundly outperforming state-of-the-art methods (93.4% and 86.2% classification
accuracy, respectively).
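For a concrete picture of the pre-text pipeline the abstract describes, the sketch below splits a cloud into patches via farthest point sampling plus k-nearest-neighbour grouping, masks a portion of them, and scores a reconstruction with the Chamfer distance commonly used for completion. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the patch count, patch size, and 60% mask ratio are illustrative, and the paper's multi-task loss is reduced here to a single Chamfer term.

```python
import numpy as np

def farthest_point_sampling(points, num_centers):
    """Pick `num_centers` well-spread patch centers from an (N, 3) cloud."""
    n = points.shape[0]
    centers = [np.random.randint(n)]
    dist = np.full(n, np.inf)
    for _ in range(num_centers - 1):
        dist = np.minimum(dist, np.linalg.norm(points - points[centers[-1]], axis=1))
        centers.append(int(dist.argmax()))
    return np.array(centers)

def split_into_patches(points, num_patches=64, patch_size=32):
    """Group the cloud into (num_patches, patch_size, 3) local patches."""
    center_idx = farthest_point_sampling(points, num_patches)
    d = np.linalg.norm(points[None, :, :] - points[center_idx][:, None, :], axis=-1)
    knn_idx = np.argsort(d, axis=1)[:, :patch_size]  # k nearest points per center
    return points[knn_idx]

def mask_patches(patches, mask_ratio=0.6):
    """Randomly hide a portion of the patches; return (visible, masked)."""
    num_mask = int(len(patches) * mask_ratio)
    perm = np.random.permutation(len(patches))
    return patches[perm[num_mask:]], patches[perm[:num_mask]]

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between (N, 3) and (M, 3) clouds."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return d.min(axis=1).mean() + d.min(axis=0).mean()

# Toy usage: patchify a random cloud, mask 60% of the patches, and score a
# (dummy) reconstruction of the masked region against the full cloud.
cloud = np.random.randn(1024, 3).astype(np.float32)
patches = split_into_patches(cloud)      # (64, 32, 3)
visible, masked = mask_patches(patches)  # (26, 32, 3) and (38, 32, 3)
print(chamfer_distance(masked.reshape(-1, 3), cloud))
```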
Related papers
- PointRegGPT: Boosting 3D Point Cloud Registration using Generative Point-Cloud Pairs for Training [90.06520673092702]
We present PointRegGPT, boosting 3D point cloud registration using generative point-cloud pairs for training.
To our knowledge, this is the first generative approach that explores realistic data generation for indoor point cloud registration.
arXiv Detail & Related papers (2024-07-19T06:29:57Z)
- Clustering based Point Cloud Representation Learning for 3D Analysis [80.88995099442374]
We propose a clustering based supervised learning scheme for point cloud analysis.
Unlike the current de facto scene-wise training paradigm, our algorithm conducts within-class clustering in the point embedding space.
Our algorithm shows notable improvements on well-known point cloud segmentation datasets.
arXiv Detail & Related papers (2023-07-27T03:42:12Z)
- CPCM: Contextual Point Cloud Modeling for Weakly-supervised Point Cloud Semantic Segmentation [60.0893353960514]
We study the task of weakly-supervised point cloud semantic segmentation with sparse annotations.
We propose a Contextual Point Cloud Modeling (CPCM) method that consists of two parts: a region-wise masking (RegionMask) strategy and a contextual masked training (CMT) method.
arXiv Detail & Related papers (2023-07-19T04:41:18Z)
- Self-supervised adversarial masking for 3D point cloud representation learning [0.38233569758620056]
We introduce PointCAM, a novel adversarial method for learning a masking function for point clouds.
Compared to previous techniques, we apply an auxiliary network that learns how to select masks instead of choosing them randomly.
Our results show that the learned masking function achieves state-of-the-art or competitive performance on various downstream tasks.
arXiv Detail & Related papers (2023-07-11T15:11:06Z)
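As a concrete reading of the adversarial masking idea in the entry above, here is a hypothetical PyTorch sketch of an auxiliary scorer that proposes which patch tokens to mask. It is not PointCAM's actual architecture; in the adversarial setup the scorer would be trained to maximize the autoencoder's reconstruction loss while the autoencoder minimizes it.

```python
import torch
import torch.nn as nn

class MaskScorer(nn.Module):
    """Scores each patch token; the top-scoring patches get masked."""
    def __init__(self, dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, dim // 2), nn.GELU(), nn.Linear(dim // 2, 1))

    def forward(self, tokens, mask_ratio=0.6):
        # tokens: (B, P, dim) patch embeddings.
        scores = self.net(tokens).squeeze(-1)            # (B, P) mask scores
        num_mask = int(tokens.shape[1] * mask_ratio)
        mask_idx = scores.topk(num_mask, dim=1).indices  # learned, not random
        return mask_idx, scores

# Toy usage: score 64 patch tokens per cloud and pick the 38 to mask.
scorer = MaskScorer()
mask_idx, scores = scorer(torch.randn(2, 64, 256))
print(mask_idx.shape)  # torch.Size([2, 38])
```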
- EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder [60.52613206271329]
This paper introduces Efficient Point Cloud Learning (EPCL) for training high-quality point cloud models with a frozen CLIP transformer.
Our EPCL connects the 2D and 3D modalities by semantically aligning the image features and point cloud features without paired 2D-3D data.
arXiv Detail & Related papers (2022-12-08T06:27:11Z)
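The EPCL entry above hinges on a frozen 2D transformer serving as the point cloud encoder. The sketch below illustrates that general freeze-the-backbone pattern under stated assumptions: a generic nn.TransformerEncoder stands in for the CLIP image transformer, and only the point tokenizer and task head receive gradients.

```python
import torch
import torch.nn as nn

dim, num_classes = 512, 40
tokenizer = nn.Sequential(                 # trainable: lifts points to tokens
    nn.Linear(3, dim), nn.GELU(), nn.Linear(dim, dim))
backbone = nn.TransformerEncoder(          # frozen stand-in for a CLIP ViT
    nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=12)
backbone.requires_grad_(False)
head = nn.Linear(dim, num_classes)         # trainable task head

centers = torch.randn(2, 64, 3)            # (B, P, 3) toy patch centers
tokens = tokenizer(centers)                # (B, P, dim) point tokens
features = backbone(tokens).mean(dim=1)    # frozen transformer, mean-pooled
logits = head(features)                    # gradients reach tokenizer + head only
print(logits.shape)                        # torch.Size([2, 40])
```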
- SeRP: Self-Supervised Representation Learning Using Perturbed Point Clouds [6.29475963948119]
SeRP consists of an encoder-decoder architecture that takes perturbed or corrupted point clouds as inputs.
It uses Transformer- and PointNet-based autoencoders.
arXiv Detail & Related papers (2022-09-13T15:22:36Z)
- Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training [56.81809311892475]
Masked Autoencoders (MAE) have shown great potential in self-supervised pre-training for language and 2D image transformers.
We propose Point-M2AE, a strong Multi-scale MAE pre-training framework for hierarchical self-supervised learning of 3D point clouds.
arXiv Detail & Related papers (2022-05-28T11:22:53Z)
- Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder [63.77257588569852]
We present 3D Occlusion Auto-Encoder (3D-OAE) for learning representations for point clouds.
Our key idea is to randomly occlude some local patches of the input point cloud and establish the supervision via recovering the occluded patches.
In contrast with previous methods, our 3D-OAE can remove a large proportion of patches and predict them only with a small number of visible patches.
arXiv Detail & Related papers (2022-03-26T14:06:29Z)
- Masked Discrimination for Self-Supervised Learning on Point Clouds [27.652157544218234]
Masked autoencoding has achieved great success for self-supervised learning in the image and language domains.
Standard backbones like PointNet are unable to properly handle the training versus testing distribution mismatch introduced by masking during training.
We bridge this gap by proposing a discriminative mask pretraining Transformer framework, MaskPoint, for point clouds.
arXiv Detail & Related papers (2022-03-21T17:57:34Z)
- Masked Autoencoders for Point Cloud Self-supervised Learning [27.894216954216716]
We propose a neat scheme of masked autoencoders for point cloud self-supervised learning.
We divide the input point cloud into irregular point patches and randomly mask them at a high ratio.
A standard Transformer-based autoencoder, with an asymmetric design and a shifting mask tokens operation, learns high-level latent features from unmasked point patches (a minimal sketch of this asymmetric pattern follows below).
arXiv Detail & Related papers (2022-03-13T09:23:39Z)
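Several entries above (MAE3D, 3D-OAE, Point-MAE) share the asymmetric masked-autoencoder pattern: encode only the visible patches, then let a lightweight decoder with learnable mask tokens reconstruct the rest. The PyTorch sketch below is a minimal, hypothetical rendering of that pattern; the dimensions, depths, and 60% mask ratio are assumptions, and it omits paper-specific details such as Point-MAE's shifting mask tokens operation.

```python
import torch
import torch.nn as nn

class MaskedPointAutoencoder(nn.Module):
    def __init__(self, patch_size=32, dim=256, mask_ratio=0.6):
        super().__init__()
        self.mask_ratio = mask_ratio
        # Per-patch embedding: a tiny PointNet-style MLP with max-pooling.
        self.patch_embed = nn.Sequential(
            nn.Linear(3, dim), nn.GELU(), nn.Linear(dim, dim))
        self.pos_embed = nn.Linear(3, dim)           # position = patch center
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.encoder = nn.TransformerEncoder(        # sees visible patches only
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=6)
        self.decoder = nn.TransformerEncoder(        # lightweight decoder
            nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True), num_layers=2)
        self.head = nn.Linear(dim, patch_size * 3)   # regress masked coordinates

    def forward(self, patches, centers):
        # patches: (B, P, K, 3) grouped points; centers: (B, P, 3) patch centers.
        B, P, K, _ = patches.shape
        num_mask = int(P * self.mask_ratio)
        shuffle = torch.rand(B, P, device=patches.device).argsort(dim=1)
        vis_idx, mask_idx = shuffle[:, num_mask:], shuffle[:, :num_mask]

        # Asymmetric step: only visible patch tokens go through the encoder.
        tokens = self.patch_embed(patches).max(dim=2).values + self.pos_embed(centers)
        dim = tokens.shape[-1]
        vis_tokens = torch.gather(tokens, 1, vis_idx[..., None].expand(-1, -1, dim))
        encoded = self.encoder(vis_tokens)

        # Append mask tokens carrying the masked patches' positions.
        mask_centers = torch.gather(centers, 1, mask_idx[..., None].expand(-1, -1, 3))
        mask_tokens = self.mask_token.expand(B, num_mask, -1) + self.pos_embed(mask_centers)
        decoded = self.decoder(torch.cat([encoded, mask_tokens], dim=1))

        # Predict coordinates only for the masked patches.
        pred = self.head(decoded[:, -num_mask:]).reshape(B, num_mask, K, 3)
        target = torch.gather(patches, 1, mask_idx[..., None, None].expand(-1, -1, K, 3))
        return pred, target  # feed these to a Chamfer-style reconstruction loss

# Toy usage with 64 patches of 32 points each.
model = MaskedPointAutoencoder()
pred, target = model(torch.randn(2, 64, 32, 3), torch.randn(2, 64, 3))
print(pred.shape, target.shape)  # torch.Size([2, 38, 32, 3]) twice
```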