PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
- URL: http://arxiv.org/abs/2408.08753v2
- Date: Thu, 24 Oct 2024 14:55:22 GMT
- Title: PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
- Authors: Xiangdong Zhang, Shaofeng Zhang, Junchi Yan
- Abstract summary: We show that when the centers of masked patches are fed directly to the decoder, without any information from the encoder, the decoder still reconstructs well.
We propose a simple yet effective method, learning to Predict Centers for Point Masked AutoEncoders (PCP-MAE).
Our method offers high pre-training efficiency compared to other alternatives and achieves a substantial improvement over Point-MAE.
- Score: 57.31790812209751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Masked autoencoders have been widely explored for point cloud self-supervised learning, whereby the point cloud is generally divided into visible and masked parts. These methods typically include an encoder that accepts the visible patches (normalized) and the corresponding patch centers (positions) as input, and a decoder that accepts the encoder output together with the centers (positions) of the masked parts to reconstruct each point in the masked patches. The pre-trained encoder is then used for downstream tasks. In this paper, we show a motivating empirical result: when the centers of masked patches are fed directly to the decoder, without any information from the encoder, the decoder still reconstructs well. In other words, the patch centers are important, and the reconstruction objective does not necessarily rely on the encoder's representations, which prevents the encoder from learning semantic representations. Based on this key observation, we propose a simple yet effective method, learning to Predict Centers for Point Masked AutoEncoders (PCP-MAE), which guides the model to predict these significant centers and uses the predicted centers in place of the directly provided ones. Specifically, we propose a Predicting Center Module (PCM) that shares parameters with the original encoder and uses extra cross-attention to predict centers. Our method has high pre-training efficiency compared to alternatives and achieves substantial improvement over Point-MAE, surpassing it by 5.50% on OBJ-BG, 6.03% on OBJ-ONLY, and 5.17% on PB-T50-RS for 3D object classification on the ScanObjectNN dataset. The code is available at https://github.com/aHapBean/PCP-MAE.
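The abstract describes the mechanism in enough detail to sketch it. The following is a minimal sketch in PyTorch, not the authors' implementation (see the linked repository for that); the module name `PredictingCenterModule`, the `mask_queries` argument, and the dimension and head counts are illustrative assumptions. It shows the two ingredients named in the abstract: transformer blocks shared with the encoder, plus an extra cross-attention through which one query per masked patch reads the visible tokens and predicts a 3D center.

```python
# Minimal sketch of the Predicting Center Module (PCM) idea from the abstract.
# NOT the authors' implementation (see https://github.com/aHapBean/PCP-MAE);
# module/argument names and sizes here are illustrative assumptions.
import torch
import torch.nn as nn

class PredictingCenterModule(nn.Module):
    """Predicts the 3D centers of masked patches from visible-patch tokens."""

    def __init__(self, shared_blocks: nn.ModuleList, dim: int = 384):
        super().__init__()
        self.blocks = shared_blocks           # shared with the encoder, not copied
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=6, batch_first=True)
        self.center_head = nn.Linear(dim, 3)  # token -> predicted (x, y, z) center

    def forward(self, visible_tokens, mask_queries):
        # visible_tokens: (B, N_vis, dim) tokens of visible patches
        # mask_queries:   (B, N_mask, dim) learnable queries, one per masked patch
        x = visible_tokens
        for blk in self.blocks:               # reuse the encoder's self-attention
            x = blk(x)
        # Extra cross-attention: masked-patch queries attend to visible tokens.
        q, _ = self.cross_attn(mask_queries, x, x)
        return self.center_head(q)            # (B, N_mask, 3) predicted centers

# Usage sketch: the same transformer layers serve both encoder and PCM.
dim = 384
blocks = nn.ModuleList(
    nn.TransformerEncoderLayer(dim, nhead=6, batch_first=True) for _ in range(3)
)
pcm = PredictingCenterModule(blocks, dim)
vis = torch.randn(2, 26, dim)                 # e.g. 26 visible patch tokens
queries = torch.randn(2, 38, dim)             # e.g. 38 masked-patch queries
centers = pcm(vis, queries)                   # -> (2, 38, 3)
```

In the full method, these predicted centers would replace the ground-truth masked centers as positional input to the decoder, so reconstruction can no longer bypass the encoder's representations.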
Related papers
- Pre-training Point Cloud Compact Model with Partial-aware Reconstruction [51.403810709250024]
We present a pre-trained Point cloud Compact Model with Partial-aware Reconstruction, named Point-CPR.
Our model exhibits strong performance across various tasks, especially surpassing the leading MPM-based model PointGPT-B with only 2% of its parameters.
arXiv Detail & Related papers (2024-07-12T15:18:14Z)
- Regress Before Construct: Regress Autoencoder for Point Cloud Self-supervised Learning [18.10704604275133]
Masked Autoencoders (MAE) have demonstrated promising performance in self-supervised learning for 2D and 3D computer vision.
We propose Point Regress AutoEncoder (Point-RAE), a new scheme for regressive autoencoders for point cloud self-supervised learning.
Our approach is efficient during pre-training and generalizes well on various downstream tasks.
arXiv Detail & Related papers (2023-09-25T17:23:33Z)
- SeRP: Self-Supervised Representation Learning Using Perturbed Point Clouds [6.29475963948119]
SeRP consists of an encoder-decoder architecture that takes perturbed or corrupted point clouds as input.
We use Transformer-based and PointNet-based autoencoders.
arXiv Detail & Related papers (2022-09-13T15:22:36Z)
- MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition [160.49403075559158]
We propose a Masked Pseudo-Labeling autoEncoder (MAPLE) framework for point cloud action recognition.
In particular, we design a novel and efficient Decoupled spatial-temporal TransFormer (DestFormer) as the backbone of MAPLE.
MAPLE achieves superior results on three public benchmarks and outperforms the state-of-the-art method by 8.08% accuracy on MSR-Action3D.
arXiv Detail & Related papers (2022-09-01T12:32:40Z)
- Bootstrapped Masked Autoencoders for Vision BERT Pretraining [142.5285802605117]
BootMAE improves the original masked autoencoders (MAE) with two core designs.
1) a momentum encoder that provides online features as extra BERT prediction targets; 2) a target-aware decoder that reduces the pressure on the encoder to memorize target-specific information during BERT pretraining (a sketch of the momentum-encoder update appears after this list).
arXiv Detail & Related papers (2022-07-14T17:59:58Z)
- Masked Autoencoders for Self-Supervised Learning on Automotive Point Clouds [2.8544513613730205]
Masked autoencoding has become a successful pre-training paradigm for Transformer models for text, images, and recently, point clouds.
We propose VoxelMAE, a simple masked autoencoding pretraining scheme designed for voxel representations.
Our method improves 3D object detection performance by 1.75 mAP points and 1.05 NDS on the challenging nuScenes dataset.
arXiv Detail & Related papers (2022-07-01T16:31:45Z)
- Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training [56.81809311892475]
Masked Autoencoders (MAE) have shown great potential in self-supervised pre-training for language and 2D image transformers.
We propose Point-M2AE, a strong Multi-scale MAE pre-training framework for hierarchical self-supervised learning of 3D point clouds.
arXiv Detail & Related papers (2022-05-28T11:22:53Z)
- Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder [63.77257588569852]
We present the 3D Occlusion Auto-Encoder (3D-OAE) for learning representations of point clouds.
Our key idea is to randomly occlude some local patches of the input point cloud and to supervise the model by recovering the occluded patches.
In contrast to previous methods, 3D-OAE can remove a large proportion of patches and predict them from only a small number of visible patches (a sketch of this patch grouping and masking step appears after this list).
arXiv Detail & Related papers (2022-03-26T14:06:29Z)
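Several of the masked-point-modeling papers above (Point-MAE itself, Point-M2AE, 3D-OAE) share the preprocessing step referenced in the 3D-OAE entry: sample patch centers, group neighboring points into patches normalized to those centers, and mask a large fraction of the patches. Below is a minimal sketch of that step; it is illustrative rather than any specific paper's code, and it substitutes uniform random sampling for the farthest point sampling (FPS) these methods typically use to pick centers.

```python
# Sketch of patch grouping + masking as used in masked point modeling.
# Illustrative assumptions: random center sampling instead of FPS, and
# the group/mask hyperparameters are typical values, not any paper's exact ones.
import torch

def group_and_mask(points: torch.Tensor, num_groups: int = 64,
                   group_size: int = 32, mask_ratio: float = 0.6):
    """points: (N, 3) -> visible and masked patches, normalized to their centers."""
    # 1) Pick patch centers (FPS in practice; uniform random sampling here).
    idx = torch.randperm(points.shape[0])[:num_groups]
    centers = points[idx]                                  # (G, 3)

    # 2) Group: each patch is the k nearest neighbors of its center.
    dists = torch.cdist(centers, points)                   # (G, N)
    knn = dists.topk(group_size, largest=False).indices    # (G, k)
    patches = points[knn] - centers[:, None, :]            # normalize to centers

    # 3) Mask a large fraction of the patches (often 60-90%).
    num_masked = int(mask_ratio * num_groups)
    perm = torch.randperm(num_groups)
    masked, visible = perm[:num_masked], perm[num_masked:]
    return patches[visible], centers[visible], patches[masked], centers[masked]

pc = torch.randn(1024, 3)                                  # toy point cloud
vis_p, vis_c, msk_p, msk_c = group_and_mask(pc)
print(vis_p.shape, msk_p.shape)                            # (26, 32, 3) (38, 32, 3)
```

An encoder then sees only the visible patches and their centers; PCP-MAE's observation is that also handing the masked centers directly to the decoder makes recovering the masked patches too easy.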
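The BootMAE entry above also mentions a momentum encoder as a source of extra prediction targets. The sketch below shows the standard exponential-moving-average (EMA) update behind such encoders; the momentum value, module shapes, and helper name `ema_update` are assumptions for illustration, not BootMAE's code.

```python
# Generic momentum (EMA) encoder sketch, as referenced in the BootMAE entry.
# Illustrative only: BootMAE's actual target computation and schedule differ.
import copy
import torch
import torch.nn as nn

online = nn.Sequential(nn.Linear(384, 384), nn.GELU(), nn.Linear(384, 384))
target = copy.deepcopy(online)                # momentum encoder starts as a copy
for p in target.parameters():
    p.requires_grad_(False)                   # updated by EMA, not by gradients

@torch.no_grad()
def ema_update(online: nn.Module, target: nn.Module, momentum: float = 0.999):
    """target <- momentum * target + (1 - momentum) * online, per parameter."""
    for p_o, p_t in zip(online.parameters(), target.parameters()):
        p_t.mul_(momentum).add_(p_o, alpha=1.0 - momentum)

# After each optimizer step on the online encoder:
tokens = torch.randn(2, 26, 384)
with torch.no_grad():
    feature_targets = target(tokens)          # extra prediction targets
ema_update(online, target)
```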