Regress Before Construct: Regress Autoencoder for Point Cloud
Self-supervised Learning
- URL: http://arxiv.org/abs/2310.03670v1
- Date: Mon, 25 Sep 2023 17:23:33 GMT
- Title: Regress Before Construct: Regress Autoencoder for Point Cloud
Self-supervised Learning
- Authors: Yang Liu, Chen Chen, Can Wang, Xulin King, Mengyuan Liu
- Abstract summary: Masked Autoencoders (MAE) have demonstrated promising performance in self-supervised learning for 2D and 3D computer vision.
We propose Point Regress AutoEncoder (Point-RAE), a new scheme for regressive autoencoders for point cloud self-supervised learning.
Our approach is efficient during pre-training and generalizes well on various downstream tasks.
- Score: 18.10704604275133
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Masked Autoencoders (MAE) have demonstrated promising performance in
self-supervised learning for both 2D and 3D computer vision. Nevertheless,
existing MAE-based methods still have certain drawbacks. Firstly, the
functional decoupling between the encoder and decoder is incomplete, which
limits the encoder's representation learning ability. Secondly, downstream
tasks solely utilize the encoder, failing to fully leverage the knowledge
acquired through the encoder-decoder architecture in the pretext task. In this
paper, we propose Point Regress AutoEncoder (Point-RAE), a new scheme for
regressive autoencoders for point cloud self-supervised learning. The proposed
method decouples the functions of the encoder and the decoder by introducing a
mask regressor: the regressor predicts the masked patch representations from the
visible patch representations produced by the encoder, and the decoder
reconstructs the target from the predicted masked patch representations. By
doing so, we minimize the impact of decoder updates on the representation space
of the encoder.
Moreover, we introduce an alignment constraint to ensure that the
representations for masked patches, predicted from the encoded representations
of visible patches, are aligned with the masked patch representations computed
from the encoder. To make full use of the knowledge learned in the pre-training
stage, we design a new fine-tuning mode for the proposed Point-RAE. Extensive
experiments demonstrate that our approach is efficient during pre-training and
generalizes well on various downstream tasks. Specifically, our pre-trained
models achieve a high accuracy of 90.28% on the ScanObjectNN hardest split and
94.1% accuracy on ModelNet40, surpassing all other self-supervised learning
methods. Our code and pretrained model are publicly available at:
https://github.com/liuyyy111/Point-RAE.
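As a rough illustration of the scheme described in the abstract, the sketch below shows one possible encode-regress-decode forward pass with an alignment loss between predicted and encoded masked-patch representations. It is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: the class and helper names, dimensions, the linear patch embedder/decoder, the stop-gradient target, and the MSE losses are all simplifying choices made here.

    # Hypothetical sketch of a Point-RAE-style pre-training step (not the official code).
    # Simplifications: a linear patch embedder and decoder stand in for the usual
    # mini-PointNet and Transformer decoder, positional embeddings are omitted, and
    # plain MSE replaces the Chamfer-style reconstruction loss common for point clouds.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def take(x, idx):
        # Gather rows of x (B, N, D) at per-sample indices idx (B, K).
        return torch.gather(x, 1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))

    class PointRAESketch(nn.Module):
        def __init__(self, dim=384, pts_per_patch=32, mask_ratio=0.6):
            super().__init__()
            self.mask_ratio = mask_ratio
            self.embed = nn.Linear(3 * pts_per_patch, dim)
            enc_layer = nn.TransformerEncoderLayer(dim, nhead=6, batch_first=True)
            reg_layer = nn.TransformerEncoderLayer(dim, nhead=6, batch_first=True)
            self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)    # sees visible patches only
            self.regressor = nn.TransformerEncoder(reg_layer, num_layers=2)  # predicts masked representations
            self.decoder = nn.Linear(dim, 3 * pts_per_patch)                 # reconstructs patch points
            self.mask_token = nn.Parameter(torch.zeros(1, 1, dim))

        def forward(self, patches):
            # patches: (B, N, pts_per_patch, 3) grouped point-cloud patches
            B, N = patches.shape[:2]
            flat = patches.flatten(2)                      # (B, N, 3 * pts_per_patch)
            tokens = self.embed(flat)
            n_mask = int(N * self.mask_ratio)
            perm = torch.rand(B, N, device=patches.device).argsort(dim=1)
            mask_idx, vis_idx = perm[:, :n_mask], perm[:, n_mask:]

            # Encode only the visible patches.
            vis_repr = self.encoder(take(tokens, vis_idx))

            # Mask regressor: predict masked-patch representations from visible ones.
            queries = self.mask_token.expand(B, n_mask, -1)
            pred_repr = self.regressor(torch.cat([vis_repr, queries], dim=1))[:, -n_mask:]

            # Alignment constraint: predictions should match the encoder's own
            # representations of the masked patches (treated as a fixed target here).
            with torch.no_grad():
                target_repr = self.encoder(take(tokens, mask_idx))
            align_loss = F.mse_loss(pred_repr, target_repr)

            # The decoder reconstructs masked patches from the predicted representations,
            # so its gradients flow through the regressor rather than reshaping the
            # encoder's representation space directly.
            rec = self.decoder(pred_repr)
            rec_loss = F.mse_loss(rec, take(flat, mask_idx))
            return rec_loss + align_loss

A full implementation would also supply positional information for the masked patches and use a point-cloud reconstruction loss such as Chamfer distance; both are omitted here for brevity.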
Related papers
- PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders [57.31790812209751]
We show that when the centers of masked patches are fed directly to the decoder, without any information from the encoder, the decoder still reconstructs well.
We propose a simple yet effective method, i.e., learning to Predict Centers for Point Masked AutoEncoders (PCP-MAE).
Our method is of high pre-training efficiency compared to other alternatives and achieves great improvement over Point-MAE.
arXiv Detail & Related papers (2024-08-16T13:53:53Z)
- SeRP: Self-Supervised Representation Learning Using Perturbed Point Clouds [6.29475963948119]
SeRP consists of an encoder-decoder architecture that takes perturbed or corrupted point clouds as inputs.
We have used Transformers and PointNet-based Autoencoders.
arXiv Detail & Related papers (2022-09-13T15:22:36Z)
- MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition [160.49403075559158]
We propose a Masked Pseudo-Labeling autoEncoder (MAPLE) framework for point cloud action recognition.
In particular, we design a novel and efficient Decoupled spatial-temporal TransFormer (DestFormer) as the backbone of MAPLE.
MAPLE achieves superior results on three public benchmarks and outperforms the state-of-the-art method by 8.08% accuracy on the MSR-Action3D dataset.
arXiv Detail & Related papers (2022-09-01T12:32:40Z)
- SdAE: Self-distillated Masked Autoencoder [95.3684955370897]
Self-distillated masked AutoEncoder network SdAE is proposed in this paper.
With only 300 epochs pre-training, a vanilla ViT-Base model achieves an 84.1% fine-tuning accuracy on ImageNet-1k classification.
arXiv Detail & Related papers (2022-07-31T15:07:25Z)
- Improvements to Self-Supervised Representation Learning for Masked Image Modeling [0.0]
This paper explores improvements to the masked image modeling (MIM) paradigm.
The MIM paradigm enables the model to learn the main object features of the image by masking the input image and predicting the masked part from the unmasked part.
We propose a new model, Contrastive Masked AutoEncoders (CMAE).
arXiv Detail & Related papers (2022-05-21T09:45:50Z)
- Self-Supervised Point Cloud Representation Learning with Occlusion Auto-Encoder [63.77257588569852]
We present 3D Occlusion Auto-Encoder (3D-OAE) for learning representations for point clouds.
Our key idea is to randomly occlude some local patches of the input point cloud and establish the supervision via recovering the occluded patches.
In contrast with previous methods, our 3D-OAE can remove a large proportion of patches and predict them only with a small number of visible patches.
arXiv Detail & Related papers (2022-03-26T14:06:29Z)
- Context Autoencoder for Self-Supervised Representation Learning [64.63908944426224]
We pretrain an encoder by making predictions in the encoded representation space.
The network is an encoder-regressor-decoder architecture.
We demonstrate the effectiveness of our CAE through superior transfer performance in downstream tasks.
arXiv Detail & Related papers (2022-02-07T09:33:45Z)
- Masked Autoencoders Are Scalable Vision Learners [60.97703494764904]
Masked autoencoders (MAE) are scalable self-supervised learners for computer vision.
Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels (a generic sketch of this masking step follows the list).
Coupling these two designs enables us to train large models efficiently and effectively.
arXiv Detail & Related papers (2021-11-11T18:46:40Z)
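Several of the methods listed above (MAE, SdAE, 3D-OAE, and Point-RAE itself) share the same first step: randomly mask a large fraction of patch tokens and encode only the visible ones. The sketch below is a generic, illustrative version of that step in plain PyTorch; the function name, the 75% ratio, and the token shapes are assumptions here, not code from any of the papers.

    # Illustrative random patch masking shared by MAE-style methods (not taken from any paper above).
    import torch

    def random_masking(tokens, mask_ratio=0.75):
        # Split patch tokens (B, N, D) into visible tokens plus the kept/masked indices.
        B, N, D = tokens.shape
        n_keep = int(N * (1 - mask_ratio))
        noise = torch.rand(B, N, device=tokens.device)    # one random score per patch
        shuffle = noise.argsort(dim=1)                    # random permutation per sample
        keep_idx, mask_idx = shuffle[:, :n_keep], shuffle[:, n_keep:]
        visible = torch.gather(tokens, 1, keep_idx.unsqueeze(-1).expand(-1, -1, D))
        return visible, keep_idx, mask_idx

    # Example: 196 patch tokens (a 14x14 grid) with 768-dim embeddings, 75% masked.
    visible, keep_idx, mask_idx = random_masking(torch.randn(2, 196, 768))
    print(visible.shape)  # torch.Size([2, 49, 768])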