Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
- URL: http://arxiv.org/abs/2111.14819v1
- Date: Mon, 29 Nov 2021 18:59:03 GMT
- Title: Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
- Authors: Xumin Yu, Lulu Tang, Yongming Rao, Tiejun Huang, Jie Zhou, Jiwen Lu
- Abstract summary: We present Point-BERT, a new paradigm for learning Transformers to generalize the concept of BERT to 3D point clouds.
Experiments demonstrate that the proposed BERT-style pre-training strategy significantly improves the performance of standard point cloud Transformers.
- Score: 104.82953953453503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Point-BERT, a new paradigm for learning Transformers to generalize
the concept of BERT to 3D point clouds. Inspired by BERT, we devise a Masked
Point Modeling (MPM) task to pre-train point cloud Transformers. Specifically,
we first divide a point cloud into several local point patches, and a point
cloud Tokenizer with a discrete Variational AutoEncoder (dVAE) is designed to
generate discrete point tokens containing meaningful local information. Then,
we randomly mask out some patches of input point clouds and feed them into the
backbone Transformers. The pre-training objective is to recover the original
point tokens at the masked locations under the supervision of point tokens
obtained by the Tokenizer. Extensive experiments demonstrate that the proposed
BERT-style pre-training strategy significantly improves the performance of
standard point cloud Transformers. Equipped with our pre-training strategy, we
show that a pure Transformer architecture attains 93.8% accuracy on ModelNet40
and 83.1% accuracy on the hardest setting of ScanObjectNN, surpassing carefully
designed point cloud models while using far fewer hand-crafted components. We also
demonstrate that the representations learned by Point-BERT transfer well to new
tasks and domains, where our models largely advance the state of the art on the
few-shot point cloud classification task. The code and pre-trained models are
available at https://github.com/lulutang0608/Point-BERT
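The abstract above outlines the Masked Point Modeling (MPM) recipe: group a cloud into local patches, map each patch to a discrete token with a pre-trained dVAE tokenizer, mask a subset of patches, and train the Transformer to predict the tokens at the masked positions. The following is a minimal sketch of that objective only; the patch grouping, dVAE tokenizer, and backbone are replaced with random stand-ins, and the mask ratio and vocabulary size are illustrative values rather than the paper's settings.

```python
# Minimal sketch of the Masked Point Modeling (MPM) objective.
# The dVAE tokenizer and Transformer backbone are stand-ins (random tensors);
# only the masking and the masked-token cross-entropy loss are illustrated.
import torch
import torch.nn as nn

def random_patch_mask(num_patches: int, mask_ratio: float = 0.4) -> torch.Tensor:
    """Boolean mask selecting roughly `mask_ratio` of the patches (illustrative ratio)."""
    num_masked = max(1, int(num_patches * mask_ratio))
    perm = torch.randperm(num_patches)
    mask = torch.zeros(num_patches, dtype=torch.bool)
    mask[perm[:num_masked]] = True
    return mask

num_patches, vocab_size = 64, 8192  # assumed sizes, not the paper's exact config

# Stand-in for the frozen dVAE tokenizer: one discrete token id per local patch.
target_tokens = torch.randint(0, vocab_size, (num_patches,))

# Stand-in for the Transformer backbone: per-patch logits over the token vocabulary.
backbone_logits = torch.randn(num_patches, vocab_size)

mask = random_patch_mask(num_patches)

# Pre-training objective: recover the tokenizer's tokens at the masked locations.
loss = nn.functional.cross_entropy(backbone_logits[mask], target_tokens[mask])
print(f"MPM loss over {int(mask.sum())} masked patches: {loss.item():.4f}")
```

In the actual model, the logits come from a Transformer fed the visible patch embeddings plus mask tokens, and the target ids come from the dVAE tokenizer trained beforehand on a reconstruction objective.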
Related papers
- Pre-training Point Cloud Compact Model with Partial-aware Reconstruction [51.403810709250024]
We present a pre-trained Point Cloud Compact Model with Partial-aware Reconstruction, named Point-CPR.
Our model exhibits strong performance across various tasks, especially surpassing the leading MPM-based model PointGPT-B with only 2% of its parameters.
arXiv Detail & Related papers (2024-07-12T15:18:14Z) - Adaptive Point Transformer [88.28498667506165]
Adaptive Point Cloud Transformer (AdaPT) is a standard PT model augmented by an adaptive token selection mechanism.
AdaPT dynamically reduces the number of tokens during inference, enabling efficient processing of large point clouds.
arXiv Detail & Related papers (2024-01-26T13:24:45Z) - PointGPT: Auto-regressively Generative Pre-training from Point Clouds [45.488532108226565]
We present PointGPT, a novel approach that extends the concept of GPT to point clouds.
Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models.
Our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models.
arXiv Detail & Related papers (2023-05-19T07:39:04Z) - PointPatchMix: Point Cloud Mixing with Patch Scoring [58.58535918705736]
We propose PointPatchMix, which mixes point clouds at the patch level and generates content-based targets for mixed point clouds.
Our approach preserves local features at the patch level, while the patch scoring module assigns targets based on the content-based significance score from a pre-trained teacher model.
With Point-MAE as our baseline, our model surpasses previous methods by a significant margin, achieving 86.3% accuracy on ScanObjectNN and 94.1% accuracy on ModelNet40.
arXiv Detail & Related papers (2023-03-12T14:49:42Z) - AdaPoinTr: Diverse Point Cloud Completion with Adaptive Geometry-Aware
Transformers [94.11915008006483]
We present a new method that reformulates point cloud completion as a set-to-set translation problem.
We design a new model, called PoinTr, which adopts a Transformer encoder-decoder architecture for point cloud completion.
Our method attains 6.53 CD on PCN, 0.81 CD on ShapeNet-55 and 0.392 MMD on real-world KITTI.
arXiv Detail & Related papers (2023-01-11T16:14:12Z) - Point-McBert: A Multi-choice Self-supervised Framework for Point Cloud
Pre-training [6.037383467521294]
We propose Point-McBert, a pre-training framework with eased and refined supervision signals.
Specifically, we ease the previous single-choice constraint on patches, and provide multi-choice token ids for each patch as supervision.
Our method achieves 94.1% accuracy on ModelNet40, 84.28% accuracy on the hardest setting of ScanObjectNN and new state-of-the-art performance on few-shot learning.
arXiv Detail & Related papers (2022-07-27T00:34:33Z) - POS-BERT: Point Cloud One-Stage BERT Pre-Training [34.30767607646814]
We propose POS-BERT, a one-stage BERT pre-training method for point clouds.
Unlike Point-BERT, which requires a separately trained and then frozen tokenizer, POS-BERT trains its tokenizer jointly in a single stage.
POS-BERT achieves state-of-the-art classification accuracy, exceeding Point-BERT by 3.5%.
arXiv Detail & Related papers (2022-04-03T04:49:39Z) - Masked Autoencoders for Point Cloud Self-supervised Learning [27.894216954216716]
We propose a neat scheme of masked autoencoders for point cloud self-supervised learning.
We divide the input point cloud into irregular point patches and randomly mask them at a high ratio.
A standard Transformer-based autoencoder with an asymmetric design and a mask-token shifting operation learns high-level latent features from the unmasked point patches.
arXiv Detail & Related papers (2022-03-13T09:23:39Z) - PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers [81.71904691925428]
We present a new method that reformulates point cloud completion as a set-to-set translation problem.
We also design a new model, called PoinTr, that adopts a transformer encoder-decoder architecture for point cloud completion.
Our method outperforms state-of-the-art methods by a large margin on both the new benchmarks and the existing ones.
arXiv Detail & Related papers (2021-08-19T17:58:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.