Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
- URL: http://arxiv.org/abs/2111.14819v1
- Date: Mon, 29 Nov 2021 18:59:03 GMT
- Title: Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling
- Authors: Xumin Yu, Lulu Tang, Yongming Rao, Tiejun Huang, Jie Zhou, Jiwen Lu
- Abstract summary: We present Point-BERT, a new paradigm for learning Transformers to generalize the concept of BERT to 3D point clouds.
Experiments demonstrate that the proposed BERT-style pre-training strategy significantly improves the performance of standard point cloud Transformers.
- Score: 104.82953953453503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Point-BERT, a new paradigm for learning Transformers to generalize
the concept of BERT to 3D point clouds. Inspired by BERT, we devise a Masked
Point Modeling (MPM) task to pre-train point cloud Transformers. Specifically,
we first divide a point cloud into several local point patches, and a point
cloud Tokenizer with a discrete Variational AutoEncoder (dVAE) is designed to
generate discrete point tokens containing meaningful local information. Then,
we randomly mask out some patches of input point clouds and feed them into the
backbone Transformers. The pre-training objective is to recover the original
point tokens at the masked locations under the supervision of point tokens
obtained by the Tokenizer. Extensive experiments demonstrate that the proposed
BERT-style pre-training strategy significantly improves the performance of
standard point cloud Transformers. Equipped with our pre-training strategy, we
show that a pure Transformer architecture attains 93.8% accuracy on ModelNet40
and 83.1% accuracy on the hardest setting of ScanObjectNN, surpassing carefully
designed point cloud models while using far fewer hand-crafted components. We also
demonstrate that the representations learned by Point-BERT transfer well to new
tasks and domains, where our models largely advance the state of the art on the
few-shot point cloud classification task. The code and pre-trained models are
available at https://github.com/lulutang0608/Point-BERT
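The abstract above outlines the Masked Point Modeling (MPM) recipe: group a cloud into local patches, map each patch to a discrete token with a pre-trained dVAE tokenizer, mask a subset of patches, and train the Transformer to predict the tokens at the masked positions. The following is a minimal sketch of that objective only; the patch grouping, dVAE tokenizer, and backbone are replaced with random stand-ins, and the mask ratio and vocabulary size are illustrative values rather than the paper's settings.

```python
# Minimal sketch of the Masked Point Modeling (MPM) objective.
# The dVAE tokenizer and Transformer backbone are stand-ins (random tensors);
# only the masking and the masked-token cross-entropy loss are illustrated.
import torch
import torch.nn as nn

def random_patch_mask(num_patches: int, mask_ratio: float = 0.4) -> torch.Tensor:
    """Boolean mask selecting roughly `mask_ratio` of the patches (illustrative ratio)."""
    num_masked = max(1, int(num_patches * mask_ratio))
    perm = torch.randperm(num_patches)
    mask = torch.zeros(num_patches, dtype=torch.bool)
    mask[perm[:num_masked]] = True
    return mask

num_patches, vocab_size = 64, 8192  # assumed sizes, not the paper's exact config

# Stand-in for the frozen dVAE tokenizer: one discrete token id per local patch.
target_tokens = torch.randint(0, vocab_size, (num_patches,))

# Stand-in for the Transformer backbone: per-patch logits over the token vocabulary.
backbone_logits = torch.randn(num_patches, vocab_size)

mask = random_patch_mask(num_patches)

# Pre-training objective: recover the tokenizer's tokens at the masked locations.
loss = nn.functional.cross_entropy(backbone_logits[mask], target_tokens[mask])
print(f"MPM loss over {int(mask.sum())} masked patches: {loss.item():.4f}")
```

In the actual model, the logits come from a Transformer fed the visible patch embeddings plus mask tokens, and the target ids come from the dVAE tokenizer trained beforehand on a reconstruction objective.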
Related papers
- Pre-training Point Cloud Compact Model with Partial-aware Reconstruction [51.403810709250024]
We present a pre-trained Point Cloud Compact Model with Partial-aware Reconstruction, named Point-CPR.
Our model exhibits strong performance across various tasks, especially surpassing the leading MPM-based model PointGPT-B with only 2% of its parameters.
arXiv Detail & Related papers (2024-07-12T15:18:14Z) - Adaptive Point Transformer [88.28498667506165]
Adaptive Point Cloud Transformer (AdaPT) is a standard PT model augmented by an adaptive token selection mechanism.
AdaPT dynamically reduces the number of tokens during inference, enabling efficient processing of large point clouds.
arXiv Detail & Related papers (2024-01-26T13:24:45Z) - PointGPT: Auto-regressively Generative Pre-training from Point Clouds [45.488532108226565]
We present PointGPT, a novel approach that extends the concept of GPT to point clouds.
Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models.
Our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models.
arXiv Detail & Related papers (2023-05-19T07:39:04Z) - PointPatchMix: Point Cloud Mixing with Patch Scoring [58.58535918705736]
We propose PointPatchMix, which mixes point clouds at the patch level and generates content-based targets for mixed point clouds.
Our approach preserves local features at the patch level, while the patch scoring module assigns targets based on the content-based significance score from a pre-trained teacher model.
With Point-MAE as our baseline, our model surpasses previous methods by a significant margin, achieving 86.3% accuracy on ScanObjectNN and 94.1% accuracy on ModelNet40.
arXiv Detail & Related papers (2023-03-12T14:49:42Z) - AdaPoinTr: Diverse Point Cloud Completion with Adaptive Geometry-Aware
Transformers [94.11915008006483]
We present a new method that reformulates point cloud completion as a set-to-set translation problem.
We design a new model, called PoinTr, which adopts a Transformer encoder-decoder architecture for point cloud completion.
Our method attains 6.53 CD on PCN, 0.81 CD on ShapeNet-55 and 0.392 MMD on real-world KITTI.
arXiv Detail & Related papers (2023-01-11T16:14:12Z) - Point-McBert: A Multi-choice Self-supervised Framework for Point Cloud
Pre-training [6.037383467521294]
We propose Point-McBert, a pre-training framework with eased and refined supervision signals.
Specifically, we ease the previous single-choice constraint on patches, and provide multi-choice token ids for each patch as supervision.
Our method achieves 94.1% accuracy on ModelNet40, 84.28% accuracy on the hardest setting of ScanObjectNN and new state-of-the-art performance on few-shot learning.
arXiv Detail & Related papers (2022-07-27T00:34:33Z) - POS-BERT: Point Cloud One-Stage BERT Pre-Training [34.30767607646814]
We propose POS-BERT, a one-stage BERT pre-training method for point clouds.
Unlike Point-BERT, which requires a separately trained and then frozen tokenizer, POS-BERT trains its tokenizer jointly in a single stage.
POS-BERT achieves state-of-the-art classification accuracy, exceeding Point-BERT by 3.5%.
arXiv Detail & Related papers (2022-04-03T04:49:39Z) - Masked Autoencoders for Point Cloud Self-supervised Learning [27.894216954216716]
We propose a neat scheme of masked autoencoders for point cloud self-supervised learning.
We divide the input point cloud into irregular point patches and randomly mask them at a high ratio.
A standard Transformer-based autoencoder with an asymmetric design and a mask-token shifting operation learns high-level latent features from the unmasked point patches.
arXiv Detail & Related papers (2022-03-13T09:23:39Z) - PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers [81.71904691925428]
We present a new method that reformulates point cloud completion as a set-to-set translation problem.
We also design a new model, called PoinTr, that adopts a transformer encoder-decoder architecture for point cloud completion.
Our method outperforms state-of-the-art methods by a large margin on both the new benchmarks and the existing ones.
arXiv Detail & Related papers (2021-08-19T17:58:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.