POS-BERT: Point Cloud One-Stage BERT Pre-Training
- URL: http://arxiv.org/abs/2204.00989v1
- Date: Sun, 3 Apr 2022 04:49:39 GMT
- Title: POS-BERT: Point Cloud One-Stage BERT Pre-Training
- Authors: Kexue Fu, Peng Gao, ShaoLei Liu, Renrui Zhang, Yu Qiao, Manning Wang
- Abstract summary: We propose POS-BERT, a one-stage BERT pre-training method for point clouds.
Unlike Point-BERT, whose tokenizer is separately trained and then frozen, POS-BERT uses a dynamically updated momentum encoder as the tokenizer.
POS-BERT achieves state-of-the-art classification accuracy, exceeding Point-BERT by 3.5%.
- Score: 34.30767607646814
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the pre-training paradigm combining Transformer and masked language
modeling has achieved tremendous success in NLP, images, and point clouds, such
as BERT. However, directly extending BERT from NLP to point clouds requires
training a fixed discrete Variational AutoEncoder (dVAE) before pre-training,
which results in a complex two-stage method called Point-BERT. Inspired by BERT
and MoCo, we propose POS-BERT, a one-stage BERT pre-training method for point
clouds. Specifically, we use the mask patch modeling (MPM) task to perform
point cloud pre-training, which aims to recover the information of masked
patches under the supervision of the corresponding tokenizer output. Unlike
Point-BERT, whose tokenizer is trained separately and then frozen, we propose
to use a dynamically updated momentum encoder as the tokenizer, which is
updated along with the training process and provides a dynamic supervision
signal. Further, in order to learn high-level semantic representations, we
combine contrastive learning to maximize the consistency of class tokens
between differently transformed point clouds. Extensive experiments
demonstrate that POS-BERT can extract high-quality pre-training features and
improve performance on downstream tasks. Using the pre-trained model without
any fine-tuning to extract features and training a linear SVM on ModelNet40,
POS-BERT achieves state-of-the-art classification accuracy, exceeding
Point-BERT by 3.5\%. In addition, our approach significantly improves many
downstream tasks, such as fine-tuned classification, few-shot classification,
and part segmentation. The code and trained models will be available at:
\url{https://github.com/fukexue/POS-BERT}.
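
To make the training recipe above concrete, here is a minimal PyTorch-style sketch (not the authors' released code) of one POS-BERT-like pre-training step: a momentum "teacher" encoder acts as the tokenizer that supplies dynamic targets for mask patch modeling, and an InfoNCE-style term maximizes class-token consistency between two transformed views. The encoder interfaces, loss weight `lam`, temperature `tau`, and EMA rate `m` are illustrative assumptions, not values from the paper.

```python
# Minimal sketch of a one-stage, POS-BERT-style pre-training step. It assumes
# the student and teacher encoders both map a (B, N, C) batch of patch
# embeddings to a class token (B, D) and patch tokens (B, N, D). All names and
# hyperparameters are assumptions for illustration, not the paper's settings.
import torch
import torch.nn.functional as F


@torch.no_grad()
def momentum_update(student: torch.nn.Module, teacher: torch.nn.Module, m: float = 0.999):
    """EMA update: the teacher (tokenizer) slowly tracks the student encoder."""
    for ps, pt in zip(student.parameters(), teacher.parameters()):
        pt.data.mul_(m).add_(ps.data, alpha=1.0 - m)


def pos_bert_step(student, teacher, view1, view2, mask, lam=1.0, tau=0.1):
    """One pre-training step on two augmented views of the same point clouds.

    view1, view2: (B, N, C) patch embeddings of two transformations.
    mask:         (B, N) boolean tensor marking patches hidden from the student.
    """
    # The momentum tokenizer sees the full view and produces the dynamic
    # supervision signal for the masked patches.
    with torch.no_grad():
        _, target_tokens = teacher(view1)                    # (B, N, D)

    # The student sees the masked view and tries to recover the masked patches.
    cls1, pred_tokens = student(view1 * (~mask).unsqueeze(-1).float())
    mpm_loss = F.smooth_l1_loss(pred_tokens[mask], target_tokens[mask])

    # Contrastive class-token consistency between the two transformed views
    # (InfoNCE with in-batch negatives).
    cls2, _ = student(view2)
    z1, z2 = F.normalize(cls1, dim=-1), F.normalize(cls2, dim=-1)
    logits = z1 @ z2.t() / tau                               # (B, B)
    labels = torch.arange(z1.size(0), device=z1.device)
    contrast_loss = F.cross_entropy(logits, labels)

    loss = mpm_loss + lam * contrast_loss
    loss.backward()                      # optimizer.step()/zero_grad() omitted
    momentum_update(student, teacher)    # tokenizer is updated on the fly
    return loss.detach()
```

In such a setup the teacher would typically be initialized as a copy of the student with gradients disabled (e.g. via `copy.deepcopy(student)`), so the whole method stays one-stage: no separate dVAE tokenizer has to be trained and frozen beforehand.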
Related papers
- Unsupervised Pre-training with Language-Vision Prompts for Low-Data Instance Segmentation [105.23631749213729]
We propose a novel method for unsupervised pre-training in low-data regimes.
Inspired by the recently successful prompting technique, we introduce a new method, Unsupervised Pre-training with Language-Vision Prompts.
We show that our method can converge faster and perform better than CNN-based models in low-data regimes.
arXiv Detail & Related papers (2024-05-22T06:48:43Z)
- Point Cloud Pre-training with Diffusion Models [62.12279263217138]
We propose a novel pre-training method called Point cloud Diffusion pre-training (PointDif).
PointDif achieves substantial improvement across various real-world datasets for diverse downstream tasks such as classification, segmentation and detection.
arXiv Detail & Related papers (2023-11-25T08:10:05Z)
- PointGPT: Auto-regressively Generative Pre-training from Point Clouds [45.488532108226565]
We present PointGPT, a novel approach that extends the concept of GPT to point clouds.
Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models.
Our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models.
arXiv Detail & Related papers (2023-05-19T07:39:04Z)
- Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models [64.49254199311137]
We propose a novel Instance-aware Dynamic Prompt Tuning (IDPT) strategy for pre-trained point cloud models.
The essence of IDPT is to develop a dynamic prompt generation module to perceive semantic prior features of each point cloud instance.
In experiments, IDPT outperforms full fine-tuning in most tasks with a mere 7% of the trainable parameters.
arXiv Detail & Related papers (2023-04-14T16:03:09Z)
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
- EPCL: Frozen CLIP Transformer is An Efficient Point Cloud Encoder [60.52613206271329]
This paper introduces Efficient Point Cloud Learning (EPCL) for training high-quality point cloud models with a frozen CLIP transformer.
Our EPCL connects the 2D and 3D modalities by semantically aligning the image features and point cloud features without paired 2D-3D data.
arXiv Detail & Related papers (2022-12-08T06:27:11Z)
- Point-McBert: A Multi-choice Self-supervised Framework for Point Cloud Pre-training [6.037383467521294]
We propose Point-McBert, a pre-training framework with eased and refined supervision signals.
Specifically, we ease the previous single-choice constraint on patches, and provide multi-choice token ids for each patch as supervision.
Our method achieves 94.1% accuracy on ModelNet40, 84.28% accuracy on the hardest setting of ScanObjectNN and new state-of-the-art performance on few-shot learning.
arXiv Detail & Related papers (2022-07-27T00:34:33Z)
- Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling [104.82953953453503]
We present Point-BERT, a new paradigm for learning Transformers to generalize the concept of BERT to 3D point clouds.
Experiments demonstrate that the proposed BERT-style pre-training strategy significantly improves the performance of standard point cloud Transformers.
arXiv Detail & Related papers (2021-11-29T18:59:03Z)