General Point Model with Autoencoding and Autoregressive
- URL: http://arxiv.org/abs/2310.16861v1
- Date: Wed, 25 Oct 2023 06:08:24 GMT
- Title: General Point Model with Autoencoding and Autoregressive
- Authors: Zhe Li, Zhangyang Gao, Cheng Tan, Stan Z. Li, Laurence T. Yang
- Abstract summary: We propose a General Point Model (GPM) that seamlessly integrates autoencoding and autoregressive tasks in a point cloud transformer.
This model is versatile, allowing fine-tuning for downstream point cloud representation tasks, as well as unconditional and conditional generation tasks.
- Score: 55.051626723729896
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The pre-training architectures of large language models encompass various
types, including autoencoding models, autoregressive models, and
encoder-decoder models. We posit that any modality can potentially benefit from
a large language model, as long as it undergoes vector quantization to become
discrete tokens. Inspired by GLM, we propose a General Point Model (GPM) which
seamlessly integrates autoencoding and autoregressive tasks in a point cloud
transformer. This model is versatile, allowing fine-tuning for downstream point
cloud representation tasks, as well as unconditional and conditional generation
tasks. GPM enhances masked prediction in autoencoding through various forms of
mask padding tasks, leading to improved performance in point cloud
understanding. Additionally, GPM demonstrates highly competitive results in
unconditional point cloud generation tasks, even exhibiting the potential for
conditional generation tasks by modifying the input's conditional information.
Compared to models like Point-BERT, MaskPoint and PointMAE, our GPM achieves
superior performance in point cloud understanding tasks. Furthermore, the
integration of autoregressive and autoencoding within the same transformer
underscores its versatility across different downstream tasks.
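Since the abstract only describes the unification at a high level, the following is a minimal sketch, assuming GLM-style blank infilling over a sequence of discrete point tokens produced by vector quantization. The special symbols, span length, and toy vocabulary are illustrative assumptions, not GPM's actual design.

```python
# Minimal sketch of GLM-style blank infilling over discrete point tokens.
# One masked span serves both views: the autoencoding view sees the corrupted
# context, and the autoregressive view regenerates the span token by token.
# Special symbols and span length are illustrative assumptions.
import random

MASK, SOS = "[MASK]", "[SOS]"

def glm_style_infilling(tokens, span_len=3, seed=0):
    """Blank one span; return (model input, teacher-forced targets)."""
    random.seed(seed)
    start = random.randrange(0, len(tokens) - span_len + 1)
    span = tokens[start:start + span_len]
    # Autoencoding view: context with the span collapsed into a single [MASK].
    context = tokens[:start] + [MASK] + tokens[start + span_len:]
    # Autoregressive view: predict the span left to right after [SOS].
    decoder_input = [SOS] + span[:-1]
    return context + decoder_input, span

# Toy token ids standing in for vector-quantized point-patch tokens.
point_tokens = [f"tok{i}" for i in range(10)]
model_input, targets = glm_style_infilling(point_tokens)
print(model_input)   # corrupted context + [SOS] + shifted span
print(targets)       # the tokens the model must recover
```

Because both objectives share one token sequence and one transformer, the same pre-trained weights can be fine-tuned for understanding tasks or sampled from for generation, which is the versatility the abstract refers to.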
Related papers
- Pre-training Point Cloud Compact Model with Partial-aware Reconstruction [51.403810709250024]
We present a pre-trained Point cloud Compact Model with Partial-aware Reconstruction, named Point-CPR.
Our model exhibits strong performance across various tasks, especially surpassing the leading MPM-based model PointGPT-B with only 2% of its parameters.
arXiv Detail & Related papers (2024-07-12T15:18:14Z) - Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning [116.75939193785143]
Contrastive learning (CL) for Vision Transformers (ViTs) in image domains has achieved performance comparable to CL for traditional convolutional backbones.
In 3D point cloud pretraining with ViTs, masked autoencoder (MAE) modeling remains dominant.
arXiv Detail & Related papers (2024-07-08T12:28:56Z) - M$^3$CS: Multi-Target Masked Point Modeling with Learnable Codebook and
Siamese Decoders [19.68592678093725]
Masked point modeling has become a promising scheme of self-supervised pre-training for point clouds.
M$^3$CS is proposed to equip the model with the above abilities.
arXiv Detail & Related papers (2023-09-23T02:19:21Z) - PointGPT: Auto-regressively Generative Pre-training from Point Clouds [45.488532108226565]
We present PointGPT, a novel approach that extends the concept of GPT to point clouds.
Specifically, a point cloud auto-regressive generation task is proposed to pre-train transformer models.
Our approach achieves classification accuracies of 94.9% on the ModelNet40 dataset and 93.4% on the ScanObjectNN dataset, outperforming all other transformer models.
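As a rough illustration of the auto-regressive generation task, here is a hedged sketch, assuming the point cloud has already been split into an ordered sequence of patches; the patch count, patch size, and ordering are placeholder assumptions, not PointGPT's released implementation.

```python
# Hedged sketch: auto-regressive pre-training as next-patch prediction.
# Assumes patches are already ordered; shapes and toy data are illustrative.
import numpy as np

def next_patch_pairs(patches):
    """patches: (N, K, 3) ordered point patches -> (inputs, targets)."""
    inputs = patches[:-1]    # patches 0 .. N-2 are visible at each step
    targets = patches[1:]    # patch i+1 is predicted from patches <= i
    return inputs, targets

def causal_mask(n):
    """Lower-triangular mask so step i only attends to earlier steps."""
    return np.tril(np.ones((n, n), dtype=bool))

rng = np.random.default_rng(0)
patches = rng.normal(size=(8, 32, 3))      # 8 toy patches of 32 points each
inputs, targets = next_patch_pairs(patches)
print(inputs.shape, targets.shape, causal_mask(len(inputs)).shape)
```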
arXiv Detail & Related papers (2023-05-19T07:39:04Z) - Masked Autoencoders for Point Cloud Self-supervised Learning [27.894216954216716]
We propose a neat scheme of masked autoencoders for point cloud self-supervised learning.
We divide the input point cloud into irregular point patches and randomly mask them at a high ratio.
A standard Transformer based autoencoder, with an asymmetric design and a shifting mask tokens operation, learns high-level latent features from unmasked point patches.
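The masking step this summary describes can be sketched as follows, under the assumption of uniform random patch masking at a high ratio; the ratio, patch shapes, and data are placeholders rather than the paper's code.

```python
# Hedged sketch of high-ratio random patch masking for a masked autoencoder.
# Only the visible patches go through the encoder (asymmetric design); the
# decoder later receives mask tokens for the hidden ones. Values are toy.
import numpy as np

def random_patch_mask(num_patches, mask_ratio=0.6, seed=0):
    """Boolean mask over patches: True = masked (hidden from the encoder)."""
    rng = np.random.default_rng(seed)
    num_masked = int(round(num_patches * mask_ratio))
    mask = np.zeros(num_patches, dtype=bool)
    mask[rng.choice(num_patches, size=num_masked, replace=False)] = True
    return mask

patches = np.random.default_rng(1).normal(size=(64, 32, 3))  # 64 toy patches
mask = random_patch_mask(len(patches))
visible = patches[~mask]          # encoder input
print(visible.shape, int(mask.sum()), "patches masked")
```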
arXiv Detail & Related papers (2022-03-13T09:23:39Z) - Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point
Modeling [104.82953953453503]
We present Point-BERT, a new paradigm for learning point cloud Transformers that generalizes the concept of BERT to 3D point clouds.
Experiments demonstrate that the proposed BERT-style pre-training strategy significantly improves the performance of standard point cloud Transformers.
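To make the BERT-style objective concrete, here is a small hedged sketch that assumes a point tokenizer (e.g. a dVAE) has already mapped each patch to a discrete token id; the mask ratio and token ids are illustrative assumptions.

```python
# Hedged sketch of masked point modeling over pre-tokenized patches.
# The model is trained to recover the original token id at each masked
# position; token ids and the mask ratio here are toy values.
import numpy as np

def mask_token_ids(token_ids, mask_ratio=0.3, mask_id=-1, seed=0):
    """Replace a fraction of ids with mask_id; return corrupted ids, positions, targets."""
    rng = np.random.default_rng(seed)
    ids = np.asarray(token_ids).copy()
    positions = rng.choice(len(ids), size=int(len(ids) * mask_ratio), replace=False)
    targets = ids[positions].copy()
    ids[positions] = mask_id
    return ids, positions, targets

tokens = np.arange(10)                      # toy ids from a point tokenizer
corrupted, positions, targets = mask_token_ids(tokens)
print(corrupted, "recover", targets, "at positions", positions)
```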
arXiv Detail & Related papers (2021-11-29T18:59:03Z) - Deep Autoencoding Topic Model with Scalable Hybrid Bayesian Inference [55.35176938713946]
We develop a deep autoencoding topic model (DATM) that uses a hierarchy of gamma distributions to construct its multi-stochastic-layer generative network.
We propose a Weibull upward-downward variational encoder that deterministically propagates information upward via a deep neural network, followed by a downward generative model.
The efficacy and scalability of our models are demonstrated on both unsupervised and supervised learning tasks on big corpora.
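The upward-downward encoder relies on a reparameterizable distribution; the sketch below shows the standard Weibull reparameterization such encoders typically use, with illustrative parameter values (this is a generic identity, not the authors' code).

```python
# Hedged sketch: Weibull reparameterization, which lets gradients flow through
# the stochastic layers of an upward-downward encoder. Parameter values are toy.
import numpy as np

def weibull_rsample(k, lam, rng):
    """x = lam * (-ln(1 - u))**(1/k) with u ~ Uniform(0, 1) is Weibull(k, lam)."""
    u = rng.uniform(size=np.shape(k))
    return lam * (-np.log1p(-u)) ** (1.0 / k)

rng = np.random.default_rng(0)
samples = weibull_rsample(k=np.full(5, 2.0), lam=np.full(5, 1.5), rng=rng)
print(samples)
```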
arXiv Detail & Related papers (2020-06-15T22:22:56Z)