Self-positioning Point-based Transformer for Point Cloud Understanding
- URL: http://arxiv.org/abs/2303.16450v1
- Date: Wed, 29 Mar 2023 04:27:11 GMT
- Title: Self-positioning Point-based Transformer for Point Cloud Understanding
- Authors: Jinyoung Park, Sanghyeok Lee, Sihyeon Kim, Yunyang Xiong, Hyunwoo J. Kim
- Abstract summary: Self-Positioning point-based Transformer (SPoTr) is designed to capture both local and global shape contexts with reduced complexity.
SPoTr achieves an accuracy gain of 2.6% over the previous best models on shape classification with ScanObjectNN.
- Score: 18.394318824968263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformers have shown superior performance on various computer vision tasks
with their capabilities to capture long-range dependencies. Despite the
success, it is challenging to directly apply Transformers on point clouds due
to their quadratic cost in the number of points. In this paper, we present a
Self-Positioning point-based Transformer (SPoTr), which is designed to capture
both local and global shape contexts with reduced complexity. Specifically,
this architecture consists of local self-attention and self-positioning
point-based global cross-attention. The self-positioning points, adaptively
located based on the input shape, consider both spatial and semantic
information with disentangled attention to improve expressive power. With the
self-positioning points, we propose a novel global cross-attention mechanism
for point clouds, which improves the scalability of global self-attention by
allowing the attention module to compute attention weights with only a small
set of self-positioning points. Experiments show the effectiveness of SPoTr on
three point cloud tasks, namely shape classification, part segmentation, and
scene segmentation. In particular, our proposed model achieves an accuracy gain
of 2.6% over the previous best models on shape classification with
ScanObjectNN. We also provide qualitative analyses to demonstrate the
interpretability of self-positioning points. The code of SPoTr is available at
https://github.com/mlvlab/SPoTr.
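The abstract's core idea — computing attention between all N input points and only a small set of M self-positioning points, reducing the cost from O(N²) to O(N·M) — can be illustrated with a minimal sketch. This is not the paper's implementation (SPoTr additionally uses disentangled spatial/semantic attention and learns the projections and point positions end to end); the projection weights and function names below are hypothetical, for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_cross_attention(points, sp_points, d=32, seed=0):
    """Sketch of cross-attention between N input point features and
    M << N self-positioning point features: cost O(N*M), not O(N^2).

    points:    (N, C) per-point features
    sp_points: (M, C) self-positioning point features
    """
    rng = np.random.default_rng(seed)
    N, C = points.shape
    # Hypothetical fixed random projections; the real model learns these.
    Wq, Wk, Wv = (rng.standard_normal((C, d)) / np.sqrt(C) for _ in range(3))
    Q = points @ Wq            # (N, d) queries from all input points
    K = sp_points @ Wk         # (M, d) keys from self-positioning points
    V = sp_points @ Wv         # (M, d) values from self-positioning points
    attn = softmax(Q @ K.T / np.sqrt(d))  # (N, M) attention weights
    return attn @ V            # (N, d) global context for every point
```

With N = 1024 points and M = 32 self-positioning points, the attention matrix has 1024 × 32 entries rather than 1024 × 1024, which is the scalability gain the abstract describes.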
Related papers
- Point Tree Transformer for Point Cloud Registration [33.00645881490638]
Point cloud registration is a fundamental task in the fields of computer vision and robotics.
We propose a novel transformer-based approach for point cloud registration that efficiently extracts comprehensive local and global features.
Our method achieves superior performance over the state-of-the-art methods.
arXiv Detail & Related papers (2024-06-25T13:14:26Z)
- Collect-and-Distribute Transformer for 3D Point Cloud Analysis [82.03517861433849]
We propose a new transformer network equipped with a collect-and-distribute mechanism to communicate short- and long-range contexts of point clouds.
Results show the effectiveness of the proposed CDFormer, delivering several new state-of-the-art performances on point cloud classification and segmentation tasks.
arXiv Detail & Related papers (2023-06-02T03:48:45Z)
- PointPatchMix: Point Cloud Mixing with Patch Scoring [58.58535918705736]
We propose PointPatchMix, which mixes point clouds at the patch level and generates content-based targets for mixed point clouds.
Our approach preserves local features at the patch level, while the patch scoring module assigns targets based on the content-based significance score from a pre-trained teacher model.
With Point-MAE as our baseline, our model surpasses previous methods by a significant margin, achieving 86.3% accuracy on ScanObjectNN and 94.1% accuracy on ModelNet40.
arXiv Detail & Related papers (2023-03-12T14:49:42Z)
- Point Cloud Recognition with Position-to-Structure Attention Transformers [24.74805434602145]
Position-to-Structure Attention Transformers (PS-Former) is a Transformer-based algorithm for 3D point cloud recognition.
PS-Former deals with the challenge in 3D point cloud representation where points are not positioned in a fixed grid structure.
PS-Former demonstrates competitive experimental results on three 3D point cloud tasks including classification, part segmentation, and scene segmentation.
arXiv Detail & Related papers (2022-10-05T05:40:33Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We adopt transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- Point Set Self-Embedding [63.23565826873297]
This work presents an innovative method for point set self-embedding, which encodes structural information of a dense point set into its sparser version in a visual but imperceptible form.
The self-embedded point set can function as the ordinary downsampled one and be visualized efficiently on mobile devices.
We can leverage the self-embedded information to fully restore the original point set for detailed analysis on remote servers.
arXiv Detail & Related papers (2022-02-28T07:03:33Z)
- PU-Transformer: Point Cloud Upsampling Transformer [38.05362492645094]
We focus on the point cloud upsampling task, which aims to generate dense high-fidelity point clouds from sparse input data.
Specifically, to activate the transformer's strong capability in representing features, we develop a new variant of a multi-head self-attention structure.
We demonstrate the outstanding performance of our approach by comparing with the state-of-the-art CNN-based methods on different benchmarks.
arXiv Detail & Related papers (2021-11-24T03:25:35Z)
- CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional point Transformer - a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutional Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z)
- 3D Object Detection with Pointformer [29.935891419574602]
We propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively.
A Local Transformer module is employed to model interactions among points in a local region, which learns context-dependent region features at an object level.
A Global Transformer is designed to learn context-aware representations at the scene level.
arXiv Detail & Related papers (2020-12-21T15:12:54Z)
- Point Transformer [122.2917213154675]
We investigate the application of self-attention networks to 3D point cloud processing.
We design self-attention layers for point clouds and use these to construct self-attention networks for tasks such as semantic scene segmentation.
Our Point Transformer design improves upon prior work across domains and tasks.
arXiv Detail & Related papers (2020-12-16T18:58:56Z)
- SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization [52.20602782690776]
It is expensive and tedious to obtain large-scale paired sparse-dense point sets for training from real scanned sparse data.
We propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface.
We conduct various experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve comparable performance to the state-of-the-art supervised methods.
arXiv Detail & Related papers (2020-12-08T14:14:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.