Self-positioning Point-based Transformer for Point Cloud Understanding
- URL: http://arxiv.org/abs/2303.16450v1
- Date: Wed, 29 Mar 2023 04:27:11 GMT
- Title: Self-positioning Point-based Transformer for Point Cloud Understanding
- Authors: Jinyoung Park, Sanghyeok Lee, Sihyeon Kim, Yunyang Xiong, Hyunwoo J. Kim
- Abstract summary: Self-Positioning point-based Transformer (SPoTr) is designed to capture both local and global shape contexts with reduced complexity.
SPoTr achieves an accuracy gain of 2.6% over the previous best models on shape classification with ScanObjectNN.
- Score: 18.394318824968263
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformers have shown superior performance on various computer vision tasks
with their capabilities to capture long-range dependencies. Despite the
success, it is challenging to directly apply Transformers on point clouds due
to their quadratic cost in the number of points. In this paper, we present a
Self-Positioning point-based Transformer (SPoTr), which is designed to capture
both local and global shape contexts with reduced complexity. Specifically,
this architecture consists of local self-attention and self-positioning
point-based global cross-attention. The self-positioning points, adaptively
located based on the input shape, consider both spatial and semantic
information with disentangled attention to improve expressive power. With the
self-positioning points, we propose a novel global cross-attention mechanism
for point clouds, which improves the scalability of global self-attention by
allowing the attention module to compute attention weights with only a small
set of self-positioning points. Experiments show the effectiveness of SPoTr on
three point cloud tasks, namely shape classification, part segmentation, and
scene segmentation. In particular, our proposed model achieves an accuracy gain
of 2.6% over the previous best models on shape classification with
ScanObjectNN. We also provide qualitative analyses to demonstrate the
interpretability of self-positioning points. The code of SPoTr is available at
https://github.com/mlvlab/SPoTr.
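The abstract's core idea — computing attention between all N input points and only a small set of M self-positioning points, reducing the cost from O(N²) to O(N·M) — can be illustrated with a minimal sketch. This is not the paper's implementation (SPoTr additionally uses disentangled spatial/semantic attention and learns the projections and point positions end to end); the projection weights and function names below are hypothetical, for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def global_cross_attention(points, sp_points, d=32, seed=0):
    """Sketch of cross-attention between N input point features and
    M << N self-positioning point features: cost O(N*M), not O(N^2).

    points:    (N, C) per-point features
    sp_points: (M, C) self-positioning point features
    """
    rng = np.random.default_rng(seed)
    N, C = points.shape
    # Hypothetical fixed random projections; the real model learns these.
    Wq, Wk, Wv = (rng.standard_normal((C, d)) / np.sqrt(C) for _ in range(3))
    Q = points @ Wq            # (N, d) queries from all input points
    K = sp_points @ Wk         # (M, d) keys from self-positioning points
    V = sp_points @ Wv         # (M, d) values from self-positioning points
    attn = softmax(Q @ K.T / np.sqrt(d))  # (N, M) attention weights
    return attn @ V            # (N, d) global context for every point
```

With N = 1024 points and M = 32 self-positioning points, the attention matrix has 1024 × 32 entries rather than 1024 × 1024, which is the scalability gain the abstract describes.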
Related papers
- Point Tree Transformer for Point Cloud Registration [33.00645881490638]
Point cloud registration is a fundamental task in the fields of computer vision and robotics.
We propose a novel transformer-based approach for point cloud registration that efficiently extracts comprehensive local and global features.
Our method achieves superior performance over the state-of-the-art methods.
arXiv Detail & Related papers (2024-06-25T13:14:26Z)
- Collect-and-Distribute Transformer for 3D Point Cloud Analysis [82.03517861433849]
We propose a new transformer network equipped with a collect-and-distribute mechanism to communicate short- and long-range contexts of point clouds.
Results show the effectiveness of the proposed CDFormer, delivering several new state-of-the-art performances on point cloud classification and segmentation tasks.
arXiv Detail & Related papers (2023-06-02T03:48:45Z)
- PointPatchMix: Point Cloud Mixing with Patch Scoring [58.58535918705736]
We propose PointPatchMix, which mixes point clouds at the patch level and generates content-based targets for mixed point clouds.
Our approach preserves local features at the patch level, while the patch scoring module assigns targets based on the content-based significance score from a pre-trained teacher model.
With Point-MAE as our baseline, our model surpasses previous methods by a significant margin, achieving 86.3% accuracy on ScanObjectNN and 94.1% accuracy on ModelNet40.
arXiv Detail & Related papers (2023-03-12T14:49:42Z)
- Point Cloud Recognition with Position-to-Structure Attention Transformers [24.74805434602145]
Position-to-Structure Attention Transformers (PS-Former) is a Transformer-based algorithm for 3D point cloud recognition.
PS-Former deals with the challenge in 3D point cloud representation where points are not positioned in a fixed grid structure.
PS-Former demonstrates competitive experimental results on three 3D point cloud tasks including classification, part segmentation, and scene segmentation.
arXiv Detail & Related papers (2022-10-05T05:40:33Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
We adopt transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
- Point Set Self-Embedding [63.23565826873297]
This work presents an innovative method for point set self-embedding, which encodes structural information of a dense point set into its sparser version in a visual but imperceptible form.
The self-embedded point set can function as the ordinary downsampled one and be visualized efficiently on mobile devices.
We can leverage the self-embedded information to fully restore the original point set for detailed analysis on remote servers.
arXiv Detail & Related papers (2022-02-28T07:03:33Z)
- PU-Transformer: Point Cloud Upsampling Transformer [38.05362492645094]
We focus on the point cloud upsampling task, which aims to generate dense high-fidelity point clouds from sparse input data.
Specifically, to activate the transformer's strong capability in representing features, we develop a new variant of a multi-head self-attention structure.
We demonstrate the outstanding performance of our approach by comparing with the state-of-the-art CNN-based methods on different benchmarks.
arXiv Detail & Related papers (2021-11-24T03:25:35Z)
- CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional point Transformer - a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutional Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z)
- 3D Object Detection with Pointformer [29.935891419574602]
We propose Pointformer, a Transformer backbone designed for 3D point clouds to learn features effectively.
A Local Transformer module is employed to model interactions among points in a local region, which learns context-dependent region features at an object level.
A Global Transformer is designed to learn context-aware representations at the scene level.
arXiv Detail & Related papers (2020-12-21T15:12:54Z)
- Point Transformer [122.2917213154675]
We investigate the application of self-attention networks to 3D point cloud processing.
We design self-attention layers for point clouds and use these to construct self-attention networks for tasks such as semantic scene segmentation.
Our Point Transformer design improves upon prior work across domains and tasks.
arXiv Detail & Related papers (2020-12-16T18:58:56Z)
- SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization [52.20602782690776]
It is expensive and tedious to obtain large-scale paired sparse-dense point sets for training from real scanned sparse data.
We propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface.
We conduct various experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve comparable performance to the state-of-the-art supervised methods.
arXiv Detail & Related papers (2020-12-08T14:14:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.