Point Cloud Recognition with Position-to-Structure Attention Transformers
- URL: http://arxiv.org/abs/2210.02030v1
- Date: Wed, 5 Oct 2022 05:40:33 GMT
- Title: Point Cloud Recognition with Position-to-Structure Attention Transformers
- Authors: Zheng Ding, James Hou, Zhuowen Tu
- Abstract summary: Position-to-Structure Attention Transformers (PS-Former) is a Transformer-based algorithm for 3D point cloud recognition.
PS-Former deals with the challenge in 3D point cloud representation where points are not positioned in a fixed grid structure.
PS-Former demonstrates competitive experimental results on three 3D point cloud tasks including classification, part segmentation, and scene segmentation.
- Score: 24.74805434602145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present Position-to-Structure Attention Transformers
(PS-Former), a Transformer-based algorithm for 3D point cloud recognition.
PS-Former deals with the challenge in 3D point cloud representation where
points are not positioned in a fixed grid structure and have limited feature
description (only 3D coordinates ($x, y, z$) for scattered points). Existing
Transformer-based architectures in this domain often require a pre-specified
feature engineering step to extract point features. Here, we introduce two new
aspects in PS-Former: 1) a learnable condensation layer that performs point
downsampling and feature extraction; and 2) a Position-to-Structure Attention
mechanism that recursively enriches the structural information with the
position attention branch. Compared with the competing methods, while being
generic with fewer heuristic feature designs, PS-Former demonstrates
competitive experimental results on three 3D point cloud tasks including
classification, part segmentation, and scene segmentation.
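The abstract names the two new components but gives no equations here, so below is a minimal, hedged PyTorch sketch of one plausible reading of the Position-to-Structure Attention mechanism: a position branch self-attends over embedded (x, y, z) coordinates, and its output enriches the structure-feature branch through cross-attention. All class and variable names are illustrative assumptions rather than the authors' implementation, and the learnable condensation (downsampling) layer is omitted.

```python
# Hedged sketch of a Position-to-Structure Attention block, based only on the
# abstract: a position branch computes self-attention from point coordinates,
# and its output is reused to enrich a structure-feature branch.
# All module/variable names are illustrative, not the authors' code.
import torch
import torch.nn as nn


class PositionToStructureAttention(nn.Module):
    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.pos_embed = nn.Linear(3, dim)          # embed raw (x, y, z)
        self.pos_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.struct_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, xyz: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # xyz:   (B, N, 3) point coordinates
        # feats: (B, N, C) structure features from the previous stage
        pos = self.pos_embed(xyz)                   # (B, N, C)
        # Position branch: self-attention purely over positional embeddings.
        pos_out, _ = self.pos_attn(pos, pos, pos)
        # Structure branch: structure features query the position branch, so
        # positional relations enrich structural information (one plausible
        # reading of "position-to-structure" attention).
        struct_out, _ = self.struct_attn(feats, pos_out, pos_out)
        return self.norm(feats + struct_out)


if __name__ == "__main__":
    xyz = torch.rand(2, 1024, 3)                    # toy batch of 1024 points
    feats = torch.rand(2, 1024, 64)
    block = PositionToStructureAttention(dim=64)
    print(block(xyz, feats).shape)                  # torch.Size([2, 1024, 64])
```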
Related papers
- ConDaFormer: Disassembled Transformer with Local Structure Enhancement for 3D Point Cloud Understanding [105.98609765389895]
Transformers have recently been explored for 3D point cloud understanding.
A large number of points, over 0.1 million, make global self-attention infeasible for point cloud data (a rough memory estimate is sketched after this entry).
In this paper, we develop a new transformer block, named ConDaFormer.
arXiv Detail & Related papers (2023-12-18T11:19:45Z)
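As a back-of-the-envelope illustration of the infeasibility claim above (not taken from the ConDaFormer paper): at roughly 0.1 million points, a single dense attention map already costs tens of gigabytes.

```python
# Rough estimate of dense global self-attention memory at point-cloud scale.
# The N x N attention matrix alone grows quadratically with the point count.
N = 100_000                      # ~0.1 million points, as cited above
bytes_per_float = 4              # fp32
attn_matrix_bytes = N * N * bytes_per_float
print(f"{attn_matrix_bytes / 1e9:.0f} GB per attention map")  # -> 40 GB
```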
- Self-positioning Point-based Transformer for Point Cloud Understanding [18.394318824968263]
Self-Positioning point-based Transformer (SPoTr) is designed to capture both local and global shape contexts with reduced complexity.
SPoTr achieves an accuracy gain of 2.6% over the previous best models on shape classification with ScanObjectNN.
arXiv Detail & Related papers (2023-03-29T04:27:11Z)
- Position-Guided Point Cloud Panoptic Segmentation Transformer [118.17651196656178]
This work begins by applying this appealing paradigm to LiDAR-based point cloud segmentation and obtains a simple yet effective baseline.
We observe that instances in sparse point clouds are relatively small compared to the whole scene and often have similar geometry but lack distinctive appearance for segmentation, challenges that are rare in the image domain.
The method, named Position-guided Point cloud Panoptic segmentation transFormer (P3Former), outperforms previous state-of-the-art methods by 3.4% and 1.2% on the SemanticKITTI and nuScenes benchmarks, respectively.
arXiv Detail & Related papers (2023-03-23T17:59:02Z)
- Flattening-Net: Deep Regular 2D Representation for 3D Point Cloud Analysis [66.49788145564004]
We present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology.
Our methods perform favorably against the current state-of-the-art competitors.
arXiv Detail & Related papers (2022-12-17T15:05:25Z)
- Learning a Task-specific Descriptor for Robust Matching of 3D Point Clouds [40.81429160296275]
We learn a robust task-specific feature descriptor to consistently describe the correct point correspondence under interference.
Our method, EDFNet, is developed from two aspects. First, we augment the matchability of correspondences by utilizing their repetitive local structure.
arXiv Detail & Related papers (2022-10-26T17:57:23Z)
- CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point Cloud Learning [81.85951026033787]
In this work, we adopt transformers and incorporate them into a hierarchical framework for shape classification as well as part and scene segmentation.
We also compute efficient and dynamic global cross-attention by leveraging sampling and grouping at each iteration (a generic sketch of this sampling-and-grouping pattern follows after this entry).
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z)
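The CloudAttention summary above mentions sampling and grouping at each iteration; the sketch below shows the generic farthest-point-sampling plus k-nearest-neighbor grouping pattern used by many hierarchical point-cloud networks (PointNet++-style). It is an illustrative assumption in PyTorch, not CloudAttention's actual code.

```python
# Generic sampling-and-grouping sketch, illustrating the kind of hierarchical
# downsampling the summary refers to; NOT the CloudAttention implementation.
import torch


def farthest_point_sampling(xyz: torch.Tensor, m: int) -> torch.Tensor:
    # xyz: (N, 3) -> indices of m points that are mutually far apart
    n = xyz.shape[0]
    idx = torch.zeros(m, dtype=torch.long)
    dist = torch.full((n,), float("inf"))
    farthest = torch.randint(n, (1,)).item()
    for i in range(m):
        idx[i] = farthest
        d = ((xyz - xyz[farthest]) ** 2).sum(dim=1)
        dist = torch.minimum(dist, d)            # distance to nearest center
        farthest = int(dist.argmax())            # next center: farthest point
    return idx


def knn_group(xyz: torch.Tensor, centers: torch.Tensor, k: int) -> torch.Tensor:
    # For each sampled center, gather its k nearest neighbors -> (m, k, 3)
    d = torch.cdist(centers, xyz)                # (m, N) pairwise distances
    knn_idx = d.topk(k, largest=False).indices
    return xyz[knn_idx]


if __name__ == "__main__":
    pts = torch.rand(4096, 3)
    centers_idx = farthest_point_sampling(pts, m=512)
    groups = knn_group(pts, pts[centers_idx], k=16)
    print(groups.shape)                          # torch.Size([512, 16, 3])
```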
- SeedFormer: Patch Seeds based Point Cloud Completion with Upsample Transformer [46.800630776714016]
We propose a novel network, SeedFormer, to improve detail preservation and recovery in point cloud completion.
We introduce a new shape representation, namely Patch Seeds, which not only captures general structures from partial inputs but also preserves regional information of local patterns.
Our method outperforms state-of-the-art completion networks on several benchmark datasets.
arXiv Detail & Related papers (2022-07-21T06:15:59Z)
- SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation [94.11915008006483]
We propose SemAffiNet for point cloud semantic segmentation.
We conduct extensive experiments on the ScanNetV2 and NYUv2 datasets.
arXiv Detail & Related papers (2022-05-26T17:00:23Z)
- Learning Local Displacements for Point Cloud Completion [93.54286830844134]
We propose a novel approach aimed at object and semantic scene completion from a partial scan represented as a 3D point cloud.
Our architecture relies on three novel layers that are used successively within an encoder-decoder structure.
We evaluate both architectures on object and indoor scene completion tasks, achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-03-30T18:31:37Z)
- CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional point Transformer - a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutional Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z)