Point Tree Transformer for Point Cloud Registration
- URL: http://arxiv.org/abs/2406.17530v1
- Date: Tue, 25 Jun 2024 13:14:26 GMT
- Title: Point Tree Transformer for Point Cloud Registration
- Authors: Meiling Wang, Guangyan Chen, Yi Yang, Li Yuan, Yufeng Yue,
- Abstract summary: Point cloud registration is a fundamental task in the fields of computer vision and robotics.
We propose a novel transformer-based approach for point cloud registration that efficiently extracts comprehensive local and global features.
Our method achieves superior performance over the state-of-the-art methods.
- Score: 33.00645881490638
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Point cloud registration is a fundamental task in the fields of computer vision and robotics. Recent developments in transformer-based methods have demonstrated enhanced performance in this domain. However, the standard attention mechanism utilized in these methods often integrates many low-relevance points, thereby struggling to prioritize its attention weights on sparse yet meaningful points. This inefficiency leads to limited local structure modeling capabilities and quadratic computational complexity. To overcome these limitations, we propose the Point Tree Transformer (PTT), a novel transformer-based approach for point cloud registration that efficiently extracts comprehensive local and global features while maintaining linear computational complexity. The PTT constructs hierarchical feature trees from point clouds in a coarse-to-dense manner, and introduces a novel Point Tree Attention (PTA) mechanism, which follows the tree structure to facilitate the progressive convergence of attended regions towards salient points. Specifically, each tree layer selectively identifies a subset of key points with the highest attention scores. Subsequent layers focus attention on areas of significant relevance, derived from the child points of the selected point set. The feature extraction process additionally incorporates coarse point features that capture high-level semantic information, thus facilitating local structure modeling and the progressive integration of multiscale information. Consequently, PTA empowers the model to concentrate on crucial local structures and derive detailed local information while maintaining linear computational complexity. Extensive experiments conducted on the 3DMatch, ModelNet40, and KITTI datasets demonstrate that our method achieves superior performance over the state-of-the-art methods.
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called Point-Voxel Attention Fusion Network (PVAFN)
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z) - pCTFusion: Point Convolution-Transformer Fusion with Semantic Aware Loss
for Outdoor LiDAR Point Cloud Segmentation [8.24822602555667]
This study proposes a new architecture, pCTFusion, which combines kernel-based convolutions and self-attention mechanisms.
The proposed architecture employs two types of self-attention mechanisms, local and global, based on the hierarchical positions of the encoder blocks.
The results are particularly encouraging for minor classes, often misclassified due to class imbalance, lack of space, and neighbor-aware feature encoding.
arXiv Detail & Related papers (2023-07-27T11:12:48Z) - Self-positioning Point-based Transformer for Point Cloud Understanding [18.394318824968263]
Self-Positioning point-based Transformer (SPoTr) is designed to capture both local and global shape contexts with reduced complexity.
SPoTr achieves an accuracy gain of 2.6% over the previous best models on shape classification with ScanObjectNN.
arXiv Detail & Related papers (2023-03-29T04:27:11Z) - BoundED: Neural Boundary and Edge Detection in 3D Point Clouds via Local
Neighborhood Statistics [0.0]
We propose to utilize a novel set of features describing the local neighborhood on a per-point basis via first and second order statistics as input for a simple and compact classification network.
Leveraging this feature embedding enables our algorithm to outperform the state-of-the-art techniques in terms of quality and processing time.
arXiv Detail & Related papers (2022-10-24T14:49:03Z) - PointAttN: You Only Need Attention for Point Cloud Completion [89.88766317412052]
Point cloud completion refers to completing 3D shapes from partial 3D point clouds.
We propose a novel neural network for processing point cloud in a per-point manner to eliminate kNNs.
The proposed framework, namely PointAttN, is simple, neat and effective, which can precisely capture the structural information of 3D shapes.
arXiv Detail & Related papers (2022-03-16T09:20:01Z) - CpT: Convolutional Point Transformer for 3D Point Cloud Processing [10.389972581905]
We present CpT: Convolutional point Transformer - a novel deep learning architecture for dealing with the unstructured nature of 3D point cloud data.
CpT is an improvement over existing attention-based Convolutions Neural Networks as well as previous 3D point cloud processing transformers.
Our model can serve as an effective backbone for various point cloud processing tasks when compared to the existing state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-21T17:45:55Z) - Densely Nested Top-Down Flows for Salient Object Detection [137.74130900326833]
This paper revisits the role of top-down modeling in salient object detection.
It designs a novel densely nested top-down flows (DNTDF)-based framework.
In every stage of DNTDF, features from higher levels are read in via the progressive compression shortcut paths (PCSP)
arXiv Detail & Related papers (2021-02-18T03:14:02Z) - PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object
Detection [57.49788100647103]
LiDAR-based 3D object detection is an important task for autonomous driving.
Current approaches suffer from sparse and partial point clouds of distant and occluded objects.
In this paper, we propose a novel two-stage approach, namely PC-RGNN, dealing with such challenges by two specific solutions.
arXiv Detail & Related papers (2020-12-18T18:06:43Z) - SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine
Reconstruction with Self-Projection Optimization [52.20602782690776]
It is expensive and tedious to obtain large scale paired sparse-canned point sets for training from real scanned sparse data.
We propose a self-supervised point cloud upsampling network, named SPU-Net, to capture the inherent upsampling patterns of points lying on the underlying object surface.
We conduct various experiments on both synthetic and real-scanned datasets, and the results demonstrate that we achieve comparable performance to the state-of-the-art supervised methods.
arXiv Detail & Related papers (2020-12-08T14:14:09Z) - SoftPoolNet: Shape Descriptor for Point Cloud Completion and
Classification [93.54286830844134]
We propose a method for 3D object completion and classification based on point clouds.
For the decoder stage, we propose regional convolutions, a novel operator aimed at maximizing the global activation entropy.
We evaluate our approach on different 3D tasks such as object completion and classification, achieving state-of-the-art accuracy.
arXiv Detail & Related papers (2020-08-17T14:32:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.