Point Cloud Classification Using Content-based Transformer via
Clustering in Feature Space
- URL: http://arxiv.org/abs/2303.04599v1
- Date: Wed, 8 Mar 2023 14:11:05 GMT
- Title: Point Cloud Classification Using Content-based Transformer via
Clustering in Feature Space
- Authors: Yahui Liu, Bin Tian, Yisheng Lv, Lingxi Li, Feiyue Wang
- Abstract summary: We propose a point content-based Transformer architecture, called PointConT for short.
It exploits the locality of points in the feature space (content-based): sampled points with similar features are clustered into the same class, and self-attention is computed within each class.
We also introduce an Inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in separate branches.
- Score: 25.57569871876213
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, there have been some attempts to apply Transformers to 3D
point cloud classification. To reduce computation, most existing methods focus
on local spatial attention but ignore point content, and thus fail to establish
relationships between distant but relevant points. To overcome the limitation
of local spatial attention, we propose a point content-based Transformer
architecture, called PointConT for short. It exploits the locality of points in
the feature space (content-based): sampled points with similar features are
clustered into the same class, and self-attention is computed within each
class, enabling an effective trade-off between capturing long-range
dependencies and computational complexity. We further introduce an Inception
feature aggregator for point cloud classification, which uses parallel
structures to aggregate high-frequency and low-frequency information in
separate branches. Extensive experiments show that our PointConT model achieves
remarkable performance on point cloud shape classification. In particular, our
method achieves 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN.
The source code is available at
https://github.com/yahuiliu99/PointConT.
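As a rough illustration of the core idea, below is a minimal sketch (not the authors' implementation) of content-based grouped self-attention in PyTorch: points are partitioned by a feature-space proxy and scaled dot-product attention is computed independently inside each group. The function name cluster_self_attention, the mean-activation sort used as the clustering proxy, and the single-head attention are all simplifying assumptions made here for brevity.

import torch

def cluster_self_attention(x: torch.Tensor, num_clusters: int) -> torch.Tensor:
    # x: (B, N, C) per-point features; returns a tensor of the same shape.
    B, N, C = x.shape
    assert N % num_clusters == 0, "sketch assumes N is divisible by num_clusters"
    k = N // num_clusters  # points per cluster

    # 1) Group points by content: sort on a scalar feature-space proxy
    #    (mean activation). The paper clusters by feature similarity; this
    #    sort is only a crude stand-in to keep the sketch short.
    order = x.mean(dim=-1).argsort(dim=-1)                      # (B, N)
    x_sorted = torch.gather(x, 1, order.unsqueeze(-1).expand(-1, -1, C))
    groups = x_sorted.view(B, num_clusters, k, C)

    # 2) Single-head scaled dot-product self-attention within each cluster,
    #    so each point attends to only k = N / num_clusters other points.
    attn = torch.softmax(groups @ groups.transpose(-2, -1) / C ** 0.5, dim=-1)
    out = (attn @ groups).view(B, N, C)

    # 3) Scatter points back to their original order.
    inv = order.argsort(dim=-1)                                 # inverse permutation
    return torch.gather(out, 1, inv.unsqueeze(-1).expand(-1, -1, C))

if __name__ == "__main__":
    feats = torch.randn(2, 1024, 64)   # 2 clouds, 1024 points, 64-dim features
    print(cluster_self_attention(feats, num_clusters=16).shape)  # (2, 1024, 64)

Attending only within clusters of k points cuts the attention cost from O(N^2) to roughly O(N^2 / num_clusters), which is the trade-off between long-range dependencies and computational complexity referred to in the abstract.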
Related papers
- ConDaFormer: Disassembled Transformer with Local Structure Enhancement
for 3D Point Cloud Understanding [105.98609765389895]
Transformers have recently been explored for 3D point cloud understanding.
The large number of points, often over 0.1 million, makes global self-attention infeasible for point cloud data.
In this paper, we develop a new transformer block, named ConDaFormer.
arXiv Detail & Related papers (2023-12-18T11:19:45Z) - FreePoint: Unsupervised Point Cloud Instance Segmentation [72.64540130803687]
We propose FreePoint for the underexplored task of unsupervised class-agnostic instance segmentation on point clouds.
We represent point features by combining coordinates, colors, and self-supervised deep features.
Based on the point features, we segment point clouds into coarse instance masks as pseudo labels, which are used to train a point cloud instance segmentation model.
arXiv Detail & Related papers (2023-05-11T16:56:26Z) - Self-positioning Point-based Transformer for Point Cloud Understanding [18.394318824968263]
Self-Positioning point-based Transformer (SPoTr) is designed to capture both local and global shape contexts with reduced complexity.
SPoTr achieves an accuracy gain of 2.6% over the previous best models on shape classification with ScanObjectNN.
arXiv Detail & Related papers (2023-03-29T04:27:11Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point
Cloud Learning [81.85951026033787]
In this work, we incorporate transformers into a hierarchical framework for shape classification as well as part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z) - Stratified Transformer for 3D Point Cloud Segmentation [89.9698499437732]
Stratified Transformer is able to capture long-range contexts and demonstrates strong generalization ability and high performance.
To combat the challenges posed by irregular point arrangements, we propose first-layer point embedding to aggregate local information.
Experiments demonstrate the effectiveness and superiority of our method on S3DIS, ScanNetv2 and ShapeNetPart datasets.
arXiv Detail & Related papers (2022-03-28T05:35:16Z) - 3D Object Tracking with Transformer [6.848996369226086]
Feature fusion could make similarity computation more efficient by incorporating target object information.
Most existing LiDAR-based approaches directly use the extracted point cloud feature to compute similarity.
In this paper, we propose a feature fusion network based on transformer architecture.
arXiv Detail & Related papers (2021-10-28T07:03:19Z) - Fast Point Voxel Convolution Neural Network with Selective Feature
Fusion for Point Cloud Semantic Segmentation [7.557684072809662]
We present a novel lightweight convolutional neural network for point cloud analysis.
Our method operates on entire point sets without sampling and achieves good performance efficiently.
arXiv Detail & Related papers (2021-09-23T19:39:01Z) - Learning Semantic Segmentation of Large-Scale Point Clouds with Random
Sampling [52.464516118826765]
We introduce RandLA-Net, an efficient and lightweight neural architecture to infer per-point semantics for large-scale point clouds.
The key to our approach is to use random point sampling instead of more complex point selection approaches.
Our RandLA-Net can process 1 million points in a single pass up to 200x faster than existing approaches.
arXiv Detail & Related papers (2021-07-06T05:08:34Z) - SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation [71.2856098776959]
Estimating 3D motions for point clouds is challenging, since a point cloud is unordered and its density is significantly non-uniform.
We propose a novel architecture named Sparse Convolution-Transformer Network (SCTN) that equips the sparse convolution with the transformer.
We show that the learned relation-based contextual information is rich and helpful for matching corresponding points, benefiting scene flow estimation.
arXiv Detail & Related papers (2021-05-10T15:16:14Z)