RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation
- URL: http://arxiv.org/abs/2408.06110v1
- Date: Mon, 12 Aug 2024 12:47:37 GMT
- Title: RISurConv: Rotation Invariant Surface Attention-Augmented Convolutions for 3D Point Cloud Classification and Segmentation
- Authors: Zhiyuan Zhang, Licheng Yang, Zhiyu Xiang
- Abstract summary: We propose a novel yet effective rotation invariant architecture for 3D point cloud classification and segmentation.
We build an effective neural network for 3D point cloud analysis that is invariant to arbitrary rotations while maintaining high accuracy.
- Score: 17.558376773179337
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the progress on 3D point cloud deep learning, most prior works focus on learning features that are invariant to translation and point permutation, and very limited efforts have been devoted for rotation invariant property. Several recent studies achieve rotation invariance at the cost of lower accuracies. In this work, we close this gap by proposing a novel yet effective rotation invariant architecture for 3D point cloud classification and segmentation. Instead of traditional pointwise operations, we construct local triangle surfaces to capture more detailed surface structure, based on which we can extract highly expressive rotation invariant surface properties which are then integrated into an attention-augmented convolution operator named RISurConv to generate refined attention features via self-attention layers. Based on RISurConv we build an effective neural network for 3D point cloud analysis that is invariant to arbitrary rotations while maintaining high accuracy. We verify the performance on various benchmarks with supreme results obtained surpassing the previous state-of-the-art by a large margin. We achieve an overall accuracy of 96.0% (+4.7%) on ModelNet40, 93.1% (+12.8%) on ScanObjectNN, and class accuracies of 91.5% (+3.6%), 82.7% (+5.1%), and 78.5% (+9.2%) on the three categories of the FG3D dataset for the fine-grained classification task. Additionally, we achieve 81.5% (+1.0%) mIoU on ShapeNet for the segmentation task. Code is available here: https://github.com/cszyzhang/RISurConv
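The abstract's central idea, that features computed from local triangle surfaces can be unaffected by rotation, can be illustrated with a minimal sketch. It assumes a deliberately simple feature set (three edge lengths and three interior angles of one triangle); this is not the feature set used in the paper, and the names `triangle_features` and `random_rotation` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def triangle_features(p, q, r):
    """Illustrative rotation-invariant descriptors of triangle (p, q, r):
    three edge lengths followed by the three interior angles (radians).
    Assumption: a toy feature set, not the one defined in the paper."""
    pts = np.stack([p, q, r]).astype(float)
    # Vectors from each vertex to the next and to the previous vertex.
    to_next = np.roll(pts, -1, axis=0) - pts
    to_prev = np.roll(pts, 1, axis=0) - pts
    len_next = np.linalg.norm(to_next, axis=1)
    len_prev = np.linalg.norm(to_prev, axis=1)
    # Interior angle at each vertex from the normalized dot product.
    cosines = np.sum(to_next * to_prev, axis=1) / (len_next * len_prev)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return np.concatenate([len_next, angles])

def random_rotation(rng):
    """Random 3x3 rotation matrix (orthogonal, determinant +1) via QR."""
    q_mat, r_mat = np.linalg.qr(rng.normal(size=(3, 3)))
    q_mat = q_mat * np.sign(np.diag(r_mat))  # normalize column signs
    if np.linalg.det(q_mat) < 0:
        q_mat[:, 0] = -q_mat[:, 0]  # flip one column to make it a proper rotation
    return q_mat

rng = np.random.default_rng(0)
p, q, r = rng.normal(size=(3, 3))   # three random 3D points forming a triangle
R = random_rotation(rng)

before = triangle_features(p, q, r)
after = triangle_features(R @ p, R @ q, R @ r)

# Raw coordinates change under rotation; lengths and angles do not.
assert not np.allclose(p, R @ p)
assert np.allclose(before, after)
```

Because lengths and angles depend only on inner products between difference vectors, any rotation applied to all three points leaves them unchanged. A network that consumes only such quantities is rotation invariant by construction.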
Related papers
- ShapeSplat: A Large-scale Dataset of Gaussian Splats and Their Self-Supervised Pretraining [104.34751911174196]
We build a large-scale dataset of 3DGS using ShapeNet and ModelNet datasets.
Our dataset ShapeSplat consists of 65K objects from 87 unique categories.
We introduce Gaussian-MAE, which highlights the unique benefits of representation learning from Gaussian parameters.
arXiv Detail & Related papers (2024-08-20T14:49:14Z)
- E$^3$-Net: Efficient E(3)-Equivariant Normal Estimation Network [47.77270862087191]
We propose E3-Net to achieve equivariance for normal estimation.
We introduce an efficient random frame method, which significantly reduces the training resources required for this task to just 1/8 of previous work.
Our method achieves superior results on both synthetic and real-world datasets, and outperforms current state-of-the-art techniques by a substantial margin.
arXiv Detail & Related papers (2024-06-01T07:53:36Z)
- Clustering based Point Cloud Representation Learning for 3D Analysis [80.88995099442374]
We propose a clustering based supervised learning scheme for point cloud analysis.
Unlike the current de facto scene-wise training paradigm, our algorithm conducts within-class clustering on the point embedding space.
Our algorithm shows notable improvements on famous point cloud segmentation datasets.
arXiv Detail & Related papers (2023-07-27T03:42:12Z) - HS-Pose: Hybrid Scope Feature Extraction for Category-level Object Pose
Estimation [28.405005252559146]
We propose a simple network structure, the HS-layer, which extends 3D-GC to extract hybrid scope latent features from point cloud data.
The proposed HS-layer: 1) is able to perceive local-global geometric structure and global information, 2) is robust to noise, and 3) can encode size and translation information.
Our experiments show that the simple replacement of the 3D-GC layer with the proposed HS-layer on the baseline method (GPV-Pose) achieves a significant improvement.
arXiv Detail & Related papers (2023-03-28T05:36:42Z) - Window Normalization: Enhancing Point Cloud Understanding by Unifying
Inconsistent Point Densities [16.770190781915673]
Downsampling and feature extraction are essential procedures for 3D point cloud understanding.
A window-normalization method is leveraged to unify the point densities in different parts.
A group-wise strategy is proposed to obtain multi-type features, including texture and spatial information.
arXiv Detail & Related papers (2022-12-05T14:09:07Z) - CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds [55.44204039410225]
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Our proposed method first generates some high-quality 3D proposals by leveraging the class-aware local group strategy on the object surface voxels.
To recover the features of missed voxels due to incorrect voxel-wise segmentation, we build a fully sparse convolutional RoI pooling module.
arXiv Detail & Related papers (2022-10-09T13:38:48Z) - CloudAttention: Efficient Multi-Scale Attention Scheme For 3D Point
Cloud Learning [81.85951026033787]
We adopt transformers in this work and incorporate them into a hierarchical framework for shape classification and part and scene segmentation.
We also compute efficient and dynamic global cross attentions by leveraging sampling and grouping at each iteration.
The proposed hierarchical model achieves state-of-the-art shape classification in mean accuracy and yields results on par with the previous segmentation methods.
arXiv Detail & Related papers (2022-07-31T21:39:15Z) - RIConv++: Effective Rotation Invariant Convolutions for 3D Point Clouds
Deep Learning [32.18566879365623]
Deep learning on 3D point clouds is a promising field of research that allows a neural network to learn features of point clouds directly.
We propose a simple yet effective convolution operator that enhances feature distinction by designing powerful rotation invariant features from the local regions.
Our network architecture can capture both local and global context by simply tuning the neighborhood size in each convolution layer.
arXiv Detail & Related papers (2022-02-26T08:32:44Z) - FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose
Estimation with Decoupled Rotation Mechanism [49.89268018642999]
We propose a fast shape-based network (FS-Net) with efficient category-level feature extraction for 6D pose estimation.
The proposed method achieves state-of-the-art performance in both category- and instance-level 6D object pose estimation.
arXiv Detail & Related papers (2021-03-12T03:07:24Z)