HFBRI-MAE: Handcrafted Feature Based Rotation-Invariant Masked Autoencoder for 3D Point Cloud Analysis
- URL: http://arxiv.org/abs/2504.14132v1
- Date: Sat, 19 Apr 2025 01:33:19 GMT
- Title: HFBRI-MAE: Handcrafted Feature Based Rotation-Invariant Masked Autoencoder for 3D Point Cloud Analysis
- Authors: Xuanhua Yin, Dingxin Zhang, Jianhui Yu, Weidong Cai
- Abstract summary: We introduce the Handcrafted Feature-Based Rotation-Invariant Masked Autoencoder (HFBRI-MAE), a novel framework that refines the MAE design with rotation-invariant handcrafted features to ensure stable feature learning across different orientations. We show that HFBRI-MAE consistently outperforms existing methods in object classification, segmentation, and few-shot learning.
- Score: 10.978894026853675
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning (SSL) has demonstrated remarkable success in 3D point cloud analysis, particularly through masked autoencoders (MAEs). However, existing MAE-based methods lack rotation invariance, leading to significant performance degradation when processing arbitrarily rotated point clouds in real-world scenarios. To address this limitation, we introduce Handcrafted Feature-Based Rotation-Invariant Masked Autoencoder (HFBRI-MAE), a novel framework that refines the MAE design with rotation-invariant handcrafted features to ensure stable feature learning across different orientations. By leveraging both rotation-invariant local and global features for token embedding and position embedding, HFBRI-MAE effectively eliminates rotational dependencies while preserving rich geometric structures. Additionally, we redefine the reconstruction target to a canonically aligned version of the input, mitigating rotational ambiguities. Extensive experiments on ModelNet40, ScanObjectNN, and ShapeNetPart demonstrate that HFBRI-MAE consistently outperforms existing methods in object classification, segmentation, and few-shot learning, highlighting its robustness and strong generalization ability in real-world 3D applications.
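To make the idea of rotation-invariant handcrafted features concrete, here is a minimal sketch that builds distance- and angle-based descriptors for one local patch. It illustrates the general principle only; the feature set, the function name, and the use of the origin as a global reference are illustrative assumptions, not HFBRI-MAE's exact design.

```python
import torch

def rotation_invariant_patch_features(patch: torch.Tensor) -> torch.Tensor:
    """Toy rotation-invariant descriptors for one local patch of shape (K, 3).

    Every output depends only on distances and angles, so it is unchanged
    when the whole cloud is rotated by any rotation matrix R.
    """
    c = patch.mean(dim=0, keepdim=True)            # patch centroid, (1, 3)
    rel = patch - c                                # centered points, (K, 3)
    d_center = rel.norm(dim=1, keepdim=True)       # ||p - c||, (K, 1)
    d_origin = patch.norm(dim=1, keepdim=True)     # ||p||, distance to a global reference
    denom = (d_center * c.norm()).clamp_min(1e-8)
    cos_angle = (rel @ c.t()) / denom              # angle between (p - c) and c
    return torch.cat([d_center, d_origin, cos_angle], dim=1)  # (K, 3)

# Invariance check: rotating the patch leaves the features unchanged.
# Q, _ = torch.linalg.qr(torch.randn(3, 3))        # random orthogonal matrix
# assert torch.allclose(rotation_invariant_patch_features(patch),
#                       rotation_invariant_patch_features(patch @ Q.t()), atol=1e-5)
```

Because every output is a norm or an inner product between vectors that rotate together, applying any rotation to the input leaves the features unchanged.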
Related papers
- Rotation-Adaptive Point Cloud Domain Generalization via Intricate Orientation Learning [34.424450834358204]
We propose an innovative rotation-adaptive domain generalization framework for 3D point cloud analysis.
Our approach aims to alleviate orientational shifts by leveraging intricate samples in an iterative learning process.
We employ an orientation-aware contrastive learning framework that incorporates an orientation consistency loss and a margin separation loss.
arXiv Detail & Related papers (2025-02-04T11:46:32Z)
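For the orientation consistency loss mentioned in the entry above, a minimal sketch under the assumption of a simple cosine-similarity formulation (the paper's actual loss, and the accompanying margin separation loss, may differ):

```python
import torch
import torch.nn.functional as F

def orientation_consistency_loss(feat_a: torch.Tensor,
                                 feat_b: torch.Tensor) -> torch.Tensor:
    """Encourage embeddings of two differently rotated copies of the same
    shape to agree. feat_a, feat_b: (B, D) features of the two views.
    """
    return (1.0 - F.cosine_similarity(feat_a, feat_b, dim=1)).mean()
```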
- MaskLRF: Self-supervised Pretraining via Masked Autoencoding of Local Reference Frames for Rotation-invariant 3D Point Set Analysis [1.19658449368018]
This paper develops, for the first time, a rotation-invariant self-supervised pretraining framework for practical 3D point set analysis.
The proposed algorithm, called MaskLRF, learns rotation-invariant and highly generalizable latent features via masked autoencoding of 3D points.
We confirm that MaskLRF achieves new state-of-the-art accuracies in analyzing 3D point sets with inconsistent orientations.
arXiv Detail & Related papers (2024-03-01T00:42:49Z)
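The local reference frames at the heart of MaskLRF can be illustrated with a standard PCA-based construction; this is a common recipe, not necessarily the paper's exact LRF estimation or sign-disambiguation rule:

```python
import numpy as np

def local_reference_frame(patch: np.ndarray) -> np.ndarray:
    """PCA-based local reference frame for a patch of shape (K, 3).

    Rotating the input rotates the principal axes identically (up to sign),
    so expressing the patch in this frame yields rotation-invariant
    coordinates. The sign convention below is a simple illustrative choice.
    """
    centered = patch - patch.mean(axis=0)
    cov = centered.T @ centered / len(patch)       # 3x3 covariance matrix
    _, vecs = np.linalg.eigh(cov)                  # eigenvectors, ascending order
    axes = vecs[:, ::-1].copy()                    # largest-variance axis first
    for i in range(2):                             # disambiguate two axes' signs
        if (centered @ axes[:, i]).sum() < 0:
            axes[:, i] = -axes[:, i]
    axes[:, 2] = np.cross(axes[:, 0], axes[:, 1])  # enforce a right-handed frame
    return axes

# Rotation-invariant patch coordinates:
# canonical = (patch - patch.mean(axis=0)) @ local_reference_frame(patch)
```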
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [59.13757801286343]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce the FILP-3D framework with two novel components: the Redundant Feature Eliminator (RFE) to address feature-space misalignment and the Spatial Noise Compensator (SNC) to handle significant noise.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders [52.66195794216989]
We propose Point Feature Enhancement Masked Autoencoders (Point-FEMAE) to learn compact 3D representations.
Point-FEMAE consists of a global branch and a local branch to capture latent semantic features.
Our method significantly improves the pre-training efficiency compared to cross-modal alternatives.
arXiv Detail & Related papers (2023-12-17T14:17:05Z)
- StarNet: Style-Aware 3D Point Cloud Generation [82.30389817015877]
StarNet is able to reconstruct and generate high-fidelity and even novel 3D point clouds using a mapping network.
Our framework achieves performance comparable to the state of the art on various metrics in point cloud reconstruction and generation tasks.
arXiv Detail & Related papers (2023-03-28T08:21:44Z)
- Rotation-Invariant Transformer for Point Cloud Matching [42.5714375149213]
We introduce RoITr, a Rotation-Invariant Transformer to cope with the pose variations in the point cloud matching task.
We propose a global transformer with rotation-invariant cross-frame spatial awareness learned by the self-attention mechanism.
RoITr surpasses existing methods by at least 13 and 5 percentage points in Inlier Ratio and Registration Recall, respectively.
arXiv Detail & Related papers (2023-03-14T20:55:27Z)
- ReF -- Rotation Equivariant Features for Local Feature Matching [30.459559206664427]
We propose an alternative, complementary approach that centers on inducing bias in the model architecture itself to generate 'rotation-specific' features.
We demonstrate that this high-performance, rotation-specific coverage from steerable CNNs can be expanded to all rotation angles.
We present a detailed analysis of the performance effects of ensembling, robust estimation, network architecture variations, and the use of rotation priors.
arXiv Detail & Related papers (2022-03-10T07:36:09Z)
- ART-Point: Improving Rotation Robustness of Point Cloud Classifiers via Adversarial Rotation [89.47574181669903]
In this study, we show that the rotation robustness of point cloud classifiers can also be acquired via adversarial training.
Specifically, our proposed framework named ART-Point regards the rotation of the point cloud as an attack.
We propose a fast one-step optimization to efficiently reach the final robust model.
arXiv Detail & Related papers (2022-03-08T07:20:16Z)
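To make the rotation-as-attack idea concrete, here is a hedged PyTorch sketch in the spirit of ART-Point; the z-axis-only rotation, the angle parameterization, and the step size are simplifying assumptions, not the paper's exact one-step optimization:

```python
import torch

def rotate_z(pts: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Rotate (B, N, 3) point clouds about the z-axis by per-sample angles (B,)."""
    c, s = torch.cos(theta), torch.sin(theta)
    zero, one = torch.zeros_like(c), torch.ones_like(c)
    R = torch.stack([c, -s, zero,
                     s,  c, zero,
                     zero, zero, one], dim=1).view(-1, 3, 3)
    return pts @ R.transpose(1, 2)

def adversarial_rotation(model, pts, labels, loss_fn, step=0.1):
    """One-step 'rotation as attack': take a gradient-ascent step on the loss
    with respect to the rotation angle, then return the worst-case rotated
    clouds for adversarial training."""
    theta = torch.zeros(pts.size(0), device=pts.device, requires_grad=True)
    loss = loss_fn(model(rotate_z(pts, theta)), labels)
    loss.backward()
    with torch.no_grad():
        theta_adv = theta + step * theta.grad.sign()  # ascend the loss
    return rotate_z(pts, theta_adv).detach()
```

Training would then alternate: compute worst-case rotated clouds, then run an ordinary supervised update on them.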
- The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all monocular 3D object detectors on the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z)
- Attentive Rotation Invariant Convolution for Point Cloud-based Large Scale Place Recognition [11.433270318356675]
We propose an Attentive Rotation Invariant Convolution (ARIConv) in this paper.
We experimentally demonstrate that our model achieves state-of-the-art performance on the large-scale place recognition task when point cloud scans are rotated.
arXiv Detail & Related papers (2021-08-29T09:10:56Z)
- Adjoint Rigid Transform Network: Task-conditioned Alignment of 3D Shapes [86.2129580231191]
Adjoint Rigid Transform (ART) Network is a neural module which can be integrated with a variety of 3D networks.
ART learns to rotate input shapes to a learned canonical orientation, which is crucial for many tasks.
We will release our code and pre-trained models for further research.
arXiv Detail & Related papers (2021-02-01T20:58:45Z)
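A T-Net-style sketch of such a task-conditioned canonicalization module, assuming the 6D continuous rotation parameterization of Zhou et al.; ART's actual architecture and training losses differ in detail:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CanonicalRotation(nn.Module):
    """Predict a rotation that maps an input cloud to a learned canonical
    pose, then apply it. Input: (B, N, 3); output: (B, N, 3)."""

    def __init__(self):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(),
                                 nn.Linear(64, 128), nn.ReLU())
        self.head = nn.Linear(128, 6)  # 6D rotation parameterization

    def forward(self, pts: torch.Tensor) -> torch.Tensor:
        feat = self.mlp(pts).max(dim=1).values   # permutation-invariant pooling
        a, b = self.head(feat).split(3, dim=1)   # two raw 3D vectors
        x = F.normalize(a, dim=1)                # Gram-Schmidt -> valid rotation
        y = F.normalize(b - (x * b).sum(dim=1, keepdim=True) * x, dim=1)
        z = torch.cross(x, y, dim=1)
        R = torch.stack([x, y, z], dim=1)        # (B, 3, 3), orthonormal rows
        return pts @ R.transpose(1, 2)           # canonically aligned cloud
```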
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.