RW-Net: Enhancing Few-Shot Point Cloud Classification with a Wavelet Transform Projection-based Network
- URL: http://arxiv.org/abs/2501.03221v1
- Date: Mon, 06 Jan 2025 18:55:59 GMT
- Title: RW-Net: Enhancing Few-Shot Point Cloud Classification with a Wavelet Transform Projection-based Network
- Authors: Haosheng Zhang, Hao Huang
- Abstract summary: This work introduces RW-Net, a novel framework designed to address the challenges above by integrating Rate-Distortion Explanation (RDE) and wavelet transform. By emphasizing low-frequency components of the input data, the wavelet transform captures fundamental geometric and structural attributes of 3D objects. The results demonstrate that our approach achieves state-of-the-art performance and exhibits superior generalization and robustness in few-shot learning scenarios.
- Score: 6.305913808037513
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the domain of 3D object classification, a fundamental challenge lies in addressing the scarcity of labeled data, which limits the applicability of traditional data-intensive learning paradigms. This challenge is particularly pronounced in few-shot learning scenarios, where the objective is to achieve robust generalization from minimal annotated samples. To overcome these limitations, it is crucial to identify and leverage the most salient and discriminative features of 3D objects, thereby enhancing learning efficiency and reducing dependency on large-scale labeled datasets. This work introduces RW-Net, a novel framework designed to address the challenges above by integrating Rate-Distortion Explanation (RDE) and wavelet transform into a state-of-the-art projection-based 3D object classification architecture. The proposed method capitalizes on RDE to extract critical features by identifying and preserving the most informative data components while reducing redundancy. This process ensures the retention of essential information for effective decision-making, optimizing the model's ability to learn from limited data. Complementing RDE, incorporating the wavelet transform further enhances the framework's capability to generalize in low-data regimes. By emphasizing low-frequency components of the input data, the wavelet transform captures fundamental geometric and structural attributes of 3D objects. These attributes are instrumental in mitigating overfitting and improving the robustness of the learned representations across diverse tasks and domains. To validate the effectiveness of our RW-Net, we conduct extensive experiments on three datasets: ModelNet40, ModelNet40-C, and ScanObjectNN for few-shot 3D object classification. The results demonstrate that our approach achieves state-of-the-art performance and exhibits superior generalization and robustness in few-shot learning scenarios.
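The abstract's low-frequency emphasis can be illustrated with a minimal sketch: a one-level 2D Haar decomposition of a projected depth or feature map, keeping only the low-frequency (LL) subband. This is an illustrative NumPy example under assumed settings (orthonormal Haar filters, even image dimensions), not the paper's actual implementation, whose wavelet choice and integration details are not specified here.

```python
import numpy as np

def haar_ll(img: np.ndarray) -> np.ndarray:
    """One-level 2D Haar transform, returning only the low-frequency
    (LL) subband. With orthonormal Haar filters, the LL coefficient of
    each non-overlapping 2x2 block is (a + b + c + d) / 2."""
    h, w = img.shape
    assert h % 2 == 0 and w % 2 == 0, "image dimensions must be even"
    # Group pixels into 2x2 blocks, then sum within each block.
    blocks = img.reshape(h // 2, 2, w // 2, 2)
    return blocks.sum(axis=(1, 3)) / 2.0
```

Discarding the three high-frequency subbands halves each spatial dimension while preserving coarse geometry, which is the kind of low-frequency emphasis the abstract describes.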
Related papers
- Enhanced Mixture 3D CGAN for Completion and Generation of 3D Objects [0.2624902795082451]
The generation and completion of 3D objects represent a transformative challenge in computer vision. In this paper, we investigate the integration of Deep 3D Convolutional GANs with a MoE framework to generate high-quality 3D models.
arXiv Detail & Related papers (2026-02-08T16:32:41Z) - A Lightweight 3D Anomaly Detection Method with Rotationally Invariant Features [60.76577388438418]
3D anomaly detection (AD) is a crucial task in computer vision, aiming to identify anomalous points or regions from point cloud data. Existing methods may encounter challenges when handling point clouds with changes in orientation and position because the resulting features may vary significantly. We propose a novel Rotationally Invariant Features (RIF) framework for 3D AD, which maps each point into a rotationally invariant space to maintain consistency of representation.
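The idea of mapping points into a rotationally invariant space can be sketched with one simple rotation-invariant per-point feature: the distance to the cloud's centroid, which is unchanged by any rigid rotation or translation of the whole cloud. This is a hypothetical illustration of the general concept, not the RIF framework's actual feature set.

```python
import numpy as np

def rotation_invariant_features(points: np.ndarray) -> np.ndarray:
    """Map each 3D point (rows of an (N, 3) array) to a simple
    rotation-invariant feature: its Euclidean distance to the cloud's
    centroid. Applying a rotation R to all points rotates the centroid
    too, so ||R p - R c|| = ||p - c|| and the features are unchanged."""
    centroid = points.mean(axis=0)
    return np.linalg.norm(points - centroid, axis=1)
```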
arXiv Detail & Related papers (2025-11-17T08:16:05Z) - OUGS: Active View Selection via Object-aware Uncertainty Estimation in 3DGS [14.124481717283544]
OUGS is a principled, physically-grounded uncertainty formulation for 3DGS. Our core innovation is to derive uncertainty directly from the explicit physical parameters of the 3D Gaussian primitives. This foundation allows us to then seamlessly integrate semantic segmentation masks to produce a targeted, object-aware uncertainty score.
arXiv Detail & Related papers (2025-11-12T15:08:46Z) - NF-SLAM: Effective, Normalizing Flow-supported Neural Field representations for object-level visual SLAM in automotive applications [20.963261049295085]
We propose a vision-only object-level SLAM framework for automotive applications representing 3D shapes by implicit signed distance functions.
Our key innovation consists of augmenting the standard neural representation by a normalizing flow network.
The newly proposed architecture exhibits a significant performance improvement in the presence of only sparse and noisy data.
arXiv Detail & Related papers (2025-03-14T08:46:56Z) - From Dataset to Real-world: General 3D Object Detection via Generalized Cross-domain Few-shot Learning [13.282416396765392]
We introduce the first generalized cross-domain few-shot (GCFS) task in 3D object detection.
Our solution integrates multi-modal fusion and contrastive-enhanced prototype learning within one framework.
To effectively capture domain-specific representations for each class from limited target data, we propose a contrastive-enhanced prototype learning.
arXiv Detail & Related papers (2025-03-08T17:05:21Z) - Study of Dropout in PointPillars with 3D Object Detection [0.0]
3D object detection is critical for autonomous driving, leveraging deep learning techniques to interpret LiDAR data.
This study provides an analysis of enhancing the performance of PointPillars model under various dropout rates.
arXiv Detail & Related papers (2024-09-01T09:30:54Z) - Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly display our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z) - Large receptive field strategy and important feature extraction strategy in 3D object detection [6.3948571459793975]
This study focuses on key challenges in 3D target detection.
To tackle the challenge of expanding the receptive field of a 3D convolutional kernel, we introduce the Dynamic Feature Fusion Module.
This module achieves adaptive expansion of the 3D convolutional kernel's receptive field, balancing the expansion with acceptable computational loads.
arXiv Detail & Related papers (2024-01-22T13:01:28Z) - FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z) - Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training [57.25828870799331]
We propose STMono3D, a new self-teaching framework for unsupervised domain adaptation on Mono3D.
We develop a teacher-student paradigm to generate adaptive pseudo labels on the target domain.
STMono3D achieves remarkable performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection dataset.
arXiv Detail & Related papers (2022-04-25T12:23:07Z) - Learning-based Point Cloud Registration for 6D Object Pose Estimation in the Real World [55.7340077183072]
We tackle the task of estimating the 6D pose of an object from point cloud data.
Recent learning-based approaches to addressing this task have shown great success on synthetic datasets.
We analyze the causes of these failures, which we trace back to the difference between the feature distributions of the source and target point clouds.
arXiv Detail & Related papers (2022-03-29T07:55:04Z) - Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets.
But, they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z) - Progressive Self-Guided Loss for Salient Object Detection [102.35488902433896]
We present a progressive self-guided loss function to facilitate deep learning-based salient object detection in images.
Our framework takes advantage of adaptively aggregated multi-scale features to locate and detect salient objects effectively.
arXiv Detail & Related papers (2021-01-07T07:33:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.