IAENet: An Importance-Aware Ensemble Model for 3D Point Cloud-Based Anomaly Detection
- URL: http://arxiv.org/abs/2508.20492v1
- Date: Thu, 28 Aug 2025 07:19:07 GMT
- Title: IAENet: An Importance-Aware Ensemble Model for 3D Point Cloud-Based Anomaly Detection
- Authors: Xuanming Cao, Chengyu Tao, Yifeng Cheng, Juan Du,
- Abstract summary: We argue that the key bottleneck is the absence of powerful pretrained foundation backbones in 3D comparable to those in 2D. We propose the Importance-Aware Ensemble Network (IAENet), an ensemble framework that synergizes a 2D pretrained expert with 3D expert models. IAENet achieves a new state-of-the-art with a markedly lower false positive rate, underscoring its practical value for industrial deployment.
- Score: 2.08058961865456
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Surface anomaly detection is pivotal for ensuring product quality in industrial manufacturing. While 2D image-based methods have achieved remarkable success, 3D point cloud-based detection remains underexplored despite its richer geometric cues. We argue that the key bottleneck is the absence of powerful pretrained foundation backbones in 3D comparable to those in 2D. To bridge this gap, we propose the Importance-Aware Ensemble Network (IAENet), an ensemble framework that synergizes a 2D pretrained expert with 3D expert models. However, naively fusing predictions from disparate sources is non-trivial: existing strategies can be dragged down by a poorly performing modality and thus degrade overall accuracy. To address this challenge, we introduce a novel Importance-Aware Fusion (IAF) module that dynamically assesses the contribution of each source and reweights their anomaly scores. Furthermore, we devise critical loss functions that explicitly guide the optimization of IAF, enabling it not only to combine the collective knowledge of the source experts but also to preserve their unique strengths, thereby enhancing overall anomaly detection performance. Extensive experiments on MVTec 3D-AD demonstrate that IAENet achieves a new state-of-the-art with a markedly lower false positive rate, underscoring its practical value for industrial deployment.
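The abstract describes IAF as dynamically assessing each source's contribution and reweighting its anomaly scores. The paper's actual module and learned losses are not given here; as a rough illustration of score-level reweighting only (not the authors' implementation), a minimal NumPy sketch where the hypothetical `importances` vector stands in for the learned per-source contribution estimates:

```python
import numpy as np

def importance_aware_fusion(score_maps, importances):
    """Fuse per-source anomaly score maps with softmax-normalized weights.

    score_maps: list of same-shaped arrays, one anomaly map per expert.
    importances: one scalar per expert; higher means more trusted.
    """
    imps = np.asarray(importances, dtype=np.float64)
    # Softmax (shifted by the max for numerical stability) turns the
    # importance scores into convex fusion weights that sum to 1.
    weights = np.exp(imps - imps.max())
    weights /= weights.sum()
    # Weighted sum of the experts' anomaly score maps.
    fused = sum(w * np.asarray(m, dtype=np.float64)
                for w, m in zip(weights, score_maps))
    return fused, weights

# Toy example: a 2D expert and a 3D expert each emit a per-pixel anomaly map.
map_2d = np.array([[0.1, 0.9], [0.2, 0.1]])
map_3d = np.array([[0.2, 0.2], [0.8, 0.1]])
fused, w = importance_aware_fusion([map_2d, map_3d], importances=[1.0, 0.5])
```

In this sketch a down-weighted expert still contributes, but a poorly performing modality (low importance) cannot dominate the fused map, which is the failure mode of naive fusion the abstract highlights.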
Related papers
- Cross-Modal Mapping and Dual-Branch Reconstruction for 2D-3D Multimodal Industrial Anomaly Detection [6.632019014616859]
CMDR-IAD is an unsupervised framework for reliable anomaly detection in 2D+3D multimodal as well as single-modality settings. CMDR-IAD achieves state-of-the-art performance while operating without memory banks, reaching 97.3% image-level AUROC (I-AUROC), 99.6% pixel-level AUROC (P-AUROC), and 97.6% AUPRO.
arXiv Detail & Related papers (2026-03-04T10:57:32Z) - Mono3DV: Monocular 3D Object Detection with 3D-Aware Bipartite Matching and Variational Query DeNoising [0.6423989407081764]
Mono3DV is a novel Transformer-based framework for 3D object detection. First, we develop a 3D-Aware Bipartite Matching strategy that directly incorporates 3D geometric information into the matching cost. Second, we stabilize the bipartite matching to resolve the instability that occurs when integrating 3D attributes.
arXiv Detail & Related papers (2026-01-03T02:06:28Z) - IEC3D-AD: A 3D Dataset of Industrial Equipment Components for Unsupervised Point Cloud Anomaly Detection [16.60482902001866]
3D anomaly detection (3D-AD) plays a critical role in industrial manufacturing, particularly in ensuring the reliability and safety of core equipment components. Existing 3D datasets like Real3D-AD and MVTec 3D-AD offer broad application support, but fall short in capturing the complexities and subtle defects found in real industrial environments. We have developed a point cloud anomaly detection dataset (IEC3D-AD) specific to real industrial scenarios. This dataset is directly collected from actual production lines, ensuring high fidelity and relevance.
arXiv Detail & Related papers (2025-11-05T08:01:23Z) - 2D-3D Feature Fusion via Cross-Modal Latent Synthesis and Attention Guided Restoration for Industrial Anomaly Detection [9.873449426376787]
We propose a novel unsupervised framework, Multi-Modal Attention-Driven Fusion Restoration (MAFR). MAFR synthesises a unified latent space from RGB images and point clouds using a shared fusion encoder, followed by attention-guided, modality-specific decoders. Anomalies are localised by measuring reconstruction errors between input features and their restored counterparts.
arXiv Detail & Related papers (2025-10-20T03:57:50Z) - GEAL: Generalizable 3D Affordance Learning with Cross-Modal Consistency [50.11520458252128]
Existing 3D affordance learning methods struggle with generalization and robustness due to limited annotated data. We propose GEAL, a novel framework designed to enhance the generalization and robustness of 3D affordance learning by leveraging large-scale pre-trained 2D models. GEAL consistently outperforms existing methods across seen and novel object categories, as well as corrupted data.
arXiv Detail & Related papers (2024-12-12T17:59:03Z) - DM3D: Distortion-Minimized Weight Pruning for Lossless 3D Object Detection [42.07920565812081]
We propose a novel post-training weight pruning scheme for 3D object detection.
It determines redundant parameters in the pretrained model that lead to minimal distortion in both locality and confidence.
This framework aims to minimize detection distortion of network output to maximally maintain detection precision.
arXiv Detail & Related papers (2024-07-02T09:33:32Z) - FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [59.13757801286343]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data. We introduce the FILP-3D framework with two novel components: the Redundant Feature Eliminator (RFE) for feature space misalignment and the Spatial Noise Compensator (SNC) for significant noise.
arXiv Detail & Related papers (2023-12-28T14:52:07Z) - OriCon3D: Effective 3D Object Detection using Orientation and Confidence [0.0]
We propose an advanced methodology for the detection of 3D objects from a single image.
We use a deep convolutional neural network-based 3D object weighted orientation regression paradigm.
Our approach significantly improves the accuracy of 3D object pose determination, surpassing baseline methodologies.
arXiv Detail & Related papers (2023-04-27T19:52:47Z) - Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed as Homography Loss, is proposed to achieve the goal, which exploits both 2D and 3D information.
Our method yields the best performance compared with the other state-of-the-arts by a large margin on KITTI 3D datasets.
arXiv Detail & Related papers (2022-04-02T03:48:03Z) - The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection [62.1185839286255]
Low-cost monocular 3D object detection plays a fundamental role in autonomous driving.
We introduce a Dynamic Feature Reflecting Network, named DFR-Net.
We rank 1st among all the monocular 3D object detectors in the KITTI test set.
arXiv Detail & Related papers (2021-12-28T07:31:18Z) - Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observation is crucial for various applications in computer vision, robotics, and graphics.
Recent neural implicit modeling methods show promising results on synthetic or dense datasets, but they perform poorly on real-world data that is sparse and noisy.
This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z) - SESS: Self-Ensembling Semi-Supervised 3D Object Detection [138.80825169240302]
We propose SESS, a self-ensembling semi-supervised 3D object detection framework. Specifically, we design a thorough perturbation scheme to enhance generalization of the network on unlabeled and new unseen data.
Our SESS achieves competitive performance compared to the state-of-the-art fully-supervised method by using only 50% labeled data.
arXiv Detail & Related papers (2019-12-26T08:48:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.