Multimodal Industrial Anomaly Detection via Hybrid Fusion
- URL: http://arxiv.org/abs/2303.00601v2
- Date: Thu, 7 Sep 2023 08:28:39 GMT
- Title: Multimodal Industrial Anomaly Detection via Hybrid Fusion
- Authors: Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, Chengjie
Wang
- Abstract summary: We propose a novel multimodal anomaly detection method with a hybrid fusion scheme.
Our model outperforms the state-of-the-art (SOTA) methods in both detection and segmentation precision on the MVTec-3D AD dataset.
- Score: 59.16333340582885
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 2D-based Industrial Anomaly Detection has been widely discussed, however,
multimodal industrial anomaly detection based on 3D point clouds and RGB images
still has many untouched fields. Existing multimodal industrial anomaly
detection methods directly concatenate the multimodal features, which leads to
a strong disturbance between features and harms the detection performance. In
this paper, we propose Multi-3D-Memory (M3DM), a novel multimodal anomaly
detection method with a hybrid fusion scheme: firstly, we design an unsupervised
feature fusion with patch-wise contrastive learning to encourage the
interaction of different modal features; secondly, we use a decision layer
fusion with multiple memory banks to avoid loss of information, and additional
novelty classifiers to make the final decision. We further propose a point
feature alignment operation to better align the point cloud and RGB features.
Extensive experiments show that our multimodal industrial anomaly detection
model outperforms the state-of-the-art (SOTA) methods on both detection and
segmentation precision on MVTec-3D AD dataset. Code is available at
https://github.com/nomewang/M3DM.
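For orientation, below is a minimal PyTorch sketch of the two fusion stages described in the abstract: a patch-wise contrastive loss that encourages the RGB and point-cloud features of the same patch to agree, and a decision layer that keeps separate memory banks of nominal features and combines nearest-neighbour distances. The feature dimensions, the helper names (fuse, patch_anomaly_scores) and the simple score averaging are assumptions made for illustration; this is not the authors' implementation (see the linked repository for that).

```python
import torch
import torch.nn.functional as F

D_RGB, D_PC, D_FUSED = 768, 1152, 256   # assumed per-patch feature dimensions

# Unsupervised feature fusion: project both modalities into a shared space.
proj_rgb = torch.nn.Linear(D_RGB, D_FUSED)
proj_pc  = torch.nn.Linear(D_PC,  D_FUSED)

def patchwise_contrastive_loss(f_rgb, f_pc, temperature=0.07):
    """Patch-wise InfoNCE: the RGB and point features of the same patch form
    the positive pair; all other patches act as negatives."""
    z_rgb = F.normalize(proj_rgb(f_rgb), dim=-1)      # (N, D_FUSED)
    z_pc  = F.normalize(proj_pc(f_pc),  dim=-1)       # (N, D_FUSED)
    logits  = z_rgb @ z_pc.t() / temperature          # (N, N) similarities
    targets = torch.arange(logits.size(0))            # patch i matches patch i
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

def fuse(f_rgb, f_pc):
    """Fused per-patch representation used alongside the raw modalities."""
    return torch.cat([F.normalize(proj_rgb(f_rgb), dim=-1),
                      F.normalize(proj_pc(f_pc),  dim=-1)], dim=-1)

def nn_distance(query, bank):
    """Distance of each query patch to its nearest neighbour in a memory bank."""
    return torch.cdist(query, bank).min(dim=1).values

def patch_anomaly_scores(f_rgb, f_pc, bank_rgb, bank_pc, bank_fused):
    """Decision-layer fusion over three memory banks built from nominal
    (defect-free) training patches with the same extractors and fuse().
    The paper trains extra novelty classifiers on these per-bank scores;
    averaging them here is only a simple stand-in."""
    scores = torch.stack([nn_distance(f_rgb, bank_rgb),
                          nn_distance(f_pc,  bank_pc),
                          nn_distance(fuse(f_rgb, f_pc), bank_fused)], dim=-1)
    return scores.mean(dim=-1)                        # (N,) per-patch score
```

Keeping the per-modality banks next to the fused bank is what the abstract means by avoiding loss of information: a defect visible in only one modality still produces a large distance in that modality's own bank.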
Related papers
- A Survey on RGB, 3D, and Multimodal Approaches for Unsupervised Industrial Anomaly Detection [24.634671653473397]
Unsupervised Industrial Anomaly Detection (UIAD) technology effectively overcomes the scarcity of abnormal samples and enhances the automation and reliability of smart manufacturing.
RGB, 3D, and multimodal anomaly detection have demonstrated comprehensive and robust capabilities within the industrial informatization sector.
We focus on 3D UIAD and multimodal UIAD, providing a comprehensive summary of unsupervised industrial anomaly detection in three modal settings.
arXiv Detail & Related papers (2024-10-29T12:12:45Z)
- M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising [63.39134873744748]
Existing industrial anomaly detection methods primarily concentrate on unsupervised learning with pristine RGB images.
This paper proposes a novel noise-resistant M3DM-NR framework to leverage the strong multi-modal discriminative capabilities of CLIP.
Extensive experiments show that M3DM-NR outperforms state-of-the-art methods in 3D-RGB multi-modal noisy anomaly detection.
arXiv Detail & Related papers (2024-06-04T12:33:02Z)
- Multimodal Industrial Anomaly Detection by Crossmodal Feature Mapping [12.442574943138794]
The paper explores the industrial multimodal Anomaly Detection (AD) task, which exploits point clouds and RGB images to localize anomalies.
We introduce a novel lightweight and fast framework that learns to map features from one modality to the other on nominal samples.
arXiv Detail & Related papers (2023-12-07T18:41:21Z)
- Dual-Branch Reconstruction Network for Industrial Anomaly Detection with RGB-D Data [1.861332908680942]
Multi-modal industrial anomaly detection based on 3D point clouds and RGB images is just beginning to emerge.
The above methods require long inference times and high memory usage, which cannot meet the real-time requirements of industry.
We propose a lightweight dual-branch reconstruction network based on RGB-D input, learning the decision boundary between normal and abnormal examples.
arXiv Detail & Related papers (2023-11-12T10:19:14Z)
- Multi-Modal 3D Object Detection by Box Matching [109.43430123791684]
We propose a novel Fusion network by Box Matching (FBMNet) for multi-modal 3D detection.
With the learned assignments between 3D and 2D object proposals, fusion for detection can be effectively performed by combining their ROI features.
arXiv Detail & Related papers (2023-05-12T18:08:51Z)
- Weakly Aligned Feature Fusion for Multimodal Object Detection [52.15436349488198]
Multimodal data often suffer from the position shift problem, i.e., the image pair is not strictly aligned.
This problem makes it difficult to fuse multimodal features and complicates convolutional neural network (CNN) training.
In this article, we propose a general multimodal detector named aligned region CNN (AR-CNN) to tackle the position shift problem.
arXiv Detail & Related papers (2022-04-21T02:35:23Z)
- EPNet++: Cascade Bi-directional Fusion for Multi-Modal 3D Object Detection [56.03081616213012]
We propose EPNet++ for multi-modal 3D object detection by introducing a novel Cascade Bi-directional Fusion(CB-Fusion) module.
The proposed CB-Fusion module enriches point features with the semantic information of image features in a cascade bi-directional interaction fusion manner.
The experiment results on the KITTI, JRDB and SUN-RGBD datasets demonstrate the superiority of EPNet++ over the state-of-the-art methods.
arXiv Detail & Related papers (2021-12-21T10:48:34Z)
- Exploring Data Augmentation for Multi-Modality 3D Object Detection [82.9988604088494]
Counter-intuitively, multi-modality methods based on point clouds and images perform only marginally better, and sometimes worse, than approaches that use point clouds alone.
We propose a pipeline, named transformation flow, to bridge the gap between single and multi-modality data augmentation with transformation reversing and replaying (a toy sketch of this reverse-and-replay idea follows below).
Our method also wins the best PKL award in the 3rd nuScenes detection challenge.
arXiv Detail & Related papers (2020-12-23T15:23:16Z)
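As a toy illustration of the "transformation reversing and replaying" idea in the last entry, the NumPy sketch below records each augmentation applied to the point cloud so it can later be undone (or replayed), e.g. before projecting points back onto the image to keep the 2D-3D correspondence. The function names and the restriction to a single z-axis rotation are illustrative assumptions, not that paper's actual pipeline.

```python
import numpy as np

def random_rotation_z(points, flow):
    """Augment a point cloud with a random z-axis rotation and record the
    rotation matrix in the transformation flow so it can be reversed later."""
    theta = np.random.uniform(-np.pi / 4, np.pi / 4)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s, 0.0],
                    [s,  c, 0.0],
                    [0.0, 0.0, 1.0]])
    flow.append(rot)                      # remember the applied transform
    return points @ rot.T

def reverse_flow(points, flow):
    """Undo the recorded transforms in reverse order; for an orthonormal
    rotation R, the inverse of R.T is R itself."""
    for rot in reversed(flow):
        points = points @ rot
    return points

flow = []                                 # the recorded transformation flow
pts = np.random.rand(1024, 3)             # dummy point cloud
aug = random_rotation_z(pts, flow)        # single-modality 3D augmentation
restored = reverse_flow(aug, flow)        # undo it, e.g. before 2D projection
assert np.allclose(restored, pts)
```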