Memoryless Multimodal Anomaly Detection via Student-Teacher Network and Signed Distance Learning
- URL: http://arxiv.org/abs/2409.05378v1
- Date: Mon, 9 Sep 2024 07:18:09 GMT
- Title: Memoryless Multimodal Anomaly Detection via Student-Teacher Network and Signed Distance Learning
- Authors: Zhongbin Sun, Xiaolong Li, Yiran Li, Yue Ma
- Abstract summary: A novel memoryless method MDSS is proposed for multimodal anomaly detection.
It employs a lightweight student-teacher network and a signed distance function to learn from RGB images and 3D point clouds, respectively.
The experimental results indicate that MDSS is comparable to, but more stable than, the SOTA memory-bank-based method Shape-guided.
- Score: 8.610387986933741
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised anomaly detection is a challenging computer vision task, in which 2D-based anomaly detection methods have been extensively studied. However, multimodal anomaly detection based on RGB images and 3D point clouds requires further investigation. The existing methods are mainly inspired by memory-bank-based methods commonly used in 2D-based anomaly detection, which may cost extra memory for storing multimodal features. In the present study, a novel memoryless method, MDSS, is proposed for multimodal anomaly detection. It employs a lightweight student-teacher network and a signed distance function to learn from RGB images and 3D point clouds, respectively, and combines the complementary anomaly information from the two modalities. Specifically, a student-teacher network is trained with normal RGB images and masks generated from point clouds by a dynamic loss, and the anomaly score map can be obtained from the discrepancy between the outputs of the student and the teacher. Furthermore, the signed distance function learns from normal point clouds to predict the signed distances between points and the surface, and the obtained signed distances are used to generate an anomaly score map. Subsequently, the anomaly score maps are aligned to generate the final anomaly score map for detection. The experimental results indicate that MDSS is comparable to, but more stable than, the SOTA memory-bank-based method Shape-guided, and furthermore performs better than other baseline methods.
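The abstract describes three steps: a score map from the student-teacher discrepancy, a score map from predicted signed distances, and an alignment of the two maps into a final map. The following is only a rough NumPy sketch of that flow, not the authors' implementation. The function names, the cosine-based discrepancy, the per-pixel maximum of absolute signed distances, and the standardize-then-add alignment are all assumptions chosen for illustration, because the abstract does not specify them.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero


def st_score_map(teacher_feat, student_feat):
    """Per-pixel discrepancy between teacher and student features.

    Both inputs have shape (C, H, W). The score is 1 - cosine similarity
    along the channel axis (an assumed choice of discrepancy measure).
    """
    t = teacher_feat / (np.linalg.norm(teacher_feat, axis=0, keepdims=True) + EPS)
    s = student_feat / (np.linalg.norm(student_feat, axis=0, keepdims=True) + EPS)
    return 1.0 - np.sum(t * s, axis=0)


def sdf_score_map(signed_dist, pixel_rc, shape):
    """Turn per-point signed distances into a pixel-level score map.

    signed_dist: (N,) predicted signed distance of each point to the surface.
    pixel_rc:    (N, 2) integer (row, col) of the pixel each point maps to.
    Each pixel keeps the largest absolute distance among its points
    (an assumed aggregation rule).
    """
    score = np.zeros(shape, dtype=float)
    np.maximum.at(score, (pixel_rc[:, 0], pixel_rc[:, 1]), np.abs(signed_dist))
    return score


def align_and_fuse(rgb_map, sdf_map, rgb_stats, sdf_stats):
    """Standardize each map with (mean, std) statistics, then add them.

    The statistics are assumed to come from normal training data; this
    alignment rule is a placeholder for the one used in the paper.
    """
    rgb_z = (rgb_map - rgb_stats[0]) / (rgb_stats[1] + EPS)
    sdf_z = (sdf_map - sdf_stats[0]) / (sdf_stats[1] + EPS)
    return rgb_z + sdf_z
```

In this sketch a pixel scores high when the student fails to reproduce the teacher's features there, or when the points that project onto it lie far from the learned normal surface. Nothing is stored per training sample at inference time, which is the sense in which such a pipeline is memoryless.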
Related papers
- M3DM-NR: RGB-3D Noisy-Resistant Industrial Anomaly Detection via Multimodal Denoising [63.39134873744748]
Existing industrial anomaly detection methods primarily concentrate on unsupervised learning with pristine RGB images.
This paper proposes a novel noise-resistant M3DM-NR framework to leverage strong multi-modal discriminative capabilities of CLIP.
Extensive experiments show that M3DM-NR outperforms state-of-the-art methods in 3D-RGB multi-modal noisy anomaly detection.
arXiv Detail & Related papers (2024-06-04T12:33:02Z)
- OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation [67.56268991234371]
OV-Uni3DETR achieves the state-of-the-art performance on various scenarios, surpassing existing methods by more than 6% on average.
Code and pre-trained models will be released later.
arXiv Detail & Related papers (2024-03-28T17:05:04Z)
- Towards Unified 3D Object Detection via Algorithm and Data Unification [70.27631528933482]
We build the first unified multi-modal 3D object detection benchmark MM-Omni3D and extend the aforementioned monocular detector to its multi-modal version.
We name the designed monocular and multi-modal detectors as UniMODE and MM-UniMODE, respectively.
arXiv Detail & Related papers (2024-02-28T18:59:31Z)
- Dual-Student Knowledge Distillation Networks for Unsupervised Anomaly Detection [2.06682776181122]
Student-teacher networks (S-T) are favored in unsupervised anomaly detection.
However, vanilla S-T networks are not stable.
We propose a novel dual-student knowledge distillation architecture.
arXiv Detail & Related papers (2024-02-01T09:32:39Z)
- Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
- Dual-Branch Reconstruction Network for Industrial Anomaly Detection with RGB-D Data [1.861332908680942]
Multi-modal industrial anomaly detection based on 3D point clouds and RGB images is just beginning to emerge.
The above methods require a longer inference time and higher memory usage, which cannot meet the real-time requirements of the industry.
We propose a lightweight dual-branch reconstruction network based on RGB-D input, learning the decision boundary between normal and abnormal examples.
arXiv Detail & Related papers (2023-11-12T10:19:14Z)
- Multimodal Industrial Anomaly Detection via Hybrid Fusion [59.16333340582885]
We propose a novel multimodal anomaly detection method with hybrid fusion scheme.
Our model outperforms the state-of-the-art (SOTA) methods on both detection and segmentation precision on the MVTec-3D AD dataset.
arXiv Detail & Related papers (2023-03-01T15:48:27Z)
- Teacher-Student Network for 3D Point Cloud Anomaly Detection with Few Normal Samples [21.358496646676087]
We design a teacher-student structured model for 3D anomaly detection.
Specifically, we use feature space alignment, dimension zoom, and max pooling to extract the features of the point cloud.
Our method only requires very few normal samples to train the student network.
arXiv Detail & Related papers (2022-10-31T12:29:55Z)
- Reconstructed Student-Teacher and Discriminative Networks for Anomaly Detection [8.35780131268962]
A powerful anomaly detection method is proposed based on student-teacher feature pyramid matching (STPM), which consists of a student and teacher network.
To improve the accuracy of STPM, this work uses a student network, as in generative models, to reconstruct normal features.
To further improve accuracy, a discriminative network trained with pseudo-anomalies from anomaly maps is used in our method.
arXiv Detail & Related papers (2022-10-14T05:57:50Z)
- DetMatch: Two Teachers are Better Than One for Joint 2D and 3D Semi-Supervised Object Detection [29.722784254501768]
DetMatch is a flexible framework for joint semi-supervised learning on 2D and 3D modalities.
By identifying objects detected in both sensors, our pipeline generates a cleaner, more robust set of pseudo-labels.
We leverage the richer semantics of RGB images to rectify incorrect 3D class predictions and improve localization of 3D boxes.
arXiv Detail & Related papers (2022-03-17T17:58:00Z)
- UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders [81.5490760424213]
We propose the first framework (UCNet) to employ uncertainty for RGB-D saliency detection by learning from the data labeling process.
Inspired by the saliency data labeling process, we propose a probabilistic RGB-D saliency detection network.
arXiv Detail & Related papers (2020-04-13T04:12:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.