MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning
- URL: http://arxiv.org/abs/2504.06740v1
- Date: Wed, 09 Apr 2025 09:52:04 GMT
- Title: MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning
- Authors: Ylli Sadikaj, Hongkuan Zhou, Lavdim Halilaj, Stefan Schmid, Steffen Staab, Claudia Plant,
- Abstract summary: It is crucial to know the distinct type of defect, such as a bent, cut, or scratch.<n>The ability to recognize the "exact" defect type enables automated treatments of the anomalies in modern production lines.<n>We propose MultiADS, a zero-shot learning approach, able to perform Multi-type Anomaly Detection.
- Score: 27.235318937019255
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Precise optical inspection in industrial applications is crucial for minimizing scrap rates and reducing the associated costs. Besides merely detecting if a product is anomalous or not, it is crucial to know the distinct type of defect, such as a bent, cut, or scratch. The ability to recognize the "exact" defect type enables automated treatments of the anomalies in modern production lines. Current methods are limited to solely detecting whether a product is defective or not without providing any insights on the defect type, nevertheless detecting and identifying multiple defects. We propose MultiADS, a zero-shot learning approach, able to perform Multi-type Anomaly Detection and Segmentation. The architecture of MultiADS comprises CLIP and extra linear layers to align the visual- and textual representation in a joint feature space. To the best of our knowledge, our proposal, is the first approach to perform a multi-type anomaly segmentation task in zero-shot learning. Contrary to the other baselines, our approach i) generates specific anomaly masks for each distinct defect type, ii) learns to distinguish defect types, and iii) simultaneously identifies multiple defect types present in an anomalous product. Additionally, our approach outperforms zero/few-shot learning SoTA methods on image-level and pixel-level anomaly detection and segmentation tasks on five commonly used datasets: MVTec-AD, Visa, MPDD, MAD and Real-IAD.
Related papers
- Learning Multi-view Multi-class Anomaly Detection [10.199404082194947]
We introduce a Multi-View Multi-Class Anomaly Detection model (MVMCAD), which integrates information from multiple views to accurately identify anomalies.
Specifically, we propose a semi-frozen encoder, where a pre-encoder prior enhancement mechanism is added before the frozen encoder.
An Anomaly Amplification Module (AAM) that models global token interactions and suppresses normal regions, and a Cross-Feature Loss that aligns shallow encoder features with deep decoder features.
arXiv Detail & Related papers (2025-04-30T03:59:58Z) - Crane: Context-Guided Prompt Learning and Attention Refinement for Zero-Shot Anomaly Detections [50.343419243749054]
Anomaly Detection (AD) involves identifying deviations from normal data distributions.
We propose a novel approach that conditions the prompts of the text encoder based on image context extracted from the vision encoder.
Our method achieves state-of-the-art performance, improving performance by 2% to 29% across different metrics on 14 datasets.
arXiv Detail & Related papers (2025-04-15T10:42:25Z) - PA-CLIP: Enhancing Zero-Shot Anomaly Detection through Pseudo-Anomaly Awareness [10.364634539199422]
We introduce PA-CLIP, a zero-shot anomaly detection method that reduces background noise and enhances defect detection through a pseudo-anomaly-based framework.<n>The proposed method integrates a multiscale feature aggregation strategy for capturing detailed global and local information.<n>It outperforms existing zero-shot methods, providing a robust solution for industrial defect detection.
arXiv Detail & Related papers (2025-03-03T08:29:27Z) - Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection [88.34095233600719]
FAPrompt is a novel framework designed to learn Fine-grained Abnormality Prompts for more accurate ZSAD.
It substantially outperforms state-of-the-art methods by at least 3%-5% AUC/AP in both image- and pixel-level ZSAD tasks.
arXiv Detail & Related papers (2024-10-14T08:41:31Z) - Open-Vocabulary Video Anomaly Detection [57.552523669351636]
Video anomaly detection (VAD) with weak supervision has achieved remarkable performance in utilizing video-level labels to discriminate whether a video frame is normal or abnormal.
Recent studies attempt to tackle a more realistic setting, open-set VAD, which aims to detect unseen anomalies given seen anomalies and normal videos.
This paper takes a step further and explores open-vocabulary video anomaly detection (OVVAD), in which we aim to leverage pre-trained large models to detect and categorize seen and unseen anomalies.
arXiv Detail & Related papers (2023-11-13T02:54:17Z) - Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection [86.24898024621008]
We present a novel large multimodal model applying vision experts for industrial anomaly detection(abbreviated to Myriad)<n>We utilize the anomaly map generated by the vision experts as guidance for LMMs, such that the vision model is guided to pay more attention to anomalous regions.<n>Our proposed method not only performs favorably against state-of-the-art methods, but also inherits the flexibility and instruction-following ability of LMMs in the field of IAD.
arXiv Detail & Related papers (2023-10-29T16:49:45Z) - Improving Vision Anomaly Detection with the Guidance of Language
Modality [64.53005837237754]
This paper tackles the challenges for vision modality from a multimodal point of view.
We propose Cross-modal Guidance (CMG) to tackle the redundant information issue and sparse space issue.
To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality.
arXiv Detail & Related papers (2023-10-04T13:44:56Z) - UniFormaly: Towards Task-Agnostic Unified Framework for Visual Anomaly
Detection [6.260747047974035]
We present UniFormaly, a universal and powerful anomaly detection framework.
We emphasize the necessity of our off-the-shelf approach by pointing out a suboptimal issue in online encoder-based methods.
UniFormaly achieves outstanding results on various tasks and datasets.
arXiv Detail & Related papers (2023-07-24T06:04:12Z) - Multimodal Industrial Anomaly Detection via Hybrid Fusion [59.16333340582885]
We propose a novel multimodal anomaly detection method with hybrid fusion scheme.
Our model outperforms the state-of-the-art (SOTA) methods on both detection and segmentation precision on MVTecD-3 AD dataset.
arXiv Detail & Related papers (2023-03-01T15:48:27Z) - Deep Learning based Defect classification and detection in SEM images: A
Mask R-CNN approach [2.7180863515048674]
We have demonstrated the application of Mask-RCNN (Regional Convolutional Neural Network), a deep-learning algorithm for computer vision.
We are aiming at detecting and segmenting different types of inter-class defect patterns such as bridge, break, and line collapse.
arXiv Detail & Related papers (2022-11-03T23:26:40Z) - Self-Supervised Predictive Convolutional Attentive Block for Anomaly
Detection [97.93062818228015]
We propose to integrate the reconstruction-based functionality into a novel self-supervised predictive architectural building block.
Our block is equipped with a loss that minimizes the reconstruction error with respect to the masked area in the receptive field.
We demonstrate the generality of our block by integrating it into several state-of-the-art frameworks for anomaly detection on image and video.
arXiv Detail & Related papers (2021-11-17T13:30:31Z) - MLMA-Net: multi-level multi-attentional learning for multi-label object
detection in textile defect images [0.0]
This paper proposes a multi-level, multi-attentional deep learning network (MLMA-Net) to detect multi-label defects in textile images.
The results show that the network extracts more distinctive features and has better performance than the state-of-the-art approaches on the real-world industrial dataset.
arXiv Detail & Related papers (2021-01-31T04:50:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.