UniVAD: A Training-free Unified Model for Few-shot Visual Anomaly Detection
- URL: http://arxiv.org/abs/2412.03342v2
- Date: Thu, 05 Dec 2024 03:31:40 GMT
- Title: UniVAD: A Training-free Unified Model for Few-shot Visual Anomaly Detection
- Authors: Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, Jinqiao Wang,
- Abstract summary: We propose a few-shot Visual Anomaly Detection (VAD) method, UniVAD, capable of detecting anomalies across various domains.
UniVAD only needs few normal samples as references during testing to detect anomalies in previously unseen objects.
We conduct experiments on nine datasets spanning industrial, logical, and medical fields, and the results demonstrate that UniVAD achieves state-of-the-art performance.
- Score: 28.994585945398754
- License:
- Abstract: Visual Anomaly Detection (VAD) aims to identify abnormal samples in images that deviate from normal patterns, covering multiple domains, including industrial, logical, and medical fields. Due to the domain gaps between these fields, existing VAD methods are typically tailored to each domain, with specialized detection techniques and model architectures that are difficult to generalize across different domains. Moreover, even within the same domain, current VAD approaches often follow a "one-category-one-model" paradigm, requiring large amounts of normal samples to train class-specific models, resulting in poor generalizability and hindering unified evaluation across domains. To address this issue, we propose a generalized few-shot VAD method, UniVAD, capable of detecting anomalies across various domains, such as industrial, logical, and medical anomalies, with a training-free unified model. UniVAD only needs few normal samples as references during testing to detect anomalies in previously unseen objects, without training on the specific domain. Specifically, UniVAD employs a Contextual Component Clustering ($C^3$) module based on clustering and vision foundation models to segment components within the image accurately, and leverages Component-Aware Patch Matching (CAPM) and Graph-Enhanced Component Modeling (GECM) modules to detect anomalies at different semantic levels, which are aggregated to produce the final detection result. We conduct experiments on nine datasets spanning industrial, logical, and medical fields, and the results demonstrate that UniVAD achieves state-of-the-art performance in few-shot anomaly detection tasks across multiple domains, outperforming domain-specific anomaly detection models. The code will be made publicly available.
Related papers
- Object Style Diffusion for Generalized Object Detection in Urban Scene [69.04189353993907]
We introduce a novel single-domain object detection generalization method, named GoDiff.
By integrating pseudo-target domain data with source domain data, we diversify the training dataset.
Experimental results demonstrate that our method not only enhances the generalization ability of existing detectors but also functions as a plug-and-play enhancement for other single-domain generalization methods.
arXiv Detail & Related papers (2024-12-18T13:03:00Z) - Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
arXiv Detail & Related papers (2024-11-06T11:03:02Z) - GM-DF: Generalized Multi-Scenario Deepfake Detection [49.072106087564144]
Existing face forgery detection usually follows the paradigm of training models in a single domain.
In this paper, we elaborately investigate the generalization capacity of deepfake detection models when jointly trained on multiple face forgery detection datasets.
arXiv Detail & Related papers (2024-06-28T17:42:08Z) - Prior Normality Prompt Transformer for Multi-class Industrial Image Anomaly Detection [6.865429486202104]
We introduce Prior Normality Prompt Transformer (PNPT) for multi-class anomaly detection.
PNPT strategically incorporates normal semantics prompting to mitigate the "identical mapping" problem.
This entails integrating a prior normality prompt into the reconstruction process, yielding a dual-stream model.
arXiv Detail & Related papers (2024-06-17T13:10:04Z) - Absolute-Unified Multi-Class Anomaly Detection via Class-Agnostic Distribution Alignment [27.375917265177847]
Unsupervised anomaly detection (UAD) methods build separate models for each object category.
Recent studies have proposed to train a unified model for multiple classes, namely model-unified UAD.
We present a simple yet powerful method to address multi-class anomaly detection without any class information, namely textitabsolute-unified UAD.
arXiv Detail & Related papers (2024-03-31T15:50:52Z) - Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts [25.629973843455495]
Generalist Anomaly Detection (GAD) aims to train one single detection model that can generalize to detect anomalies in diverse datasets from different application domains without further training on the target data.
We introduce a novel approach that learns an in-context residual learning model for GAD, termed InCTRL.
InCTRL is the best performer and significantly outperforms state-of-the-art competing methods.
arXiv Detail & Related papers (2024-03-11T08:07:46Z) - AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection [30.679012320439625]
AnomalyCLIP learns object-agnostic text prompts to capture generic normality and abnormality in an image.
It achieves superior zero-shot performance of detecting and segmenting anomalies in datasets of highly diverse class semantics.
arXiv Detail & Related papers (2023-10-29T10:03:49Z) - Spectral Adversarial MixUp for Few-Shot Unsupervised Domain Adaptation [72.70876977882882]
Domain shift is a common problem in clinical applications, where the training images (source domain) and the test images (target domain) are under different distributions.
We propose a novel method for Few-Shot Unsupervised Domain Adaptation (FSUDA), where only a limited number of unlabeled target domain samples are available for training.
arXiv Detail & Related papers (2023-09-03T16:02:01Z) - PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and
Localization [64.39761523935613]
We present a new framework for Patch Distribution Modeling, PaDiM, to concurrently detect and localize anomalies in images.
PaDiM makes use of a pretrained convolutional neural network (CNN) for patch embedding.
It also exploits correlations between the different semantic levels of CNN to better localize anomalies.
arXiv Detail & Related papers (2020-11-17T17:29:18Z) - Anomaly Detection with Domain Adaptation [5.457279006229213]
We propose the Invariant Representation Anomaly Detection (IRAD) to solve this problem.
The extraction is achieved by an across-domain encoder trained together with source-specific encoders and generators by adversarial learning.
We evaluate IRAD extensively on digits images datasets (MNIST, USPS and SVHN) and object recognition datasets (Office-Home)
arXiv Detail & Related papers (2020-06-05T21:05:19Z) - Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.