Model Selection of Zero-shot Anomaly Detectors in the Absence of Labeled
Validation Data
- URL: http://arxiv.org/abs/2310.10461v2
- Date: Fri, 9 Feb 2024 16:59:43 GMT
- Authors: Clement Fung, Chen Qiu, Aodong Li, Maja Rudolph
- Abstract summary: Anomaly detection requires detecting abnormal samples in large unlabeled datasets.
We propose SWSA: a framework to select image-based anomaly detectors with a generated synthetic validation set.
We find that SWSA often selects models that match selections made with a ground-truth validation set, resulting in higher AUROCs than baseline methods.
- Score: 19.919234682696306
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Anomaly detection requires detecting abnormal samples in large unlabeled
datasets. While progress in deep learning and the advent of foundation models
has produced powerful zero-shot anomaly detection methods, their deployment in
practice is often hindered by the lack of labeled data -- without it, their
detection performance cannot be evaluated reliably. In this work, we propose
SWSA (Selection With Synthetic Anomalies): a general-purpose framework to
select image-based anomaly detectors with a generated synthetic validation set.
Our proposed anomaly generation method assumes access to only a small support
set of normal images and requires no training or fine-tuning. Once generated,
our synthetic validation set is used to create detection tasks that compose a
validation framework for model selection. In an empirical study, we find that
SWSA often selects models that match selections made with a ground-truth
validation set, resulting in higher AUROCs than baseline methods. We also find
that SWSA selects prompts for CLIP-based anomaly detection that outperform
baseline prompt selection strategies on all datasets, including the challenging
MVTec-AD and VisA datasets.
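The selection loop the abstract describes can be sketched end to end: build synthetic anomalies from a small support set of normal images, score every candidate zero-shot detector on the resulting validation set, and keep the detector with the best AUROC. This is a minimal illustrative sketch, not the paper's method: `make_synthetic_anomaly` uses a simple inverted cut-paste stand-in for the authors' generation procedure, and the detectors are hypothetical scoring functions.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic_anomaly(image, patch=8):
    """Cut a random patch, invert its intensities, and paste it elsewhere.
    A crude stand-in for the paper's anomaly generation method."""
    h, w = image.shape
    out = image.copy()
    y1, x1 = rng.integers(0, h - patch), rng.integers(0, w - patch)
    y2, x2 = rng.integers(0, h - patch), rng.integers(0, w - patch)
    out[y2:y2 + patch, x2:x2 + patch] = image.max() - image[y1:y1 + patch, x1:x1 + patch]
    return out

def auroc(scores, labels):
    """AUROC via the rank-sum (Mann-Whitney U) formulation, with ties averaged."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="stable")
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    for v in np.unique(scores):          # average ranks over tied scores
        tie = scores == v
        ranks[tie] = ranks[tie].mean()
    pos = np.asarray(labels) == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def select_detector(detectors, support_images):
    """Score each candidate detector on a synthetic validation set built
    from the normal support set; return the best (name, AUROC) pair."""
    normals = list(support_images)
    anomalies = [make_synthetic_anomaly(im) for im in normals]
    images = normals + anomalies
    labels = np.array([0] * len(normals) + [1] * len(anomalies))
    best_name, best_auc = None, -1.0
    for name, score_fn in detectors.items():
        auc = auroc([score_fn(im) for im in images], labels)
        if auc > best_auc:
            best_name, best_auc = name, auc
    return best_name, best_auc
```

On smooth gradient images, an edge-energy detector should separate the pasted anomalies cleanly, while a detector that ignores the pasted region stays near chance; the selection step then prefers the former.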
Related papers
- SKADA-Bench: Benchmarking Unsupervised Domain Adaptation Methods with Realistic Validation [55.87169702896249]
Unsupervised Domain Adaptation (DA) consists of adapting a model trained on a labeled source domain to perform well on an unlabeled target domain with some data distribution shift.
We propose a framework to evaluate DA methods and present a fair evaluation of existing shallow algorithms, including reweighting, mapping, and subspace alignment.
Our benchmark highlights the importance of realistic validation and provides practical guidance for real-life applications.
arXiv Detail & Related papers (2024-07-16T12:52:29Z) - Anomaly Detection of Tabular Data Using LLMs [54.470648484612866]
We show that pre-trained large language models (LLMs) are zero-shot batch-level anomaly detectors.
We propose an end-to-end fine-tuning strategy to bring out the potential of LLMs in detecting real anomalies.
arXiv Detail & Related papers (2024-06-24T04:17:03Z) - ARC: A Generalist Graph Anomaly Detector with In-Context Learning [62.202323209244]
ARC is a generalist GAD approach that enables a "one-for-all" GAD model to detect anomalies across various graph datasets on-the-fly.
Equipped with in-context learning, ARC can directly extract dataset-specific patterns from the target dataset.
Extensive experiments on multiple benchmark datasets from various domains demonstrate the superior anomaly detection performance, efficiency, and generalizability of ARC.
arXiv Detail & Related papers (2024-05-27T02:42:33Z) - Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models [25.022166664832596]
We propose a simple and effective approach to detect data contamination in large language models (LLMs) and estimate the amount of it.
We frame data contamination detection as a series of multiple-choice questions and devise a quiz format wherein three perturbed versions of each subsampled instance from a specific dataset partition are created.
Our findings suggest that the Data Contamination Quiz (DCQ) achieves state-of-the-art results and uncovers greater contamination/memorization levels compared to existing methods.
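The quiz format just described, with each instance shown alongside three perturbed versions, can be sketched as a simple loop: if the model picks out the original far above the 25% chance rate, that suggests contamination. The `perturb` and `ask_model` callables below are hypothetical placeholders, not the paper's implementation.

```python
import random

def contamination_quiz(instances, perturb, ask_model, rng=random.Random(0)):
    """Estimate contamination as the fraction of quizzes where the model
    identifies the original instance among three perturbed alternatives.

    perturb(inst, i) -> a perturbed copy of inst (hypothetical helper)
    ask_model(options) -> index of the option the model claims to have seen
    """
    hits = 0
    for inst in instances:
        options = [inst] + [perturb(inst, i) for i in range(3)]
        rng.shuffle(options)                  # hide the original's position
        choice = ask_model(options)
        hits += options[choice] == inst
    return hits / len(instances)              # well above 0.25 => suspicious
```

A model that reliably recognizes the original scores 1.0; an uncontaminated model should hover near the 0.25 chance level.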
arXiv Detail & Related papers (2023-11-10T18:48:58Z) - Active anomaly detection based on deep one-class classification [9.904380236739398]
We tackle two essential problems of active learning for Deep SVDD: query strategy and semi-supervised learning method.
First, rather than solely identifying anomalies, our query strategy selects uncertain samples according to an adaptive boundary.
Second, we apply noise contrastive estimation in training a one-class classification model to incorporate both labeled normal and abnormal data effectively.
arXiv Detail & Related papers (2023-09-18T03:56:45Z) - LafitE: Latent Diffusion Model with Feature Editing for Unsupervised
Multi-class Anomaly Detection [12.596635603629725]
We develop a unified model to detect anomalies from objects belonging to multiple classes when only normal data is accessible.
We first explore the generative-based approach and investigate latent diffusion models for reconstruction.
We introduce a feature editing strategy that modifies the input feature space of the diffusion model to further alleviate "identity shortcuts".
arXiv Detail & Related papers (2023-07-16T14:41:22Z) - On the Universal Adversarial Perturbations for Efficient Data-free
Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to universal adversarial perturbations (UAPs).
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z) - SPADE: Semi-supervised Anomaly Detection under Distribution Mismatch [58.04518381476167]
SPADE shows state-of-the-art semi-supervised anomaly detection performance across a wide range of scenarios with distribution mismatch.
In some common real-world settings, such as when the model faces new types of unlabeled anomalies, SPADE outperforms state-of-the-art alternatives by 5% AUC on average.
arXiv Detail & Related papers (2022-11-30T23:39:11Z) - Unsupervised Model Selection for Time-series Anomaly Detection [7.8027110514393785]
We identify three classes of surrogate (unsupervised) metrics, namely, prediction error, model centrality, and performance on injected synthetic anomalies.
We formulate metric combination with multiple imperfect surrogate metrics as a robust rank aggregation problem.
Large-scale experiments on multiple real-world datasets demonstrate that our proposed unsupervised approach is as effective as selecting the most accurate model.
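The combination step in this entry, merging several imperfect surrogate metrics into one model ranking, can be illustrated with a simple Borda count over the per-metric rankings. The paper formulates a robust rank aggregation problem, so treat this as a minimal stand-in rather than the authors' method.

```python
import numpy as np

def borda_aggregate(metric_scores):
    """Combine surrogate metrics by Borda count.

    metric_scores: dict mapping metric name -> sequence of scores for the
    same candidate models (higher = better under that metric).
    Returns model indices sorted best-first by total Borda points.
    """
    n_models = len(next(iter(metric_scores.values())))
    points = np.zeros(n_models)
    for scores in metric_scores.values():
        # a model earns one point per competitor it beats under this metric
        order = np.argsort(np.asarray(scores, dtype=float), kind="stable")
        ranks = np.empty(n_models)
        ranks[order] = np.arange(n_models)
        points += ranks
    return np.argsort(-points, kind="stable")
```

A model that ranks highly under most surrogate metrics ends up first even if a single noisy metric disagrees, which is the intuition behind aggregating imperfect rankings.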
arXiv Detail & Related papers (2022-10-03T16:49:30Z) - Self-Supervised Guided Segmentation Framework for Unsupervised Anomaly
Detection [24.26958675342856]
Unsupervised anomaly detection is a challenging task in industrial applications.
The distribution gap between forged and real anomaly samples makes it difficult for models trained based on forged samples to effectively locate real anomalies.
The Self-Supervised Guided Segmentation Framework (SGSF) is proposed to generate forged anomalous samples and to use normal-sample features as guidance information for segmentation-based anomaly detection.
arXiv Detail & Related papers (2022-09-26T06:14:56Z) - Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.