Random Word Data Augmentation with CLIP for Zero-Shot Anomaly Detection
- URL: http://arxiv.org/abs/2308.11119v2
- Date: Fri, 22 Sep 2023 18:08:38 GMT
- Title: Random Word Data Augmentation with CLIP for Zero-Shot Anomaly Detection
- Authors: Masato Tamura
- Abstract summary: This paper presents a novel method that leverages a visual-language model, CLIP, as a data source for zero-shot anomaly detection.
Using the generated embeddings as training data, a feed-forward neural network learns to extract features of normal and anomaly from CLIP's embeddings.
Experimental results demonstrate that our method achieves state-of-the-art performance without laborious prompt ensembling in zero-shot setups.
- Score: 3.75292409381511
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a novel method that leverages a visual-language model,
CLIP, as a data source for zero-shot anomaly detection. Tremendous efforts have
been put towards developing anomaly detectors due to their potential industrial
applications. Considering the difficulty in acquiring various anomalous samples
for training, most existing methods train models with only normal samples and
measure discrepancies from the distribution of normal samples during inference,
which requires training a model for each object category. The problem of this
inefficient training requirement has been tackled by designing a CLIP-based
anomaly detector that applies prompt-guided classification to each part of an
image in a sliding window manner. However, the method still suffers from the
labor of careful prompt ensembling with known object categories. To overcome
the issues above, we propose leveraging CLIP as a data source for training. Our
method generates text embeddings with the text encoder in CLIP with typical
prompts that include words of normal and anomaly. In addition to these words,
we insert several randomly generated words into prompts, which enables the
encoder to generate a diverse set of normal and anomalous samples. Using the
generated embeddings as training data, a feed-forward neural network learns to
extract features of normal and anomaly from CLIP's embeddings, and as a result,
a category-agnostic anomaly detector can be obtained without any training
images. Experimental results demonstrate that our method achieves
state-of-the-art performance without laborious prompt ensembling in zero-shot
setups.
Related papers
- Fine-grained Abnormality Prompt Learning for Zero-shot Anomaly Detection [88.34095233600719]
FAPrompt is a novel framework designed to learn Fine-grained Abnormality Prompts for more accurate ZSAD.
It substantially outperforms state-of-the-art methods by at least 3%-5% AUC/AP in both image- and pixel-level ZSAD tasks.
arXiv Detail & Related papers (2024-10-14T08:41:31Z) - Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts [25.629973843455495]
Generalist Anomaly Detection (GAD) aims to train one single detection model that can generalize to detect anomalies in diverse datasets from different application domains without further training on the target data.
We introduce a novel approach that learns an in-context residual learning model for GAD, termed InCTRL.
InCTRL is the best performer and significantly outperforms state-of-the-art competing methods.
arXiv Detail & Related papers (2024-03-11T08:07:46Z) - Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic Video-temporal PAs by inpainting a masked out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the OCC setting.
Our method performs on par with other existing state-of-the-art PAs generation and reconstruction based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z) - Anomaly Detection in Automated Fibre Placement: Learning with Data
Limitations [3.103778949672542]
We present a comprehensive framework for defect detection and localization in Automated Fibre Placement.
Our approach combines unsupervised deep learning and classical computer vision algorithms.
It efficiently detects various surface issues while requiring fewer images of composite parts for training.
arXiv Detail & Related papers (2023-07-15T22:13:36Z) - Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z) - Explainable Deep Few-shot Anomaly Detection with Deviation Networks [123.46611927225963]
We introduce a novel weakly-supervised anomaly detection framework to train detection models.
The proposed approach learns discriminative normality by leveraging the labeled anomalies and a prior probability.
Our model is substantially more sample-efficient and robust, and performs significantly better than state-of-the-art competing methods in both closed-set and open-set settings.
arXiv Detail & Related papers (2021-08-01T14:33:17Z) - Understanding Classifier Mistakes with Generative Models [88.20470690631372]
Deep neural networks are effective on supervised learning tasks, but have been shown to be brittle.
In this paper, we leverage generative models to identify and characterize instances where classifiers fail to generalize.
Our approach is agnostic to class labels from the training set which makes it applicable to models trained in a semi-supervised way.
arXiv Detail & Related papers (2020-10-05T22:13:21Z) - A Background-Agnostic Framework with Adversarial Training for Abnormal
Event Detection in Video [120.18562044084678]
Abnormal event detection in video is a complex computer vision problem that has attracted significant attention in recent years.
We propose a background-agnostic framework that learns from training videos containing only normal events.
arXiv Detail & Related papers (2020-08-27T18:39:24Z) - G2D: Generate to Detect Anomaly [10.977404378308817]
We learn two deep neural networks (generator and discriminator) in a GAN-style setting on merely the normal samples.
In the training phase, when the generator fails to produce normal data, it can be considered as an irregularity generator.
We train a binary classifier on the generated anomalous samples along with the normal instances in order to be capable of detecting irregularities.
arXiv Detail & Related papers (2020-06-20T18:02:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.