Generalizable Industrial Visual Anomaly Detection with Self-Induction
Vision Transformer
- URL: http://arxiv.org/abs/2211.12311v1
- Date: Tue, 22 Nov 2022 14:56:12 GMT
- Title: Generalizable Industrial Visual Anomaly Detection with Self-Induction
Vision Transformer
- Authors: Haiming Yao, Xue Wang
- Abstract summary: We propose a self-induction vision Transformer (SIVT) for unsupervised generalizable industrial visual anomaly detection and localization.
The proposed SIVT first extracts discriminative features from a pre-trained CNN as property descriptors, then reconstructs the extracted features in a self-supervised fashion.
The results reveal that the proposed method can advance state-of-the-art detection performance with an improvement of 2.8-6.3 in AUROC, and 3.3-7.6 in AP.
- Score: 5.116033262865781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Industrial visual anomaly detection plays a critical role in advanced
intelligent manufacturing, but several limitations remain to be addressed in
this setting. First, existing reconstruction-based methods struggle with the
identity mapping of trivial shortcuts, where the reconstruction error gap
between normal and abnormal samples becomes negligible, leading to inferior
detection capabilities. Second, previous studies have mainly concentrated on
convolutional neural network (CNN) models, which capture the local semantics of
objects but neglect the global context, also resulting in inferior performance.
Third, existing studies follow an individual learning fashion in which each
detection model handles only one product category, while generalizable
detection across multiple categories has not been explored. To tackle these
limitations, we propose a self-induction vision Transformer (SIVT) for
unsupervised generalizable multi-category industrial visual anomaly detection
and localization. The proposed SIVT first extracts discriminative features from
a pre-trained CNN as property descriptors. Then, the self-induction vision
Transformer reconstructs the extracted features in a self-supervised fashion,
where auxiliary induction tokens are additionally introduced to induce the
semantics of the original signal. Finally, abnormal properties are detected
using the semantic feature residual difference. We evaluated SIVT on the
existing MVTec AD benchmark; the results reveal that the proposed method can
advance state-of-the-art detection performance with an improvement of 2.8-6.3
in AUROC and 3.3-7.6 in AP.
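The final detection step, scoring by feature residual, can be illustrated with a small sketch. This is not the authors' implementation: the abstract only says that anomalies are detected from the "semantic feature residual difference", so the per-location L2 norm, the max-pooled image-level score, and the function names `anomaly_map`, `image_score`, and `auroc` are all assumptions made for illustration. The feature arrays stand in for the pre-trained CNN features and their Transformer reconstruction.

```python
import numpy as np


def anomaly_map(features, reconstructed):
    """Per-location anomaly score from the feature residual.

    features, reconstructed: arrays of shape (C, H, W), standing in for the
    extracted CNN features and their reconstruction.
    Returns an (H, W) map holding the L2 norm of the residual over channels
    (the choice of L2 norm is an assumption, not taken from the paper).
    """
    residual = np.asarray(features, dtype=float) - np.asarray(reconstructed, dtype=float)
    return np.linalg.norm(residual, axis=0)


def image_score(amap):
    """Image-level score, taken here as the maximum of the map (an assumption)."""
    return float(np.max(amap))


def auroc(scores, labels):
    """AUROC via pairwise comparison.

    Equals the probability that a randomly chosen anomalous sample scores
    higher than a randomly chosen normal one; ties count as one half.
    labels: 1 (or True) for anomalous, 0 (or False) for normal.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scores, labels = [], []
    for is_anomalous in (False, False, True, True):
        feats = rng.normal(size=(8, 16, 16))
        # A well-reconstructed sample differs from its features only by small noise.
        recon = feats + 0.01 * rng.normal(size=feats.shape)
        if is_anomalous:
            # Simulate a defect region that the model fails to reconstruct.
            recon[:, 4:8, 4:8] += 1.0
        scores.append(image_score(anomaly_map(feats, recon)))
        labels.append(is_anomalous)
    print("image-level AUROC on toy data:", auroc(scores, labels))
```

The same `anomaly_map` output serves both purposes named in the abstract: thresholding it gives a localization mask, and pooling it gives a per-image score that can be ranked with AUROC.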
Related papers
- Prior Normality Prompt Transformer for Multi-class Industrial Image Anomaly Detection [6.865429486202104]
We introduce Prior Normality Prompt Transformer (PNPT) for multi-class anomaly detection.
PNPT strategically incorporates normal semantics prompting to mitigate the "identical mapping" problem.
This entails integrating a prior normality prompt into the reconstruction process, yielding a dual-stream model.
arXiv Detail & Related papers (2024-06-17T13:10:04Z)
- Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
- Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation: A Unified Approach [49.995833831087175]
This work proposes a novel method for generating generic spatio-temporal pseudo-anomalies (PAs) by inpainting a masked-out region of an image.
In addition, we present a simple unified framework to detect real-world anomalies under the one-class classification (OCC) setting.
Our method performs on par with other existing state-of-the-art PA-generation and reconstruction-based methods under the OCC setting.
arXiv Detail & Related papers (2023-11-27T13:14:06Z)
- LafitE: Latent Diffusion Model with Feature Editing for Unsupervised Multi-class Anomaly Detection [12.596635603629725]
We develop a unified model to detect anomalies from objects belonging to multiple classes when only normal data is accessible.
We first explore the generative-based approach and investigate latent diffusion models for reconstruction.
We introduce a feature editing strategy that modifies the input feature space of the diffusion model to further alleviate "identity shortcuts"
arXiv Detail & Related papers (2023-07-16T14:41:22Z)
- Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection [122.4894940892536]
We present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level.
In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss.
arXiv Detail & Related papers (2022-09-25T04:56:10Z)
- ADTR: Anomaly Detection Transformer with Feature Reconstruction [40.68590890351697]
Anomaly detection with only prior knowledge from normal samples attracts more attention.
Existing CNN-based pixel reconstruction approaches suffer from two concerns.
We propose Anomaly Detection TRansformer (ADTR) to apply a transformer to reconstruct pre-trained features.
arXiv Detail & Related papers (2022-09-05T08:01:27Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- Test-time Adaptation with Slot-Centric Models [63.981055778098444]
Slot-TTA is a semi-supervised scene decomposition model that at test time is adapted per scene through gradient descent on reconstruction or cross-view synthesis objectives.
We show substantial out-of-distribution performance improvements against state-of-the-art supervised feed-forward detectors, and alternative test-time adaptation methods.
arXiv Detail & Related papers (2022-03-21T17:59:50Z)
- Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.