Inpainting Transformer for Anomaly Detection
- URL: http://arxiv.org/abs/2104.13897v1
- Date: Wed, 28 Apr 2021 17:27:44 GMT
- Title: Inpainting Transformer for Anomaly Detection
- Authors: Jonathan Pirnay, Keng Chai
- Abstract summary: Inpainting Transformer (InTra) is trained to inpaint covered patches in a large sequence of image patches.
InTra achieves better than state-of-the-art results on the MVTec AD dataset for detection and localization.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly detection in computer vision is the task of identifying images which
deviate from a set of normal images. A common approach is to train deep
convolutional autoencoders to inpaint covered parts of an image and compare the
output with the original image. By training on anomaly-free samples only, the
model is assumed to be unable to reconstruct anomalous regions properly.
For anomaly detection by inpainting, we argue that it is beneficial to
incorporate information from potentially distant regions. In particular, we pose
anomaly detection as a patch-inpainting problem and propose to solve it with a
purely self-attention based approach discarding convolutions. The proposed
Inpainting Transformer (InTra) is trained to inpaint covered patches in a large
sequence of image patches, thereby integrating information across large regions
of the input image. When learning from scratch, InTra achieves better than
state-of-the-art results on the MVTec AD [1] dataset for detection and
localization.
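The patch-inpainting scoring scheme the abstract describes (cover a patch, reconstruct it from the surrounding patches, and treat the reconstruction error as the anomaly score) can be sketched as follows. This is a minimal illustration, not InTra itself: the `inpaint_patch` stand-in simply averages the uncovered patches, where the actual model would run a self-attention network over the patch sequence.

```python
import numpy as np

def to_patches(img, k):
    """Split a (H, W) image into a (rows, cols, k, k) grid of
    non-overlapping k x k patches."""
    h, w = img.shape
    return img.reshape(h // k, k, w // k, k).swapaxes(1, 2)

def inpaint_patch(patches, r, c):
    """Stand-in inpainter: predict the covered patch (r, c) as the mean
    of all other patches. InTra would instead attend over the patch
    sequence with a transformer trained on anomaly-free images."""
    rows, cols = patches.shape[:2]
    mask = np.ones((rows, cols), dtype=bool)
    mask[r, c] = False
    return patches[mask].mean(axis=0)

def anomaly_map(img, k=4):
    """Cover each patch in turn, inpaint it from the rest, and use the
    per-patch reconstruction error as an anomaly map."""
    patches = to_patches(img.astype(float), k)
    rows, cols = patches.shape[:2]
    scores = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            recon = inpaint_patch(patches, r, c)
            scores[r, c] = np.mean((patches[r, c] - recon) ** 2)
    return scores
```

On an anomaly-free image the reconstruction errors stay low everywhere; a patch that deviates from the rest of the image cannot be predicted from its context and stands out in the map, which is the intuition behind scoring by inpainting error.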
Related papers
- LADMIM: Logical Anomaly Detection with Masked Image Modeling in Discrete Latent Space [0.0]
Masked image modeling is a self-supervised learning technique that predicts the feature representation of masked regions in an image.
We propose a novel approach that leverages the characteristics of MIM to detect logical anomalies effectively.
We evaluate the proposed method on the MVTecLOCO dataset, achieving an average AUC of 0.867.
arXiv Detail & Related papers (2024-10-14T07:50:56Z)
- GeneralAD: Anomaly Detection Across Domains by Attending to Distorted Features [68.14842693208465]
GeneralAD is an anomaly detection framework designed to operate in semantic, near-distribution, and industrial settings.
We propose a novel self-supervised anomaly generation module that employs straightforward operations like noise addition and shuffling to patch features.
We extensively evaluated our approach on ten datasets, achieving state-of-the-art results in six and on-par performance in the remaining.
arXiv Detail & Related papers (2024-07-17T09:27:41Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- A Prototype-Based Neural Network for Image Anomaly Detection and Localization [10.830337829732915]
This paper proposes ProtoAD, a prototype-based neural network for image anomaly detection and localization.
First, the patch features of normal images are extracted by a deep network pre-trained on natural images.
ProtoAD achieves competitive performance compared to the state-of-the-art methods with a higher inference speed.
arXiv Detail & Related papers (2023-10-04T04:27:16Z)
- Masked Transformer for image Anomaly Localization [14.455765147827345]
We propose a new model for image anomaly detection based on Vision Transformer architecture with patch masking.
We show that multi-resolution patches and their collective embeddings provide a large improvement in the model's performance.
The proposed model has been tested on popular anomaly detection datasets such as MVTec and head CT.
arXiv Detail & Related papers (2022-10-27T15:30:48Z)
- AnoViT: Unsupervised Anomaly Detection and Localization with Vision Transformer-based Encoder-Decoder [3.31490164885582]
We propose a vision transformer-based encoder-decoder model, named AnoViT, to reflect normal information by additionally learning the global relationship between image patches.
The proposed model performed better than the convolution-based model on three benchmark datasets.
arXiv Detail & Related papers (2022-03-21T09:01:37Z)
- Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection [97.93062818228015]
We propose to integrate the reconstruction-based functionality into a novel self-supervised predictive architectural building block.
Our block is equipped with a loss that minimizes the reconstruction error with respect to the masked area in the receptive field.
We demonstrate the generality of our block by integrating it into several state-of-the-art frameworks for anomaly detection on image and video.
arXiv Detail & Related papers (2021-11-17T13:30:31Z)
- LocalTrans: A Multiscale Local Transformer Network for Cross-Resolution Homography Estimation [52.63874513999119]
Cross-resolution image alignment is a key problem in multiscale giga photography.
Existing deep homography methods neglect the explicit formulation of correspondences between the input images, which degrades accuracy in cross-resolution settings.
We propose a local transformer network embedded within a multiscale structure to explicitly learn correspondences between the multimodal inputs.
arXiv Detail & Related papers (2021-06-08T02:51:45Z)
- A Hierarchical Transformation-Discriminating Generative Model for Few Shot Anomaly Detection [93.38607559281601]
We devise a hierarchical generative model that captures the multi-scale patch distribution of each training image.
The anomaly score is obtained by aggregating the patch-based votes of the correct transformation across scales and image regions.
arXiv Detail & Related papers (2021-04-29T17:49:48Z)
- CutPaste: Self-Supervised Learning for Anomaly Detection and Localization [59.719925639875036]
We propose a framework for building anomaly detectors using normal training data only.
We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations.
Our empirical study on the MVTec anomaly detection dataset demonstrates that the proposed algorithm is general enough to detect various types of real-world defects.
arXiv Detail & Related papers (2021-04-08T19:04:55Z)
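The cut-and-paste augmentation at the heart of CutPaste can be sketched as follows. This is a minimal grayscale version under assumed parameters; the function name and patch dimensions are illustrative, not taken from the paper, which also explores scar-shaped variants and color jitter on the cut patch.

```python
import numpy as np

def cutpaste(img, rng, patch_h=4, patch_w=8):
    """Cut a random rectangle from the image and paste it at another
    random location, producing a synthetic 'anomalous' sample for
    self-supervised training."""
    h, w = img.shape
    out = img.copy()
    # source rectangle, sampled uniformly inside the image
    sy = rng.integers(0, h - patch_h + 1)
    sx = rng.integers(0, w - patch_w + 1)
    patch = img[sy:sy + patch_h, sx:sx + patch_w]
    # destination rectangle, also sampled uniformly
    dy = rng.integers(0, h - patch_h + 1)
    dx = rng.integers(0, w - patch_w + 1)
    out[dy:dy + patch_h, dx:dx + patch_w] = patch
    return out
```

A classifier is then trained to distinguish original images from their cut-paste versions; its learned representations feed the one-class classifier mentioned above.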
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.