ISSTAD: Incremental Self-Supervised Learning Based on Transformer for
Anomaly Detection and Localization
- URL: http://arxiv.org/abs/2303.17354v4
- Date: Fri, 28 Apr 2023 22:10:42 GMT
- Title: ISSTAD: Incremental Self-Supervised Learning Based on Transformer for
Anomaly Detection and Localization
- Authors: Wenping Jin, Fei Guo, Li Zhu
- Abstract summary: We introduce a novel approach based on the Transformer backbone network.
We train a Masked Autoencoder (MAE) model solely on normal images.
In the subsequent stage, we apply pixel-level data augmentation techniques to generate corrupted normal images.
This process allows the model to learn how to repair corrupted regions and classify the status of each pixel.
- Score: 12.975540251326683
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the realm of machine learning, the study of anomaly detection and
localization within image data has gained substantial traction, particularly
for practical applications such as industrial defect detection. While the
majority of existing methods predominantly use Convolutional Neural Networks
(CNN) as their primary network architecture, we introduce a novel approach
based on the Transformer backbone network. Our method employs a two-stage
incremental learning strategy. During the first stage, we train a Masked
Autoencoder (MAE) model solely on normal images. In the subsequent stage, we
apply pixel-level data augmentation techniques to generate corrupted normal
images and their corresponding pixel labels. This process allows the model to
learn how to repair corrupted regions and classify the status of each pixel.
Ultimately, the model generates a pixel reconstruction error matrix and a pixel
anomaly probability matrix. These matrices are then combined to produce an
anomaly scoring matrix that effectively detects abnormal regions. When
benchmarked against several state-of-the-art CNN-based methods, our approach
exhibits superior performance on the MVTec AD dataset, achieving an impressive
97.6% AUC.
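The second stage and the final scoring step described in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the square noise patch used as the "pixel-level augmentation", the squared-error reconstruction metric, and the fusion rule (min-max normalized error multiplied by the anomaly probability) are all assumptions, since the abstract does not specify them. The function names `corrupt_image`, `reconstruction_error`, `min_max_normalize`, and `anomaly_score` are hypothetical.

```python
import random


def corrupt_image(image, patch_size, rng):
    """Overwrite one random square patch with noise (assumed augmentation).

    Returns (corrupted, labels), where labels[r][c] is 1 for every pixel
    that was overwritten and 0 elsewhere.
    """
    height, width = len(image), len(image[0])
    top = rng.randrange(height - patch_size + 1)
    left = rng.randrange(width - patch_size + 1)
    corrupted = [row[:] for row in image]
    labels = [[0] * width for _ in range(height)]
    for r in range(top, top + patch_size):
        for c in range(left, left + patch_size):
            corrupted[r][c] = rng.random()
            labels[r][c] = 1
    return corrupted, labels


def reconstruction_error(image, reconstruction):
    """Per-pixel squared error between the input and its reconstruction."""
    return [[(x - y) ** 2 for x, y in zip(row_i, row_r)]
            for row_i, row_r in zip(image, reconstruction)]


def min_max_normalize(matrix):
    """Rescale a matrix to [0, 1]; a constant matrix maps to all zeros."""
    flat = [v for row in matrix for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


def anomaly_score(error, probability):
    """Assumed fusion rule: normalized error times anomaly probability."""
    norm_error = min_max_normalize(error)
    return [[e * p for e, p in zip(row_e, row_p)]
            for row_e, row_p in zip(norm_error, probability)]


rng = random.Random(0)
normal = [[0.5] * 8 for _ in range(8)]
corrupted, labels = corrupt_image(normal, patch_size=3, rng=rng)

# Stand-ins for the model outputs: a perfect repair of the corrupted image
# and a perfect per-pixel classifier (probability equals the true label).
error = reconstruction_error(corrupted, normal)
probability = [[float(v) for v in row] for row in labels]
scores = anomaly_score(error, probability)
```

With these idealized stand-ins, the score matrix is zero everywhere outside the corrupted patch, which is the behavior the combined scoring matrix is meant to approximate on real defects.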
Related papers
- LADMIM: Logical Anomaly Detection with Masked Image Modeling in Discrete Latent Space [0.0]
Masked image modeling is a self-supervised learning technique that predicts the feature representation of masked regions in an image.
We propose a novel approach that leverages the characteristics of MIM to detect logical anomalies effectively.
We evaluate the proposed method on the MVTecLOCO dataset, achieving an average AUC of 0.867.
arXiv Detail & Related papers (2024-10-14T07:50:56Z)
- DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network connected to Stable Diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- A Prototype-Based Neural Network for Image Anomaly Detection and Localization [10.830337829732915]
This paper proposes ProtoAD, a prototype-based neural network for image anomaly detection and localization.
First, the patch features of normal images are extracted by a deep network pre-trained on natural images.
ProtoAD achieves competitive performance compared to the state-of-the-art methods with a higher inference speed.
arXiv Detail & Related papers (2023-10-04T04:27:16Z)
- Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization.
This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts.
Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
- FRE: A Fast Method For Anomaly Detection And Segmentation [5.0468312081378475]
This paper presents a principled approach for solving the visual anomaly detection and segmentation problem.
We propose the application of linear statistical dimensionality reduction techniques on the intermediate features produced by a pretrained DNN on the training data.
We show that the feature reconstruction error (FRE), which is the $\ell_2$-norm of the difference between the original feature in the high-dimensional space and the pre-image of its low-dimensional reduced embedding, is extremely effective for anomaly detection.
arXiv Detail & Related papers (2022-11-23T01:03:20Z)
- Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection [122.4894940892536]
We present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level.
In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss.
arXiv Detail & Related papers (2022-09-25T04:56:10Z)
- Unsupervised Industrial Anomaly Detection via Pattern Generative and Contrastive Networks [6.393288885927437]
We propose a Vision Transformer (ViT) based unsupervised anomaly detection network.
It utilizes hierarchical task learning and human experience to enhance its interpretability.
Our method achieves 99.8% AUC, which surpasses previous state-of-the-art methods.
arXiv Detail & Related papers (2022-07-20T10:09:53Z)
- One-Stage Deep Edge Detection Based on Dense-Scale Feature Fusion and Pixel-Level Imbalance Learning [5.370848116287344]
We propose a one-stage neural network model that can generate high-quality edge images without postprocessing.
The proposed model adopts a classic encoder-decoder framework in which a pre-trained neural model is used as the encoder.
We propose a new loss function that addresses the pixel-level imbalance in the edge image.
arXiv Detail & Related papers (2022-03-17T15:26:00Z)
- A Hierarchical Transformation-Discriminating Generative Model for Few Shot Anomaly Detection [93.38607559281601]
We devise a hierarchical generative model that captures the multi-scale patch distribution of each training image.
The anomaly score is obtained by aggregating the patch-based votes of the correct transformation across scales and image regions.
arXiv Detail & Related papers (2021-04-29T17:49:48Z)
- CutPaste: Self-Supervised Learning for Anomaly Detection and Localization [59.719925639875036]
We propose a framework for building anomaly detectors using normal training data only.
We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations.
Our empirical study on the MVTec anomaly detection dataset demonstrates that the proposed algorithm is general enough to detect various types of real-world defects.
arXiv Detail & Related papers (2021-04-08T19:04:55Z)