Related papers: ISSTAD: Incremental Self-Supervised Learning Based on Transformer for Anomaly Detection and Localization

ISSTAD: Incremental Self-Supervised Learning Based on Transformer for Anomaly Detection and Localization

URL: http://arxiv.org/abs/2303.17354v4
Date: Fri, 28 Apr 2023 22:10:42 GMT
Title: ISSTAD: Incremental Self-Supervised Learning Based on Transformer for Anomaly Detection and Localization
Authors: Wenping Jin, Fei Guo, Li Zhu
Abstract summary: We introduce a novel approach based on the Transformer backbone network. We train a Masked Autoencoder (MAE) model solely on normal images. In the subsequent stage, we apply pixel-level data augmentation techniques to generate corrupted normal images. This process allows the model to learn how to repair corrupted regions and classify the status of each pixel.
Score: 12.975540251326683
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In the realm of machine learning, the study of anomaly detection and localization within image data has gained substantial traction, particularly for practical applications such as industrial defect detection. While the majority of existing methods predominantly use Convolutional Neural Networks (CNN) as their primary network architecture, we introduce a novel approach based on the Transformer backbone network. Our method employs a two-stage incremental learning strategy. During the first stage, we train a Masked Autoencoder (MAE) model solely on normal images. In the subsequent stage, we apply pixel-level data augmentation techniques to generate corrupted normal images and their corresponding pixel labels. This process allows the model to learn how to repair corrupted regions and classify the status of each pixel. Ultimately, the model generates a pixel reconstruction error matrix and a pixel anomaly probability matrix. These matrices are then combined to produce an anomaly scoring matrix that effectively detects abnormal regions. When benchmarked against several state-of-the-art CNN-based methods, our approach exhibits superior performance on the MVTec AD dataset, achieving an impressive 97.6% AUC.

Related papers

Masked Autoencoder Self Pre-Training for Defect Detection in Microelectronics [0.7456526005219319]
We propose a vision transformer (ViT) pre-training framework for defect detection in microelectronics based on masked autoencoders (MAE) We perform pre-training and defect detection using a dataset of less than 10.000 scanning acoustic microscopy (SAM) images labelled using transient thermal analysis (TTA) Our approach leads to substantial performance gains compared to a) supervised ViT, b) ViT pre-trained on natural image datasets, and c) state-of-the-art CNN-based defect detection models used in the literature.
arXiv Detail & Related papers (2025-04-14T09:25:50Z)
LADMIM: Logical Anomaly Detection with Masked Image Modeling in Discrete Latent Space [0.0]
Masked image modeling is a self-supervised learning technique that predicts the feature representation of masked regions in an image. We propose a novel approach that leverages the characteristics of MIM to detect logical anomalies effectively. We evaluate the proposed method on the MVTecLOCO dataset, achieving an average AUC of 0.867.
arXiv Detail & Related papers (2024-10-14T07:50:56Z)
DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection [55.48770333927732]
We propose a Difusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection. It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor. Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2023-12-11T18:38:28Z)
Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components. CNNs are used to augment the local texture information of coarse priors. DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
A Prototype-Based Neural Network for Image Anomaly Detection and Localization [10.830337829732915]
This paper proposes ProtoAD, a prototype-based neural network for image anomaly detection and localization. First, the patch features of normal images are extracted by a deep network pre-trained on nature images. ProtoAD achieves competitive performance compared to the state-of-the-art methods with a higher inference speed.
arXiv Detail & Related papers (2023-10-04T04:27:16Z)
Pixel-Inconsistency Modeling for Image Manipulation Localization [59.968362815126326]
Digital image forensics plays a crucial role in image authentication and manipulation localization. This paper presents a generalized and robust manipulation localization model through the analysis of pixel inconsistency artifacts. Experiments show that our method successfully extracts inherent pixel-inconsistency forgery fingerprints.
arXiv Detail & Related papers (2023-09-30T02:54:51Z)
FRE: A Fast Method For Anomaly Detection And Segmentation [5.0468312081378475]
This paper presents a principled approach for solving the visual anomaly detection and segmentation problem. We propose the application of linear statistical dimensionality reduction techniques on the intermediate features produced by a pretrained DNN on the training data. We show that the emphfeature reconstruction error (FRE), which is the $ell$-norm of the difference between the original feature in the high-dimensional space and the pre-image of its low-dimensional reduced embedding, is extremely effective for anomaly detection.
arXiv Detail & Related papers (2022-11-23T01:03:20Z)
Self-Supervised Masked Convolutional Transformer Block for Anomaly Detection [122.4894940892536]
We present a novel self-supervised masked convolutional transformer block (SSMCTB) that comprises the reconstruction-based functionality at a core architectural level. In this work, we extend our previous self-supervised predictive convolutional attentive block (SSPCAB) with a 3D masked convolutional layer, a transformer for channel-wise attention, as well as a novel self-supervised objective based on Huber loss.
arXiv Detail & Related papers (2022-09-25T04:56:10Z)
Unsupervised Industrial Anomaly Detection via Pattern Generative and Contrastive Networks [6.393288885927437]
We propose Vision Transformer based (VIT) unsupervised anomaly detection network. It utilizes a hierarchical task learning and human experience to enhance its interpretability. Our method achieves 99.8% AUC, which surpasses previous state-of-the-art methods.
arXiv Detail & Related papers (2022-07-20T10:09:53Z)
One-Stage Deep Edge Detection Based on Dense-Scale Feature Fusion and Pixel-Level Imbalance Learning [5.370848116287344]
We propose a one-stage neural network model that can generate high-quality edge images without postprocessing. The proposed model adopts a classic encoder-decoder framework in which a pre-trained neural model is used as the encoder. We propose a new loss function that addresses the pixel-level imbalance in the edge image.
arXiv Detail & Related papers (2022-03-17T15:26:00Z)
A Hierarchical Transformation-Discriminating Generative Model for Few Shot Anomaly Detection [93.38607559281601]
We devise a hierarchical generative model that captures the multi-scale patch distribution of each training image. The anomaly score is obtained by aggregating the patch-based votes of the correct transformation across scales and image regions.
arXiv Detail & Related papers (2021-04-29T17:49:48Z)
CutPaste: Self-Supervised Learning for Anomaly Detection and Localization [59.719925639875036]
We propose a framework for building anomaly detectors using normal training data only. We first learn self-supervised deep representations and then build a generative one-class classifier on learned representations. Our empirical study on MVTec anomaly detection dataset demonstrates the proposed algorithm is general to be able to detect various types of real-world defects.
arXiv Detail & Related papers (2021-04-08T19:04:55Z)

This list is automatically generated from the titles and abstracts of the papers in this site.