DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection
- URL: http://arxiv.org/abs/2312.06607v1
- Date: Mon, 11 Dec 2023 18:38:28 GMT
- Title: DiAD: A Diffusion-based Framework for Multi-class Anomaly Detection
- Authors: Haoyang He, Jiangning Zhang, Hongxu Chen, Xuhai Chen, Zhishan Li, Xu
Chen, Yabiao Wang, Chengjie Wang, Lei Xie
- Abstract summary: We propose a Diffusion-based Anomaly Detection (DiAD) framework for multi-class anomaly detection.
It consists of a pixel-space autoencoder, a latent-space Semantic-Guided (SG) network with a connection to the stable diffusion's denoising network, and a feature-space pre-trained feature extractor.
Experiments on MVTec-AD and VisA datasets demonstrate the effectiveness of our approach.
- Score: 55.48770333927732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reconstruction-based approaches have achieved remarkable outcomes in anomaly
detection. The exceptional image reconstruction capabilities of recently
popular diffusion models have sparked research efforts to utilize them for
enhanced reconstruction of anomalous images. Nonetheless, these methods might
face challenges related to the preservation of image categories and pixel-wise
structural integrity in the more practical multi-class setting. To solve the
above problems, we propose a Diffusion-based Anomaly Detection (DiAD) framework
for multi-class anomaly detection, which consists of a pixel-space autoencoder,
a latent-space Semantic-Guided (SG) network with a connection to the stable
diffusion's denoising network, and a feature-space pre-trained feature
extractor. Firstly, the SG network is proposed for reconstructing anomalous
regions while preserving the original image's semantic information. Secondly,
we introduce a Spatial-aware Feature Fusion (SFF) block to maximize
reconstruction accuracy when dealing with extensively reconstructed areas.
Thirdly, the input and reconstructed images are processed by a pre-trained
feature extractor to generate anomaly maps based on features extracted at
different scales. Experiments on MVTec-AD and VisA datasets demonstrate the
effectiveness of our approach, which surpasses state-of-the-art methods,
e.g., achieving 96.8/52.6 and 97.2/99.0 (AUROC/AP) for localization and
detection, respectively, on the multi-class MVTec-AD dataset. Code will be available
at https://lewandofskee.github.io/projects/diad.
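
The third step of the abstract (comparing features of the input and reconstructed images at several scales to obtain an anomaly map) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the distance function or how the scales are combined, so the sketch assumes cosine distance per location, nearest-neighbour upsampling, and a plain sum across scales. The function names `cosine_distance_map`, `upsample_nearest`, and `anomaly_map` are hypothetical, and feature maps are represented as nested lists shaped `[C][H][W]` in place of real extractor outputs.

```python
import math


def cosine_distance_map(feat_a, feat_b):
    """Per-location cosine distance between two feature maps shaped [C][H][W]."""
    channels = len(feat_a)
    height, width = len(feat_a[0]), len(feat_a[0][0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            va = [feat_a[c][y][x] for c in range(channels)]
            vb = [feat_b[c][y][x] for c in range(channels)]
            dot = sum(p * q for p, q in zip(va, vb))
            norm = math.sqrt(sum(p * p for p in va)) * math.sqrt(sum(q * q for q in vb))
            # Assumption: a zero-norm feature vector is treated as "no difference".
            out[y][x] = 1.0 - dot / norm if norm > 0 else 0.0
    return out


def upsample_nearest(grid, size):
    """Nearest-neighbour resize of a 2D grid to size x size."""
    h, w = len(grid), len(grid[0])
    return [[grid[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def anomaly_map(feats_input, feats_recon, size):
    """Sum the upsampled per-scale distance maps (assumed fusion rule)."""
    total = [[0.0] * size for _ in range(size)]
    for fa, fb in zip(feats_input, feats_recon):
        up = upsample_nearest(cosine_distance_map(fa, fb), size)
        for y in range(size):
            for x in range(size):
                total[y][x] += up[y][x]
    return total


# Toy features at two scales (2x2 and 1x1), two channels each.
input_feats = [
    [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]],  # scale 1
    [[[1.0]], [[0.0]]],                                      # scale 2
]
recon_feats = [
    # The reconstruction differs from the input only at location (0, 0) of scale 1.
    [[[0.0, 1.0], [1.0, 1.0]], [[1.0, 0.0], [0.0, 0.0]]],
    [[[1.0]], [[0.0]]],
]

amap = anomaly_map(input_feats, recon_feats, size=4)
for row in amap:
    print(row)
```

In the toy data, only the top-left 2x2 block of the 4x4 map is non-zero, because the input and reconstructed features disagree at a single coarse location. In practice the feature maps would come from a pre-trained network, and an image-level score could be derived from the map (for example its maximum), which is again an assumption rather than something stated in the abstract.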
Related papers
- LADMIM: Logical Anomaly Detection with Masked Image Modeling in Discrete Latent Space [0.0]
Masked image modeling is a self-supervised learning technique that predicts the feature representation of masked regions in an image.
We propose a novel approach that leverages the characteristics of MIM to detect logical anomalies effectively.
We evaluate the proposed method on the MVTecLOCO dataset, achieving an average AUC of 0.867.
arXiv Detail & Related papers (2024-10-14T07:50:56Z)
- Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities [88.398085358514]
Contrastive Deepfake Embeddings (CoDE) is a novel embedding space specifically designed for deepfake detection.
CoDE is trained via contrastive learning by additionally enforcing global-local similarities.
arXiv Detail & Related papers (2024-07-29T18:00:10Z)
- DA-HFNet: Progressive Fine-Grained Forgery Image Detection and Localization Based on Dual Attention [12.36906630199689]
We construct a DA-HFNet forged image dataset guided by text or image-assisted GAN and Diffusion model.
Our goal is to utilize a hierarchical progressive network to capture forged artifacts at different scales for detection and localization.
arXiv Detail & Related papers (2024-06-03T16:13:33Z)
- Multi-feature Reconstruction Network using Crossed-mask Restoration for Unsupervised Industrial Anomaly Detection [4.742650815342744]
Unsupervised anomaly detection is of great significance for quality inspection in industrial manufacturing.
We propose a multi-feature reconstruction network, MFRNet, using crossed-mask restoration in this paper.
Our method is highly competitive with or significantly outperforms other state-of-the-art methods on four publicly available datasets and one self-made dataset.
arXiv Detail & Related papers (2024-04-20T05:13:56Z)
- TransFusion -- A Transparency-Based Diffusion Model for Anomaly Detection [2.7855886538423182]
We propose a novel discriminative anomaly detection method that achieves state-of-the-art performance on two datasets.
TransFusion achieves state-of-the-art performance on both the VisA and the MVTec AD datasets, with an image-level AUROC of 98.5% and 99.2%, respectively.
arXiv Detail & Related papers (2023-11-16T16:23:11Z)
- Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components.
CNNs are used to augment the local texture information of coarse priors.
DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- ReContrast: Domain-Specific Anomaly Detection via Contrastive Reconstruction [29.370142078092375]
Most advanced unsupervised anomaly detection (UAD) methods rely on modeling feature representations of frozen encoder networks pre-trained on large-scale datasets.
We propose a novel epistemic UAD method, namely ReContrast, which optimizes the entire network to reduce biases towards the pre-trained image domain.
We conduct experiments across two popular industrial defect detection benchmarks and three medical image UAD tasks, which show our method's superiority over current state-of-the-art methods.
arXiv Detail & Related papers (2023-06-05T05:21:15Z)
- PC-GANs: Progressive Compensation Generative Adversarial Networks for Pan-sharpening [50.943080184828524]
We propose a novel two-step model for pan-sharpening that sharpens the MS image through the progressive compensation of the spatial and spectral information.
The whole model is composed of triple GANs, and based on the specific architecture, a joint compensation loss function is designed to enable the triple GANs to be trained simultaneously.
arXiv Detail & Related papers (2022-07-29T03:09:21Z)
- Self-Supervised Predictive Convolutional Attentive Block for Anomaly Detection [97.93062818228015]
We propose to integrate the reconstruction-based functionality into a novel self-supervised predictive architectural building block.
Our block is equipped with a loss that minimizes the reconstruction error with respect to the masked area in the receptive field.
We demonstrate the generality of our block by integrating it into several state-of-the-art frameworks for anomaly detection on image and video.
arXiv Detail & Related papers (2021-11-17T13:30:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.