Industrial Anomaly Detection and Localization Using Weakly-Supervised Residual Transformers
- URL: http://arxiv.org/abs/2306.03492v5
- Date: Thu, 11 Jul 2024 14:17:34 GMT
- Title: Industrial Anomaly Detection and Localization Using Weakly-Supervised Residual Transformers
- Authors: Hanxi Li, Jingqi Wu, Lin Yuanbo Wu, Hao Chen, Deyin Liu, Mingwen Wang, Peng Wang,
- Abstract summary: "Weakly-supervised RESidual Transformer" aims to achieve high AD accuracy while minimizing the need for extensive annotations.
We design a residual-based transformer model, termed "Positional Fast Anomaly Residuals" (PosFAR)
On the benchmark dataset MVTec-AD, our proposed WeakREST framework achieves a remarkable Average Precision (AP) of 83.0%.
- Score: 7.487975220416574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in industrial Anomaly Detection (AD) have shown that incorporating a few anomalous samples during training can significantly boost accuracy. However, this performance improvement comes at a high cost: extensive annotation efforts, which are often impractical in real-world applications. In this work, we propose a novel framework called "Weakly-supervised RESidual Transformer" (WeakREST), which aims to achieve high AD accuracy while minimizing the need for extensive annotations. First, we reformulate the pixel-wise anomaly localization task into a block-wise classification problem. By shifting the focus to block-wise level, we can drastically reduce the amount of required annotations without compromising on the accuracy of anomaly detection Secondly, we design a residual-based transformer model, termed "Positional Fast Anomaly Residuals" (PosFAR), to classify the image blocks in real time. We further propose to label the anomalous regions using only bounding boxes or image tags as weaker labels, leading to a semi-supervised learning setting. On the benchmark dataset MVTec-AD, our proposed WeakREST framework achieves a remarkable Average Precision (AP) of 83.0%, significantly outperforming the previous best result of 75.8% in the unsupervised setting. In the supervised AD setting, WeakREST further improves performance, attaining an AP of 87.6% compared to the previous best of 78.6%. Notably, even when utilizing weaker labels based on bounding boxes, WeakREST surpasses recent leading methods that rely on pixel-wise supervision, achieving an AP of 87.1% against the prior best of 78.6% on MVTec-AD. This precision advantage is also consistently observed on other well-known AD datasets, such as BTAD and KSDD2.
Related papers
- Co-Fix3D: Enhancing 3D Object Detection with Collaborative Refinement [33.773644087620745]
Co-Fix3D employs a collaborative hybrid multi-stage parallel query generation mechanism for BEV representations.
Our method incorporates the Local-Global Feature Enhancement (LGE) module, which refines BEV features to more effectively highlight weak positive samples.
Co-Fix3D achieves superior results on the stringent nuScenes benchmark.
arXiv Detail & Related papers (2024-08-15T07:56:02Z) - Enhancing Infrared Small Target Detection Robustness with Bi-Level
Adversarial Framework [61.34862133870934]
We propose a bi-level adversarial framework to promote the robustness of detection in the presence of distinct corruptions.
Our scheme remarkably improves 21.96% IOU across a wide array of corruptions and notably promotes 4.97% IOU on the general benchmark.
arXiv Detail & Related papers (2023-09-03T06:35:07Z) - REB: Reducing Biases in Representation for Industrial Anomaly Detection [16.550844182346314]
We propose Reducing Biases (REB) in representation by considering the domain bias and building a self-supervised learning task for better domain adaption.
We also propose a local-density KNN (LDKNN) to reduce the local density bias in the feature space and obtain effective anomaly detection.
The proposed REB method achieves a promising result of 99.5% Im.AUROC on the widely used MVTec AD, with smaller backbone networks such as Vgg11 and Resnet18.
arXiv Detail & Related papers (2023-08-24T05:32:29Z) - Small Object Detection via Coarse-to-fine Proposal Generation and
Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z) - Align-DETR: Improving DETR with Simple IoU-aware BCE loss [32.13866392998818]
We propose a metric, recall of best-regressed samples, to quantitively evaluate the misalignment problem.
The proposed loss, IA-BCE, guides the training of DETR to build a strong correlation between classification score and localization precision.
To overcome the dramatic decrease in sample quality induced by the sparsity of queries, we introduce a prime sample weighting mechanism.
arXiv Detail & Related papers (2023-04-15T10:24:51Z) - Diffusion Denoising Process for Perceptron Bias in Out-of-distribution
Detection [67.49587673594276]
We introduce a new perceptron bias assumption that suggests discriminator models are more sensitive to certain features of the input, leading to the overconfidence problem.
We demonstrate that the diffusion denoising process (DDP) of DMs serves as a novel form of asymmetric, which is well-suited to enhance the input and mitigate the overconfidence problem.
Our experiments on CIFAR10, CIFAR100, and ImageNet show that our method outperforms SOTA approaches.
arXiv Detail & Related papers (2022-11-21T08:45:08Z) - To be Critical: Self-Calibrated Weakly Supervised Learning for Salient
Object Detection [95.21700830273221]
Weakly-supervised salient object detection (WSOD) aims to develop saliency models using image-level annotations.
We propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions.
We prove that even a much smaller dataset with well-matched annotations can facilitate models to achieve better performance as well as generalizability.
arXiv Detail & Related papers (2021-09-04T02:45:22Z) - Towards Reducing Labeling Cost in Deep Object Detection [61.010693873330446]
We propose a unified framework for active learning, that considers both the uncertainty and the robustness of the detector.
Our method is able to pseudo-label the very confident predictions, suppressing a potential distribution drift.
arXiv Detail & Related papers (2021-06-22T16:53:09Z) - SADet: Learning An Efficient and Accurate Pedestrian Detector [68.66857832440897]
This paper proposes a series of systematic optimization strategies for the detection pipeline of one-stage detector.
It forms a single shot anchor-based detector (SADet) for efficient and accurate pedestrian detection.
Though structurally simple, it presents state-of-the-art result and real-time speed of $20$ FPS for VGA-resolution images.
arXiv Detail & Related papers (2020-07-26T12:32:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.