Towards Total Recall in Industrial Anomaly Detection
- URL: http://arxiv.org/abs/2106.08265v1
- Date: Tue, 15 Jun 2021 16:27:02 GMT
- Title: Towards Total Recall in Industrial Anomaly Detection
- Authors: Karsten Roth, Latha Pemula, Joaquin Zepeda, Bernhard Schölkopf,
Thomas Brox, Peter Gehler
- Abstract summary: We propose PatchCore to solve the problem of spotting defective parts in images.
PatchCore offers competitive inference times while achieving state-of-the-art performance for both detection and localization.
On the standard dataset MVTec AD, PatchCore achieves an image-level anomaly detection AUROC score of $99.1\%$, more than halving the error compared to the next best competitor.
- Score: 38.4839780454375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Being able to spot defective parts is a critical component in large-scale
industrial manufacturing. A particular challenge that we address in this work
is the cold-start problem: fit a model using nominal (non-defective) example
images only. While handcrafted solutions per class are possible, the goal is to
build systems that work well simultaneously on many different tasks
automatically. The best performing approaches combine embeddings from ImageNet
models with an outlier detection model. In this paper, we extend this line
of work and propose PatchCore, which uses a maximally representative memory
bank of nominal patch-features. PatchCore offers competitive inference times
while achieving state-of-the-art performance for both detection and
localization. On the standard dataset MVTec AD, PatchCore achieves an
image-level anomaly detection AUROC score of $99.1\%$, more than halving the
error compared to the next best competitor. We further report competitive
results on two additional datasets and also find competitive results in the few
samples regime.
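
The abstract describes a memory bank of nominal patch features that is queried to detect outliers. The following is a minimal sketch of that idea, not the authors' implementation. It assumes patch features are already extracted (random vectors stand in for ImageNet embeddings here), uses greedy farthest-point selection as one plausible way to make the bank "representative", and scores an image by its largest nearest-neighbour distance. The function names `greedy_coreset`, `patch_scores`, and `image_score` are hypothetical.

```python
import numpy as np


def greedy_coreset(features: np.ndarray, n_select: int, seed: int = 0) -> np.ndarray:
    """Return row indices chosen by greedy farthest-point (k-center) selection.

    Assumption: this is one common way to pick a representative subset;
    the abstract does not spell out the selection procedure.
    """
    rng = np.random.default_rng(seed)
    selected = [int(rng.integers(len(features)))]
    # Distance from every feature to its closest already-selected feature.
    min_dist = np.linalg.norm(features - features[selected[0]], axis=1)
    for _ in range(n_select - 1):
        idx = int(np.argmax(min_dist))  # the feature currently covered worst
        selected.append(idx)
        dist_new = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, dist_new)
    return np.array(selected)


def patch_scores(memory_bank: np.ndarray, patches: np.ndarray) -> np.ndarray:
    """Distance from each test patch feature to its nearest memory-bank entry."""
    # Pairwise Euclidean distances, shape (n_patches, n_memory).
    diffs = patches[:, None, :] - memory_bank[None, :, :]
    return np.linalg.norm(diffs, axis=2).min(axis=1)


def image_score(memory_bank: np.ndarray, patches: np.ndarray) -> float:
    """Image-level anomaly score: the largest patch-level score."""
    return float(patch_scores(memory_bank, patches).max())


# Stand-in data: 500 nominal patch features with 16 dimensions each.
rng = np.random.default_rng(0)
nominal = rng.normal(0.0, 1.0, size=(500, 16))
bank_idx = greedy_coreset(nominal, n_select=50)
bank = nominal[bank_idx]

# A "normal" test image and a copy with one strongly shifted patch.
normal_img = rng.normal(0.0, 1.0, size=(64, 16))
defect_img = normal_img.copy()
defect_img[0] += 10.0

print("normal image score:", image_score(bank, normal_img))
print("defect image score:", image_score(bank, defect_img))
```

In this toy setup the patch-level scores double as a coarse localization signal: the shifted patch receives the largest score, so reshaping `patch_scores` to the patch grid would give an anomaly map.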
Related papers
- AnomalyDINO: Boosting Patch-based Few-shot Anomaly Detection with DINOv2 [16.69402464709241]
We adapt DINOv2 for one-shot and few-shot anomaly detection, with a focus on industrial applications.
Our proposed vision-only approach, AnomalyDINO, is based on patch similarities and enables both image-level anomaly prediction and pixel-level anomaly segmentation.
Despite its simplicity, AnomalyDINO achieves state-of-the-art results in one- and few-shot anomaly detection (e.g., pushing the one-shot performance on MVTec-AD from an AUROC of 93.1% to 96.6%).
arXiv Detail & Related papers (2024-05-23T13:15:13Z)
- Small Effect Sizes in Malware Detection? Make Harder Train/Test Splits! [51.668411293817464]
Industry practitioners care about small improvements in malware detection accuracy because their models are deployed to hundreds of millions of machines.
Academic research is often restrained to public datasets on the order of ten thousand samples.
We devise an approach to generate a benchmark of difficulty from a pool of available samples.
arXiv Detail & Related papers (2023-12-25T21:25:55Z)
- UniMatch: A Unified User-Item Matching Framework for the Multi-purpose Merchant Marketing [27.459774494479227]
We present a unified user-item matching framework to simultaneously conduct item recommendation and user targeting with just one model.
Our framework results in significant performance gains in comparison with the state-of-the-art methods, with greatly reduced cost on computing resources and daily maintenance.
arXiv Detail & Related papers (2023-07-19T13:49:35Z)
- Single-Stage Visual Relationship Learning using Conditional Queries [60.90880759475021]
TraCQ is a new formulation for scene graph generation that avoids the multi-task learning problem and the entity pair distribution.
We employ a DETR-based encoder-decoder with conditional queries to significantly reduce the entity label space as well.
Experimental results show that TraCQ not only outperforms existing single-stage scene graph generation methods, it also beats many state-of-the-art two-stage methods on the Visual Genome dataset.
arXiv Detail & Related papers (2023-06-09T06:02:01Z)
- Learning to Identify Drilling Defects in Turbine Blades with Single Stage Detectors [15.842163335920954]
We propose a model based on RetinaNet to identify drilling defects in X-ray images of turbine blades.
The application is challenging due to the image resolutions in which defects are very small and hardly captured by the commonly used anchor sizes.
We validate the model with $3$-fold cross-validation, showing a very high accuracy in identifying images with defects.
arXiv Detail & Related papers (2022-08-08T18:44:51Z)
- Enforcing Mutual Consistency of Hard Regions for Semi-supervised Medical Image Segmentation [68.9233942579956]
We propose a novel mutual consistency network (MC-Net+) to exploit the unlabeled hard regions for semi-supervised medical image segmentation.
The MC-Net+ model is motivated by the observation that deep models trained with limited annotations are prone to output highly uncertain and easily mis-classified predictions.
We compare the segmentation results of the MC-Net+ with five state-of-the-art semi-supervised approaches on three public medical datasets.
arXiv Detail & Related papers (2021-09-21T04:47:42Z)
- Efficient Person Search: An Anchor-Free Approach [86.45858994806471]
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images.
To achieve this goal, state-of-the-art models typically add a re-id branch upon two-stage detectors like Faster R-CNN.
In this work, we present an anchor-free approach to efficiently tackling this challenging task, by introducing the following dedicated designs.
arXiv Detail & Related papers (2021-09-01T07:01:33Z)
- P-WAE: Generalized Patch-Wasserstein Autoencoder for Anomaly Screening [17.24628770042803]
We propose a novel Patch-wise Wasserstein AutoEncoder (P-WAE) architecture to alleviate those challenges.
In particular, a patch-wise variational inference model coupled with solving the jigsaw puzzle is designed.
Comprehensive experiments, conducted on the MVTec AD dataset, demonstrate the superior performance of our propo
arXiv Detail & Related papers (2021-08-09T05:31:45Z)
- When Liebig's Barrel Meets Facial Landmark Detection: A Practical Model [87.25037167380522]
We propose a model that is accurate, robust, efficient, generalizable, and end-to-end trainable.
In order to achieve a better accuracy, we propose two lightweight modules.
DQInit dynamically initializes the queries of decoder from the inputs, enabling the model to achieve as good accuracy as the ones with multiple decoder layers.
QAMem is designed to enhance the discriminative ability of queries on low-resolution feature maps by assigning separate memory values to each query rather than a shared one.
arXiv Detail & Related papers (2021-05-27T13:51:42Z)
- Evaluation of Model Selection for Kernel Fragment Recognition in Corn Silage [25.54556810106467]
We investigate a number of state of the art CNN models for the task of measuring kernel fragmentation in harvested corn silage.
We show improvements in Average Precision at an Intersection over Union of 0.5 of up to 20 percentage points while also decreasing inference time in comparison to previously published work.
arXiv Detail & Related papers (2020-04-01T08:56:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.