Anomaly Detection under Distribution Shift
- URL: http://arxiv.org/abs/2303.13845v2
- Date: Fri, 1 Sep 2023 14:42:54 GMT
- Title: Anomaly Detection under Distribution Shift
- Authors: Tri Cao, Jiawen Zhu, and Guansong Pang
- Abstract summary: Anomaly detection (AD) is a crucial machine learning task that aims to learn patterns from a set of normal training samples to identify abnormal samples in test data.
Most existing AD studies assume that the training and test data are drawn from the same data distribution, but the test data can have large distribution shifts.
We introduce a novel robust AD approach to diverse distribution shifts by minimizing the distribution gap between in-distribution and OOD normal samples in both the training and inference stages.
- Score: 24.094884041252044
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Anomaly detection (AD) is a crucial machine learning task that aims to learn
patterns from a set of normal training samples to identify abnormal samples in
test data. Most existing AD studies assume that the training and test data are
drawn from the same data distribution, but the test data can have large
distribution shifts arising in many real-world applications due to different
natural variations such as new lighting conditions, object poses, or background
appearances, rendering existing AD methods ineffective in such cases. In this
paper, we consider the problem of anomaly detection under distribution shift
and establish performance benchmarks on four widely-used AD and
out-of-distribution (OOD) generalization datasets. We demonstrate that simple
adaptation of state-of-the-art OOD generalization methods to AD settings fails
to work effectively due to the lack of labeled anomaly data. We further
introduce a novel robust AD approach to diverse distribution shifts by
minimizing the distribution gap between in-distribution and OOD normal samples
in both the training and inference stages in an unsupervised way. Our extensive
empirical results on the four datasets show that our approach substantially
outperforms state-of-the-art AD methods and OOD generalization methods on data
with various distribution shifts, while maintaining the detection accuracy on
in-distribution data. Code and data are available at
https://github.com/mala-lab/ADShift.
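To make the notion of a "distribution gap" between in-distribution and OOD normal samples concrete, here is a minimal sketch in plain Python. It is an illustration only, not the method from the paper or the ADShift repository. The gap measure (differences in per-dimension mean and standard deviation), the function names, and the toy feature vectors are all assumptions introduced for this example.

```python
from statistics import fmean, pstdev


def feature_stats(samples):
    """Per-dimension mean and population std of a list of feature vectors."""
    columns = list(zip(*samples))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means, stds


def distribution_gap(a, b):
    """Toy gap (an assumption): summed absolute differences in mean and std."""
    mean_a, std_a = feature_stats(a)
    mean_b, std_b = feature_stats(b)
    mean_gap = sum(abs(x - y) for x, y in zip(mean_a, mean_b))
    std_gap = sum(abs(x - y) for x, y in zip(std_a, std_b))
    return mean_gap + std_gap


def align_to_reference(samples, reference, eps=1e-8):
    """Re-standardize `samples` so each dimension matches `reference` statistics."""
    mean_s, std_s = feature_stats(samples)
    mean_r, std_r = feature_stats(reference)
    return [
        [
            (x - ms) / (ss + eps) * sr + mr
            for x, ms, ss, mr, sr in zip(row, mean_s, std_s, mean_r, std_r)
        ]
        for row in samples
    ]


# Hypothetical 2-D features of normal samples (made-up values).
in_dist_normal = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
shifted_normal = [[10.0, 5.0], [12.0, 9.0], [14.0, 13.0]]

gap_before = distribution_gap(in_dist_normal, shifted_normal)
aligned = align_to_reference(shifted_normal, in_dist_normal)
gap_after = distribution_gap(in_dist_normal, aligned)

print(f"gap before alignment: {gap_before:.3f}")
print(f"gap after alignment:  {gap_after:.3g}")
```

In this toy setting the alignment removes the gap almost entirely because only first- and second-order statistics are compared. Real shifts (lighting, pose, background) are more complex, and aligning a whole test batch this way would also move any anomalies it contains, so this should be read as intuition rather than a recipe.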
Related papers
- Out-of-Distribution Detection with a Single Unconditional Diffusion Model [54.15132801131365]
Out-of-distribution (OOD) detection is a critical task in machine learning that seeks to identify abnormal samples.
Traditionally, unsupervised methods utilize a deep generative model for OOD detection.
This paper explores whether a single model can perform OOD detection across diverse tasks.
arXiv Detail & Related papers (2024-05-20T08:54:03Z)
- Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts [25.629973843455495]
Generalist Anomaly Detection (GAD) aims to train one single detection model that can generalize to detect anomalies in diverse datasets from different application domains without further training on the target data.
We introduce a novel approach that learns an in-context residual learning model for GAD, termed InCTRL.
InCTRL is the best performer and significantly outperforms state-of-the-art competing methods.
arXiv Detail & Related papers (2024-03-11T08:07:46Z)
- Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets.
We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
- Understanding normalization in contrastive representation learning and out-of-distribution detection [0.0]
We propose a simple method based on contrastive learning, which incorporates out-of-distribution data by discriminating against normal samples in the contrastive layer space.
Our approach can be applied flexibly as an outlier exposure (OE) approach, or as a fully self-supervised learning approach.
The high-quality features learned through contrastive learning consistently enhance performance in OE scenarios, even when the available out-of-distribution dataset is not diverse enough.
arXiv Detail & Related papers (2023-12-23T16:05:47Z)
- Invariant Anomaly Detection under Distribution Shifts: A Causal Perspective [6.845698872290768]
Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples.
Under the constraints of a distribution shift, the assumption that training samples and test samples are drawn from the same distribution breaks down.
We attempt to increase the resilience of anomaly detection models to different kinds of distribution shifts.
arXiv Detail & Related papers (2023-12-21T23:20:47Z)
- A Generic Machine Learning Framework for Fully-Unsupervised Anomaly Detection with Contaminated Data [0.0]
We introduce a framework for a fully unsupervised refinement of contaminated training data for AD tasks.
The framework is generic and can be applied to any residual-based machine learning model.
We show its clear superiority over the naive approach of training with contaminated data without refinement.
arXiv Detail & Related papers (2023-08-25T12:47:59Z)
- DIVERSIFY: A General Framework for Time Series Out-of-distribution Detection and Generalization [58.704753031608625]
Time series is one of the most challenging modalities in machine learning research.
OOD detection and generalization on time series tend to suffer due to its non-stationary property.
We propose DIVERSIFY, a framework for OOD detection and generalization on dynamic distributions of time series.
arXiv Detail & Related papers (2023-08-04T12:27:11Z)
- Anomaly Detection with Score Distribution Discrimination [4.468952886990851]
We propose to optimize the anomaly scoring function from the view of score distribution.
We design a novel loss function called Overlap loss that minimizes the overlap area between the score distributions of normal and abnormal samples.
arXiv Detail & Related papers (2023-06-26T03:32:57Z)
- Self-Trained One-class Classification for Unsupervised Anomaly Detection [56.35424872736276]
Anomaly detection (AD) has various applications across domains, from manufacturing to healthcare.
In this work, we focus on unsupervised AD problems whose entire training data are unlabeled and may contain both normal and anomalous samples.
To tackle this problem, we build a robust one-class classification framework via data refinement.
We show that our method outperforms the state-of-the-art one-class classification method by 6.3 AUC and 12.5 average precision.
arXiv Detail & Related papers (2021-06-11T01:36:08Z)
- WILDS: A Benchmark of in-the-Wild Distribution Shifts [157.53410583509924]
Distribution shifts can substantially degrade the accuracy of machine learning systems deployed in the wild.
We present WILDS, a curated collection of 8 benchmark datasets that reflect a diverse range of distribution shifts.
We show that standard training results in substantially lower out-of-distribution than in-distribution performance.
arXiv Detail & Related papers (2020-12-14T11:14:56Z)
- Learn what you can't learn: Regularized Ensembles for Transductive Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.