Positive Difference Distribution for Image Outlier Detection using
Normalizing Flows and Contrastive Data
- URL: http://arxiv.org/abs/2208.14024v2
- Date: Wed, 26 Apr 2023 07:46:01 GMT
- Title: Positive Difference Distribution for Image Outlier Detection using
Normalizing Flows and Contrastive Data
- Authors: Robert Schmier, Ullrich Köthe, Christoph-Nikolas Straehle
- Abstract summary: Likelihoods learned by a generative model, e.g., a normalizing flow via standard log-likelihood training, perform poorly as an outlier score.
We propose to use an unlabelled auxiliary dataset and a probabilistic outlier score for outlier detection.
We show that this is equivalent to learning the normalized positive difference between the in-distribution and the contrastive feature density.
- Score: 2.9005223064604078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Detecting test data deviating from training data is a central problem for
safe and robust machine learning. Likelihoods learned by a generative model,
e.g., a normalizing flow via standard log-likelihood training, perform poorly
as an outlier score. We propose to use an unlabelled auxiliary dataset and a
probabilistic outlier score for outlier detection. We use a self-supervised
feature extractor trained on the auxiliary dataset and train a normalizing flow
on the extracted features by maximizing the likelihood on in-distribution data
and minimizing the likelihood on the contrastive dataset. We show that this is
equivalent to learning the normalized positive difference between the
in-distribution and the contrastive feature density. We conduct experiments on
benchmark datasets and compare to the likelihood, the likelihood ratio and
state-of-the-art anomaly detection methods.
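As a rough illustration of the "normalized positive difference" named in the abstract, one reading is a density proportional to max(0, p_in(z) - p_con(z)), where p_in is the in-distribution feature density and p_con the contrastive one. The sketch below is not the paper's implementation. It replaces the flow-learned feature densities with two hypothetical 1-D Gaussians and normalizes on a grid; the function names, the parameters, and the symbols p_in and p_con are all assumptions made for this example.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, a stand-in for a learned feature density."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def positive_difference_density(p_in, p_con, grid):
    """Normalize max(0, p_in - p_con) over a uniform grid (Riemann sum)."""
    step = grid[1] - grid[0]
    diff = [max(0.0, p_in(x) - p_con(x)) for x in grid]
    mass = sum(diff) * step  # normalizing constant of the positive part
    return [d / mass for d in diff]

# Hypothetical toy densities (assumptions, not from the paper):
# a narrow in-distribution density and a broad contrastive density.
def p_in(x):
    return gaussian_pdf(x, 0.0, 1.0)

def p_con(x):
    return gaussian_pdf(x, 0.0, 3.0)

# Uniform grid on [-10, 10] with spacing 0.01.
grid = [-10.0 + 0.01 * i for i in range(2001)]
p_pd = positive_difference_density(p_in, p_con, grid)

# Points where the contrastive density dominates receive zero density,
# so a score based on p_pd flags them as outliers.
center_density = p_pd[1000]  # x = 0, inside the in-distribution bulk
tail_density = p_pd[1500]    # x = 5, where p_con exceeds p_in
```

In this toy setting the resulting density is positive only where p_in exceeds p_con, so samples from regions better explained by the contrastive data get no mass, which is the behavior one wants from an outlier score.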
Related papers
- Utilizing dataset affinity prediction in object detection to assess training data [4.508868068781057]
We show the benefits of the so-called dataset affinity score by automatically selecting samples from a heterogeneous pool of vehicle datasets.
The results show that object detectors can be trained on a significantly sparser set of training samples without losing detection accuracy.
(2023-11-16T10:45:32Z)
- Robust Flow-based Conformal Inference (FCI) with Statistical Guarantee [4.821312633849745]
We develop a series of conformal inference methods, including building predictive sets and inferring outliers for complex and high-dimensional data.
We evaluate our method, robust flow-based conformal inference, on benchmark datasets.
(2022-05-22T04:17:30Z)
- Nonlinear Isometric Manifold Learning for Injective Normalizing Flows [58.720142291102135]
We use isometries to separate manifold learning and density estimation.
We also employ autoencoders to design embeddings with explicit inverses that do not distort the probability distribution.
(2022-03-08T08:57:43Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples.
(2022-01-11T23:01:12Z)
- Efficient remedies for outlier detection with variational autoencoders [8.80692072928023]
Likelihoods computed by deep generative models are a candidate metric for outlier detection with unlabeled data.
We show that a theoretically-grounded correction readily ameliorates a key bias with VAE likelihood estimates.
We also show that the variance of the likelihoods computed over an ensemble of VAEs enables robust outlier detection.
(2021-08-19T16:00:58Z)
- InFlow: Robust outlier detection utilizing Normalizing Flows [7.309919829856283]
We show that normalizing flows can reliably detect outliers including adversarial attacks.
Our approach does not require outlier data for training and we showcase the efficiency of our method for OOD detection.
(2021-06-10T08:42:50Z)
- Evaluating State-of-the-Art Classification Models Against Bayes Optimality [106.50867011164584]
We show that we can compute the exact Bayes error of generative models learned using normalizing flows.
We use our approach to conduct a thorough investigation of state-of-the-art classification models.
(2021-06-07T06:21:20Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
(2021-04-11T09:50:24Z)
- Graph Embedding with Data Uncertainty [113.39838145450007]
Spectral-based subspace learning is a common data preprocessing step in many machine learning pipelines.
Most subspace learning methods do not take into consideration possible measurement inaccuracies or artifacts that can lead to data with high uncertainty.
(2020-09-01T15:08:23Z)
- Robust Variational Autoencoder for Tabular Data with Beta Divergence [0.0]
We propose a robust variational autoencoder with mixed categorical and continuous features.
Our results on the anomaly detection application for network traffic datasets demonstrate the effectiveness of our approach.
(2020-06-15T08:09:34Z)
- Semi-Supervised Learning with Normalizing Flows [54.376602201489995]
FlowGMM is an end-to-end approach to generative semi-supervised learning with normalizing flows.
We show promising results on a wide range of applications, including AG-News and Yahoo Answers text data.
(2019-12-30T17:36:33Z)