Probabilistic Outlier Detection and Generation
- URL: http://arxiv.org/abs/2012.12394v1
- Date: Tue, 22 Dec 2020 22:42:56 GMT
- Title: Probabilistic Outlier Detection and Generation
- Authors: Stefano Giovanni Rizzo, Linsey Pang, Yixian Chen, Sanjay Chawla
- Abstract summary: A Wasserstein double autoencoder is used to both detect and generate inliers and outliers.
WALDO is evaluated on classical data sets for detection accuracy and robustness.
- Score: 11.35109169978955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A new method for outlier detection and generation is introduced by lifting
data into the space of probability distributions which are not analytically
expressible, but from which samples can be drawn using a neural generator.
Given a mixture of unknown latent inlier and outlier distributions, a
Wasserstein double autoencoder is used to both detect and generate inliers and
outliers. The proposed method, named WALDO (Wasserstein Autoencoder for
Learning the Distribution of Outliers), is evaluated on classical data sets
including MNIST, CIFAR10 and KDD99 for detection accuracy and robustness. We
give an example of outlier detection on a real retail sales data set and an
example of outlier generation for simulating intrusion attacks. However, we
foresee many application scenarios where WALDO can be used. To the best of our
knowledge this is the first work that studies both outlier detection and
generation together.
Related papers
- Beyond the Known: Adversarial Autoencoders in Novelty Detection [2.7486022583843233]
In novelty detection, the goal is to decide if a new data point should be categorized as an inlier or an outlier.
We use a similar framework but with a lightweight deep network, and we adopt a probabilistic score with reconstruction error.
Our results indicate that our approach is effective at learning the target class, and it outperforms recent state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2024-04-06T00:04:19Z)
- Diversified Outlier Exposure for Out-of-Distribution Detection via Informative Extrapolation [110.34982764201689]
Out-of-distribution (OOD) detection is important for deploying reliable machine learning models on real-world applications.
Recent advances in outlier exposure have shown promising results on OOD detection by fine-tuning the model with informatively sampled auxiliary outliers.
We propose a novel framework, namely, Diversified Outlier Exposure (DivOE), for effective OOD detection via informative extrapolation based on the given auxiliary outliers.
arXiv Detail & Related papers (2023-10-21T07:16:09Z)
- Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z)
- Normalizing Flow based Feature Synthesis for Outlier-Aware Object Detection [8.249143014271887]
General-purpose object detectors like Faster R-CNN are prone to providing overconfident predictions for outlier objects.
We propose a novel outlier-aware object detection framework that distinguishes outliers from inlier objects.
Our approach significantly outperforms the state-of-the-art for outlier-aware object detection on both image and video datasets.
arXiv Detail & Related papers (2023-02-01T13:12:00Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples for which model confidence exceeds that threshold.
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Efficient remedies for outlier detection with variational autoencoders [8.80692072928023]
Likelihoods computed by deep generative models are a candidate metric for outlier detection with unlabeled data.
We show that a theoretically-grounded correction readily ameliorates a key bias with VAE likelihood estimates.
We also show that the variance of the likelihoods computed over an ensemble of VAEs enables robust outlier detection.
arXiv Detail & Related papers (2021-08-19T16:00:58Z)
- Robust Out-of-Distribution Detection on Deep Probabilistic Generative Models [0.06372261626436676]
Out-of-distribution (OOD) detection is an important task in machine learning systems.
Deep probabilistic generative models facilitate OOD detection by estimating the likelihood of a data sample.
We propose a new detection metric that operates without outlier exposure.
arXiv Detail & Related papers (2021-06-15T06:36:10Z)
- DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows [52.31831255787147]
We introduce a novel technique, DAAIN, to detect out-of-distribution (OOD) inputs and adversarial attacks (AA).
Our approach monitors the inner workings of a neural network and learns a density estimator of the activation distribution.
Our model can be trained on a single GPU making it compute efficient and deployable without requiring specialized accelerators.
arXiv Detail & Related papers (2021-05-30T22:07:13Z)
- Homophily Outlier Detection in Non-IID Categorical Data [43.51919113927003]
This work introduces a novel outlier detection framework and its two instances to identify outliers in categorical data.
It first defines and incorporates distribution-sensitive outlier factors and their interdependence into a value-value graph-based representation.
The learned value outlierness allows for either direct outlier detection or outlying feature selection.
arXiv Detail & Related papers (2021-03-21T23:29:33Z)
- Open Set Recognition with Conditional Probabilistic Generative Models [51.40872765917125]
We propose Conditional Probabilistic Generative Models (CPGM) for open set recognition.
CPGM can not only detect unknown samples but also classify known classes by forcing different latent features to approximate conditional Gaussian distributions.
Experiment results on multiple benchmark datasets reveal that the proposed method significantly outperforms the baselines.
arXiv Detail & Related papers (2020-08-12T06:23:49Z)
- Unsupervised Anomaly Detection with Adversarial Mirrored AutoEncoders [51.691585766702744]
We propose a variant of Adversarial Autoencoder which uses a mirrored Wasserstein loss in the discriminator to enforce better semantic-level reconstruction.
We put forward an alternative measure of anomaly score to replace the reconstruction-based metric.
Our method outperforms the current state-of-the-art methods for anomaly detection on several OOD detection benchmarks.
arXiv Detail & Related papers (2020-03-24T08:26:58Z)
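The thresholding idea summarized in the Average Thresholded Confidence (ATC) entry above can be illustrated with a minimal sketch. It assumes the confidence score is a single number per example (for instance the maximum softmax probability) and that the threshold is chosen so the share of labeled source examples above it matches the source accuracy. The function names and toy numbers are hypothetical and are not taken from the paper's code.

```python
def fit_atc_threshold(source_conf, source_correct):
    """Choose a threshold t on labeled source data.

    t is picked so that the number of source examples with confidence
    at or below t equals the number of source mistakes (assumption:
    this mirrors the "learn a threshold on confidence" step).
    """
    n_wrong = len(source_correct) - sum(source_correct)
    if n_wrong == 0:
        # No mistakes on source data: every example counts as above t.
        return float("-inf")
    ranked = sorted(source_conf)
    return ranked[n_wrong - 1]


def predict_target_accuracy(target_conf, threshold):
    """Estimate accuracy as the fraction of unlabeled examples above t."""
    return sum(c > threshold for c in target_conf) / len(target_conf)


# Toy data (hypothetical): 5 labeled source examples, 4 unlabeled target ones.
source_conf = [0.90, 0.80, 0.60, 0.55, 0.95]
source_correct = [1, 1, 0, 0, 1]
target_conf = [0.70, 0.50, 0.99, 0.58]

threshold = fit_atc_threshold(source_conf, source_correct)
estimate = predict_target_accuracy(target_conf, threshold)
print(threshold, estimate)
```

On the toy data the threshold falls at the second-lowest source confidence, and half of the target examples lie above it. A real setup would calibrate the model first and use held-out source data for fitting the threshold.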
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.