Null Hypothesis Test for Anomaly Detection
- URL: http://arxiv.org/abs/2210.02226v1
- Date: Wed, 5 Oct 2022 13:03:55 GMT
- Title: Null Hypothesis Test for Anomaly Detection
- Authors: Jernej F. Kamenik, Manuel Szewc
- Abstract summary: We extend the use of Classification Without Labels for anomaly detection with a hypothesis test designed to exclude the background-only hypothesis.
By testing for statistical independence of the two discriminating dataset regions, we are able to exclude the background-only hypothesis without relying on fixed anomaly score cuts or extrapolations of background estimates between regions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We extend the use of Classification Without Labels for anomaly detection with
a hypothesis test designed to exclude the background-only hypothesis. By
testing for statistical independence of the two discriminating dataset regions,
we are able to exclude the background-only hypothesis without relying on fixed
anomaly score cuts or extrapolations of background estimates between regions.
The method relies on the assumption of conditional independence of anomaly
score features and dataset regions, which can be ensured using existing
decorrelation techniques. As a benchmark example, we consider the LHC Olympics
dataset where we show that mutual information represents a suitable test for
statistical independence and our method exhibits excellent and robust
performance at different signal fractions even in the presence of realistic feature
correlations.
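The core idea in the abstract, testing whether an anomaly score is statistically independent of the dataset region, can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it uses a simple binned plug-in estimate of mutual information and a permutation null, both chosen here for brevity, and every name in it (`mutual_information`, `permutation_p_value`, the toy data) is hypothetical.

```python
import numpy as np


def mutual_information(scores, regions, n_bins=10):
    """Plug-in estimate (in nats) of the mutual information between
    quantile-binned scores and a binary region label (0 or 1)."""
    # Interior quantile edges, so score bins are roughly equally populated.
    edges = np.quantile(scores, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    binned = np.searchsorted(edges, scores, side="right")  # values in 0..n_bins-1
    joint = np.zeros((n_bins, 2))
    np.add.at(joint, (binned, regions), 1.0)  # 2D contingency table
    joint /= joint.sum()
    p_score = joint.sum(axis=1, keepdims=True)
    p_region = joint.sum(axis=0, keepdims=True)
    product = p_score @ p_region  # joint distribution under independence
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))


def permutation_p_value(scores, regions, n_perm=200, n_bins=10, seed=0):
    """p-value for the null hypothesis that scores and regions are independent.
    Shuffling the region labels breaks any dependence (assumed null model)."""
    rng = np.random.default_rng(seed)
    observed = mutual_information(scores, regions, n_bins)
    null = np.array([
        mutual_information(scores, rng.permutation(regions), n_bins)
        for _ in range(n_perm)
    ])
    # The +1 terms keep the estimated p-value strictly positive.
    return (1 + np.sum(null >= observed)) / (1 + n_perm)


# Toy data (illustrative only): two regions, scores drawn independently of region.
rng = np.random.default_rng(1)
n_events = 20000
regions = rng.integers(0, 2, size=n_events)
background_scores = rng.normal(size=n_events)

# Inject a 10% "signal" with high scores into region 1 only.
signal_scores = background_scores.copy()
is_signal = (regions == 1) & (rng.random(n_events) < 0.10)
signal_scores[is_signal] = rng.normal(2.0, 0.5, size=is_signal.sum())

p_background = permutation_p_value(background_scores, regions)
p_signal = permutation_p_value(signal_scores, regions)
```

In this sketch a small `p_signal` indicates that the score distribution differs between regions, which is evidence against the background-only hypothesis, while `p_background` behaves like a draw from a uniform distribution. The sketch presupposes that background scores are independent of the region, which is the conditional independence assumption the abstract says can be ensured with decorrelation techniques.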
Related papers
- Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z)
- Sequential Predictive Two-Sample and Independence Testing [114.4130718687858]
We study the problems of sequential nonparametric two-sample and independence testing.
We build upon the principle of (nonparametric) testing by betting.
arXiv Detail & Related papers (2023-04-29T01:30:33Z)
- Empirical Bayesian Approaches for Robust Constraint-based Causal Discovery under Insufficient Data [38.883810061897094]
Causal discovery methods assume data sufficiency, which may not be the case in many real world datasets.
We propose Bayesian-augmented frequentist independence tests to improve the performance of constraint-based causal discovery methods under insufficient data.
Experiments show significant performance improvement in terms of both accuracy and efficiency over SOTA methods.
arXiv Detail & Related papers (2022-06-16T21:08:49Z)
- Nonparametric Conditional Local Independence Testing [69.31200003384122]
Conditional local independence is an independence relation among continuous time processes.
No nonparametric test of conditional local independence has been available.
We propose such a nonparametric test based on double machine learning.
arXiv Detail & Related papers (2022-03-25T10:31:02Z)
- Model-agnostic out-of-distribution detection using combined statistical tests [15.27980070479021]
We present simple methods for out-of-distribution detection using a trained generative model.
We combine a classical parametric test (Rao's score test) with the recently introduced typicality test.
Despite their simplicity and generality, these methods can be competitive with model-specific out-of-distribution detection algorithms.
arXiv Detail & Related papers (2022-03-02T13:32:09Z)
- Data-SUITE: Data-centric identification of in-distribution incongruous examples [81.21462458089142]
Data-SUITE is a data-centric framework to identify incongruous regions of in-distribution (ID) data.
We empirically validate Data-SUITE's performance and coverage guarantees.
arXiv Detail & Related papers (2022-02-17T18:58:31Z)
- A Data-Driven Approach to Robust Hypothesis Testing Using Sinkhorn Uncertainty Sets [12.061662346636645]
We seek the worst-case detector over distributional uncertainty sets centered around the empirical distribution from samples using Sinkhorn distance.
Compared with the Wasserstein robust test, the corresponding least favorable distributions are supported beyond the training samples, which provides a more flexible detector.
arXiv Detail & Related papers (2022-02-09T03:26:15Z)
- Density of States Estimation for Out-of-Distribution Detection [69.90130863160384]
DoSE is the density of states estimator.
We demonstrate DoSE's state-of-the-art performance against other unsupervised OOD detectors.
arXiv Detail & Related papers (2020-06-16T16:06:25Z)
- On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
- Achieving Equalized Odds by Resampling Sensitive Attributes [13.114114427206678]
We present a flexible framework for learning predictive models that approximately satisfy the equalized odds notion of fairness.
This differentiable functional is used as a penalty driving the model parameters towards equalized odds.
We develop a formal hypothesis test to detect whether a prediction rule violates this property, the first such test in the literature.
arXiv Detail & Related papers (2020-06-08T00:18:34Z)
- Universal Data Anomaly Detection via Inverse Generative Adversary Network [4.162663632560141]
No training data are available for the distribution of anomaly data.
A semi-supervised deep learning technique based on an inverse generative adversary network is proposed.
arXiv Detail & Related papers (2020-01-23T21:11:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.