Adaptive novelty detection with false discovery rate guarantee
- URL: http://arxiv.org/abs/2208.06685v3
- Date: Wed, 25 Oct 2023 11:29:17 GMT
- Title: Adaptive novelty detection with false discovery rate guarantee
- Authors: Ariane Marandon, Lihua Lei, David Mary and Etienne Roquain
- Abstract summary: We propose a flexible method to control the false discovery rate (FDR) on detected novelties in finite samples.
Inspired by the multiple testing literature, we propose variants of AdaDetect that are adaptive to the proportion of nulls.
The methods are illustrated on synthetic datasets and real-world datasets, including an application in astrophysics.
- Score: 1.8249324194382757
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper studies the semi-supervised novelty detection problem where a set
of "typical" measurements is available to the researcher. Motivated by recent
advances in multiple testing and conformal inference, we propose AdaDetect, a
flexible method that is able to wrap around any probabilistic classification
algorithm and control the false discovery rate (FDR) on detected novelties in
finite samples without any distributional assumption other than
exchangeability. In contrast to classical FDR-controlling procedures that are
often committed to a pre-specified p-value function, AdaDetect learns the
transformation in a data-adaptive manner to focus the power on the directions
that distinguish between inliers and outliers. Inspired by the multiple testing
literature, we further propose variants of AdaDetect that are adaptive to the
proportion of nulls while maintaining the finite-sample FDR control. The
methods are illustrated on synthetic datasets and real-world datasets,
including an application in astrophysics.
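As a rough illustration of the conformal-inference and multiple-testing ingredients named in the abstract, the sketch below computes empirical (conformal) p-values from a sample of typical measurements and applies the Benjamini-Hochberg procedure to them. It is not the authors' AdaDetect implementation: the score here is a fixed absolute value chosen for the toy data, whereas AdaDetect learns its score with a classifier. The function names, the toy data, and the level alpha = 0.1 are all assumptions made for illustration.

```python
import numpy as np


def conformal_pvalues(cal_scores, test_scores):
    """Empirical p-values: (1 + #{calibration scores >= test score}) / (n + 1).

    Larger scores are taken to mean "more novel".
    """
    cal = np.sort(np.asarray(cal_scores, dtype=float))
    test = np.asarray(test_scores, dtype=float)
    # searchsorted(side="left") gives the index of the first calibration
    # score >= the test score, so n - index counts the scores >= it.
    n_ge = cal.size - np.searchsorted(cal, test, side="left")
    return (1.0 + n_ge) / (cal.size + 1.0)


def benjamini_hochberg(pvals, alpha):
    """Boolean mask of rejections from the Benjamini-Hochberg step-up rule."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the k-th smallest p-value with alpha * k / m.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    rejected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])  # largest index passing the test
        rejected[order[: k + 1]] = True
    return rejected


if __name__ == "__main__":
    # Toy data (assumed): typical measurements are N(0, 1); the test
    # sample mixes 90 typical points with 10 novelties centred at 4.
    rng = np.random.default_rng(0)
    typical = rng.normal(0.0, 1.0, size=1000)
    test = np.concatenate(
        [rng.normal(0.0, 1.0, size=90), rng.normal(4.0, 1.0, size=10)]
    )

    # Hand-picked score standing in for a learned one (assumption).
    score = np.abs
    pvals = conformal_pvalues(score(typical), score(test))
    detected = benjamini_hochberg(pvals, alpha=0.1)
    print("flagged test indices:", np.nonzero(detected)[0])
```

In this sketch the typical sample is used only for calibration. A learned score would need to be fitted in a way that preserves exchangeability between the calibration points and the typical test points, which is the assumption the abstract relies on.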
Related papers
- Partially-Observable Sequential Change-Point Detection for Autocorrelated Data via Upper Confidence Region [12.645304808491309]
We propose a detection scheme called adaptive upper confidence region with state space model (AUCRSS) for sequential change point detection.
A partially-observable Kalman filter algorithm is developed for online inference of SSM, and accordingly, a change point detection scheme based on a generalized likelihood ratio test is analyzed.
arXiv Detail & Related papers (2024-03-30T02:32:53Z)
- Hypothesis-Driven Deep Learning for Out of Distribution Detection [0.8191518216608217]
We propose a hypothesis-driven approach to quantify whether a new sample is InD or OoD.
We adapt our method to detect an unseen sample of bacteria to a trained deep learning model, and show that it reveals interpretable differences between InD and OoD latent responses.
arXiv Detail & Related papers (2024-03-21T01:06:47Z)
- On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
- Conservative Prediction via Data-Driven Confidence Minimization [70.93946578046003]
In safety-critical applications of machine learning, it is often desirable for a model to be conservative.
We propose the Data-Driven Confidence Minimization framework, which minimizes confidence on an uncertainty dataset.
arXiv Detail & Related papers (2023-06-08T07:05:36Z)
- Self-Supervised Training with Autoencoders for Visual Anomaly Detection [61.62861063776813]
We focus on a specific use case in anomaly detection where the distribution of normal samples is supported by a lower-dimensional manifold.
We adapt a self-supervised learning regime that exploits discriminative information during training but focuses on the submanifold of normal examples.
We achieve a new state-of-the-art result on the MVTec AD dataset -- a challenging benchmark for visual anomaly detection in the manufacturing domain.
arXiv Detail & Related papers (2022-06-23T14:16:30Z)
- TracInAD: Measuring Influence for Anomaly Detection [0.0]
This paper proposes a novel methodology to flag anomalies based on TracIn.
We test our approach using Variational Autoencoders and show that the average influence of a subsample of training points on a test point can serve as a proxy for abnormality.
arXiv Detail & Related papers (2022-05-03T08:20:15Z)
- iDECODe: In-distribution Equivariance for Conformal Out-of-distribution Detection [24.518698391381204]
Machine learning methods such as deep neural networks (DNNs) often generate incorrect predictions with high confidence.
We propose the new method iDECODe, leveraging in-distribution equivariance for conformal OOD detection.
We demonstrate the efficacy of iDECODe by experiments on image and audio datasets, obtaining state-of-the-art results.
arXiv Detail & Related papers (2022-01-07T05:21:40Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Partially Observable Online Change Detection via Smooth-Sparse Decomposition [16.8028358824706]
We consider online change detection of high dimensional data streams with sparse changes, where only a subset of data streams can be observed at each sensing time point due to limited sensing capacities.
On the one hand, the detection scheme should be able to deal with partially observable data and meanwhile have efficient detection power for sparse changes.
In this paper, we propose a novel detection scheme called CDSSD. In particular, it describes the structure of high dimensional data with sparse changes by smooth-sparse decomposition.
arXiv Detail & Related papers (2020-09-22T16:03:04Z)
- Change Point Detection in Time Series Data using Autoencoders with a Time-Invariant Representation [69.34035527763916]
Change point detection (CPD) aims to locate abrupt property changes in time series data.
Recent CPD methods demonstrated the potential of using deep learning techniques, but often lack the ability to identify more subtle changes in the autocorrelation statistics of the signal.
We employ an autoencoder-based methodology with a novel loss function, through which the used autoencoders learn a partially time-invariant representation that is tailored for CPD.
arXiv Detail & Related papers (2020-08-21T15:03:21Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training in these with a novel loss function and centroid updating scheme and match the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.