Leave-One-Out-, Bootstrap- and Cross-Conformal Anomaly Detectors
- URL: http://arxiv.org/abs/2402.16388v3
- Date: Thu, 20 Feb 2025 13:28:41 GMT
- Title: Leave-One-Out-, Bootstrap- and Cross-Conformal Anomaly Detectors
- Authors: Oliver Hennhöfer, Christine Preisach
- Abstract summary: In this work, we formally define and evaluate leave-one-out-, bootstrap-, and cross-conformal methods for anomaly detection. We demonstrate that derived methods for calculating resampling-conformal $p$-values strike a practical compromise between statistical efficiency (full-conformal) and computational efficiency (split-conformal) as they make more efficient use of available data.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The requirement of uncertainty quantification for anomaly detection systems has become increasingly important. In this context, effectively controlling Type I error rates ($\alpha$) without compromising the statistical power ($1-\beta$) of these systems can build trust and reduce costs related to false discoveries. The field of conformal anomaly detection emerges as a promising approach for providing respective statistical guarantees by model calibration. However, the dependency on calibration data poses practical limitations - especially within low-data regimes. In this work, we formally define and evaluate leave-one-out-, bootstrap-, and cross-conformal methods for anomaly detection, incrementing on methods from the field of conformal prediction. Looking beyond the classical inductive conformal anomaly detection, we demonstrate that derived methods for calculating resampling-conformal $p$-values strike a practical compromise between statistical efficiency (full-conformal) and computational efficiency (split-conformal) as they make more efficient use of available data. We validate derived methods and quantify their improvements for a range of one-class classifiers and datasets.
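To make the split-/cross-conformal distinction concrete, the sketch below computes conformal anomaly-detection $p$-values, $p(x) = \frac{1 + \#\{i : s_i \ge s(x)\}}{n+1}$, for a split-conformal and a cross-conformal detector. It is a minimal illustration rather than the authors' reference implementation: scikit-learn's IsolationForest is assumed as the one-class scorer, and the cross-conformal variant follows the standard construction of aggregating rank comparisons over $K$ folds so that every training point also serves once as a calibration point; details may differ from the paper's exact formulation.

```python
# Minimal sketch (assumed names, not the paper's reference code): split- vs.
# cross-conformal anomaly-detection p-values with an IsolationForest scorer.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.model_selection import KFold


def nonconformity(model, X):
    # score_samples is larger for inliers, so negate it: larger = more anomalous
    return -model.score_samples(X)


def split_conformal_pvalues(X_train, X_cal, X_test, seed=0):
    model = IsolationForest(random_state=seed).fit(X_train)
    cal = nonconformity(model, X_cal)
    test = nonconformity(model, X_test)
    # p(x) = (1 + #{i : s_i >= s(x)}) / (n_cal + 1)
    return (1 + (cal[None, :] >= test[:, None]).sum(axis=1)) / (len(cal) + 1)


def cross_conformal_pvalues(X_train, X_test, n_folds=5, seed=0):
    counts, n_cal = np.zeros(len(X_test)), 0
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for fit_idx, cal_idx in kf.split(X_train):
        model = IsolationForest(random_state=seed).fit(X_train[fit_idx])
        cal = nonconformity(model, X_train[cal_idx])
        test = nonconformity(model, X_test)
        counts += (cal[None, :] >= test[:, None]).sum(axis=1)
        n_cal += len(cal_idx)
    # every training point contributes to calibration exactly once
    return (1 + counts) / (n_cal + 1)


# Flag x as anomalous at level alpha when p(x) <= alpha (e.g. alpha = 0.05).
```

Roughly speaking, a leave-one-out variant corresponds to setting `n_folds` to the number of training points (a jackknife), and a bootstrap variant replaces the disjoint folds with out-of-bag calibration sets; flagging $x$ whenever $p(x) \le \alpha$ is what controls the Type I error rate at level $\alpha$ under exchangeability.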
Related papers
- Conformal Segmentation in Industrial Surface Defect Detection with Statistical Guarantees [2.0257616108612373]
In industrial settings, surface defects on steel can significantly compromise its service life and elevate potential safety risks.
Traditional defect detection methods predominantly rely on manual inspection, which suffers from low efficiency and high costs.
We develop a statistically rigorous threshold based on a user-defined risk level to identify high-probability defective pixels in test images.
We demonstrate robust and efficient control over the expected test set error rate across varying calibration-to-test ratios.
arXiv Detail & Related papers (2025-04-24T16:33:56Z) - Robust Conformal Outlier Detection under Contaminated Reference Data [20.864605211132663]
Conformal prediction is a flexible framework for calibrating machine learning predictions.
In outlier detection, this calibration relies on a reference set of labeled inlier data to control the type-I error rate.
This paper analyzes the impact of contamination on the validity of conformal methods.
arXiv Detail & Related papers (2025-02-07T10:23:25Z) - Noise-Adaptive Conformal Classification with Marginal Coverage [53.74125453366155]
We introduce an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise.
We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets.
arXiv Detail & Related papers (2025-01-29T23:55:23Z) - Adaptive Deviation Learning for Visual Anomaly Detection with Data Contamination [20.4008901760593]
We introduce a systematic adaptive method that employs deviation learning to compute anomaly scores end-to-end.
Our proposed method surpasses competing techniques and exhibits both stability and robustness in the presence of data contamination.
arXiv Detail & Related papers (2024-11-14T16:10:15Z) - Source-Free Domain-Invariant Performance Prediction [68.39031800809553]
We propose a source-free approach centred on uncertainty-based estimation, using a generative model for calibration in the absence of source data.
Our experiments on benchmark object recognition datasets reveal that existing source-based methods fall short with limited source sample availability.
Our approach significantly outperforms the current state-of-the-art source-free and source-based methods, affirming its effectiveness in domain-invariant performance estimation.
arXiv Detail & Related papers (2024-08-05T03:18:58Z) - Fault Detection and Monitoring using a Data-Driven Information-Based Strategy: Method, Theory, and Application [5.056456697289351]
We propose an information-driven fault detection method based on a novel concept drift detector.
The method is tailored to identifying drifts in input-output relationships of additive noise models.
We prove several theoretical properties of the proposed MI-based fault detection scheme.
arXiv Detail & Related papers (2024-05-06T17:43:39Z) - Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection [1.8990839669542954]
We propose a cost-sensitive framework for object detection tailored to user-defined budgets.
We derive minimum thresholding requirements to prevent performance degradation.
We automate and optimize the thresholding process to maximize the failure recognition rate.
arXiv Detail & Related papers (2024-04-26T14:03:55Z) - Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z) - Distributional Shift-Aware Off-Policy Interval Estimation: A Unified Error Quantification Framework [8.572441599469597]
We study high-confidence off-policy evaluation in the context of infinite-horizon Markov decision processes.
The objective is to establish a confidence interval (CI) for the target policy value using only offline data pre-collected from unknown behavior policies.
We show that our algorithm is sample-efficient, error-robust, and provably convergent even in non-linear function approximation settings.
arXiv Detail & Related papers (2023-09-23T06:35:44Z) - Quantification of Predictive Uncertainty via Inference-Time Sampling [57.749601811982096]
We propose a post-hoc sampling strategy for estimating predictive uncertainty accounting for data ambiguity.
The method can generate different plausible outputs for a given input and does not assume parametric forms of predictive distributions.
arXiv Detail & Related papers (2023-08-03T12:43:21Z) - Calibration-Aware Bayesian Learning [37.82259435084825]
This paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs).
It applies a data-dependent or data-independent regularizer while optimizing over a variational distribution, as in Bayesian learning.
Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
arXiv Detail & Related papers (2023-05-12T14:19:15Z) - The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss of uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
arXiv Detail & Related papers (2022-11-11T19:34:17Z) - Statistics and Deep Learning-based Hybrid Model for Interpretable Anomaly Detection [0.0]
Hybrid methods have been shown to outperform pure statistical and pure deep learning methods at forecasting tasks.
MES-LSTM is an interpretable anomaly detection model that overcomes these challenges.
arXiv Detail & Related papers (2022-02-25T14:17:03Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold (see the sketch after this list).
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z) - The Hidden Uncertainty in a Neural Networks Activations [105.4223982696279]
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data.
This work investigates whether this distribution correlates with a model's epistemic uncertainty, thus indicating its ability to generalise to novel inputs.
arXiv Detail & Related papers (2020-12-05T17:30:35Z) - Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z) - Deep Learning based Uncertainty Decomposition for Real-time Control [9.067368638784355]
We propose a novel method for detecting the absence of training data using deep learning.
We show its advantages over existing approaches on synthetic and real-world datasets.
We further demonstrate the practicality of this uncertainty estimate in deploying online data-efficient control on a simulated quadcopter.
arXiv Detail & Related papers (2020-10-06T10:46:27Z) - Evaluating probabilistic classifiers: Reliability diagrams and score decompositions revisited [68.8204255655161]
We introduce the CORP approach, which generates provably statistically Consistent, Optimally binned, and Reproducible reliability diagrams in an automated way.
CORP is based on non-parametric isotonic regression and is implemented via the pool-adjacent-violators (PAV) algorithm.
arXiv Detail & Related papers (2020-08-07T08:22:26Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Learning to Predict Error for MRI Reconstruction [67.76632988696943]
We demonstrate that predictive uncertainty estimated by the current methods does not highly correlate with prediction error.
We propose a novel method that estimates the target labels and magnitude of the prediction error in two steps.
arXiv Detail & Related papers (2020-02-13T15:55:32Z)
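The Average Thresholded Confidence (ATC) entry above describes a mechanism concrete enough to illustrate, as referenced there: choose a confidence threshold on labeled source data so that the fraction of examples exceeding it matches the observed source accuracy, then report the exceedance fraction on unlabeled target data as the predicted accuracy. The sketch below is an assumption-laden illustration, not the authors' code; the confidence measure (e.g., maximum softmax probability) and the function names are hypothetical.

```python
# Illustrative sketch of Average Thresholded Confidence (ATC) as summarized above;
# names and the choice of confidence score are assumptions, not the authors' code.
import numpy as np


def atc_fit_threshold(source_confidence, source_is_correct):
    # Choose t so that the fraction of source examples with confidence >= t
    # matches the source accuracy.
    source_accuracy = source_is_correct.mean()
    return np.quantile(source_confidence, 1.0 - source_accuracy)


def atc_predict_accuracy(target_confidence, threshold):
    # Predicted target accuracy = fraction of unlabeled target examples
    # whose confidence exceeds the learned threshold.
    return (target_confidence >= threshold).mean()
```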