Leave-One-Out-, Bootstrap- and Cross-Conformal Anomaly Detectors
- URL: http://arxiv.org/abs/2402.16388v3
- Date: Thu, 20 Feb 2025 13:28:41 GMT
- Title: Leave-One-Out-, Bootstrap- and Cross-Conformal Anomaly Detectors
- Authors: Oliver Hennhöfer, Christine Preisach
- Abstract summary: In this work, we formally define and evaluate leave-one-out-, bootstrap-, and cross-conformal methods for anomaly detection.
We demonstrate that the derived methods for calculating resampling-conformal $p$-values strike a practical compromise between statistical efficiency (full-conformal) and computational efficiency (split-conformal), as they make more efficient use of the available data.
- Abstract: The requirement of uncertainty quantification for anomaly detection systems has become increasingly important. In this context, effectively controlling Type I error rates ($\alpha$) without compromising the statistical power ($1-\beta$) of these systems can build trust and reduce costs related to false discoveries. The field of conformal anomaly detection emerges as a promising approach for providing such statistical guarantees through model calibration. However, the dependency on calibration data poses practical limitations, especially within low-data regimes. In this work, we formally define and evaluate leave-one-out-, bootstrap-, and cross-conformal methods for anomaly detection, building on methods from the field of conformal prediction. Looking beyond classical inductive conformal anomaly detection, we demonstrate that the derived methods for calculating resampling-conformal $p$-values strike a practical compromise between statistical efficiency (full-conformal) and computational efficiency (split-conformal), as they make more efficient use of the available data. We validate the derived methods and quantify their improvements for a range of one-class classifiers and datasets.
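To make the resampling-conformal construction concrete, below is a minimal sketch of a cross-conformal anomaly detector in the spirit of the abstract. This is not the authors' implementation: the choice of IsolationForest as the one-class model, the fold count, and the higher-is-more-anomalous score convention are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.model_selection import KFold

def cross_conformal_p_values(X_train, X_test, n_folds=5, seed=0):
    """Cross-conformal anomaly p-values (illustrative sketch).

    Every training point is scored by a detector that did not see it,
    yielding n calibration scores in total; each test point is scored by
    every fold model and compared against that fold's calibration scores.
    """
    X_train, X_test = np.asarray(X_train), np.asarray(X_test)
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    exceedances = np.zeros(len(X_test))
    for fit_idx, cal_idx in kf.split(X_train):
        model = IsolationForest(random_state=seed).fit(X_train[fit_idx])
        # Negate score_samples so that higher means more anomalous.
        cal_scores = -model.score_samples(X_train[cal_idx])
        test_scores = -model.score_samples(X_test)
        # Count calibration scores at least as extreme as each test score.
        exceedances += (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
    return (1.0 + exceedances) / (len(X_train) + 1.0)
```

Under exchangeability, flagging points with $p \le \alpha$ keeps the marginal Type I error at or below $\alpha$. Split-conformal corresponds to a single fold, leave-one-out-conformal to one calibration point per fold, and the bootstrap variant draws fit sets with replacement instead of partitioning the data.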
Related papers
- Robust Conformal Outlier Detection under Contaminated Reference Data [20.864605211132663]
Conformal prediction is a flexible framework for calibrating machine learning predictions.
In outlier detection, this calibration relies on a reference set of labeled inlier data to control the type-I error rate.
This paper analyzes the impact of such contamination on the validity of conformal methods (the clean-reference $p$-value it perturbs is sketched below).
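A minimal sketch of that reference-set $p$-value, assuming scores where higher means more anomalous (names are illustrative, not the paper's code):

```python
import numpy as np

def split_conformal_p_value(reference_scores, test_score):
    """Conformal p-value against a labeled inlier reference set."""
    reference_scores = np.asarray(reference_scores)
    # Rank the test score among the reference (inlier) scores.
    exceedances = np.sum(reference_scores >= test_score)
    return (1 + exceedances) / (len(reference_scores) + 1)
```

The inlier guarantee $P[p \le \alpha] \le \alpha$ rests on the reference scores being exchangeable with the test score, which is precisely what contamination of the reference set breaks.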
arXiv Detail & Related papers (2025-02-07T10:23:25Z)
- Noise-Adaptive Conformal Classification with Marginal Coverage [53.74125453366155]
We introduce an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise.
We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets.
arXiv Detail & Related papers (2025-01-29T23:55:23Z)
- Adaptive Deviation Learning for Visual Anomaly Detection with Data Contamination [20.4008901760593]
We introduce a systematic adaptive method that employs deviation learning to compute anomaly scores end-to-end.
Our proposed method surpasses competing techniques and exhibits both stability and robustness in the presence of data contamination.
arXiv Detail & Related papers (2024-11-14T16:10:15Z)
- Source-Free Domain-Invariant Performance Prediction [68.39031800809553]
We propose a source-free approach centred on uncertainty-based estimation, using a generative model for calibration in the absence of source data.
Our experiments on benchmark object recognition datasets reveal that existing source-based methods fall short with limited source sample availability.
Our approach significantly outperforms the current state-of-the-art source-free and source-based methods, affirming its effectiveness in domain-invariant performance estimation.
arXiv Detail & Related papers (2024-08-05T03:18:58Z)
- Fault Detection and Monitoring using a Data-Driven Information-Based Strategy: Method, Theory, and Application [5.056456697289351]
We propose an information-driven fault detection method based on a novel concept drift detector.
The method is tailored to identifying drifts in input-output relationships of additive noise models.
We prove several theoretical properties of the proposed MI-based fault detection scheme.
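As a rough illustration of the general idea only (the paper's actual estimator and test statistic are not given in the abstract), one can monitor the mutual information between inputs and outputs over sliding windows and raise an alarm on large deviations; the window size, MI aggregation, and threshold below are arbitrary assumptions:

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

def mi_drift_monitor(X, y, window=200, threshold=0.2, seed=0):
    """Flag windows whose input-output mutual information deviates
    from a healthy reference window (toy illustration)."""
    X, y = np.asarray(X), np.asarray(y)
    # Baseline MI estimated on the first (assumed fault-free) window.
    ref_mi = mutual_info_regression(X[:window], y[:window],
                                    random_state=seed).sum()
    alarms = []
    for start in range(window, len(X) - window + 1, window):
        mi = mutual_info_regression(X[start:start + window],
                                    y[start:start + window],
                                    random_state=seed).sum()
        alarms.append(abs(mi - ref_mi) > threshold)
    return alarms
```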
arXiv Detail & Related papers (2024-05-06T17:43:39Z)
- Condition Monitoring with Incomplete Data: An Integrated Variational Autoencoder and Distance Metric Framework [2.7898966850590625]
This paper introduces a new method for fault detection and condition monitoring on unseen data.
We use a variational autoencoder to capture the probabilistic distribution of previously seen and new unseen conditions.
Faults are detected by establishing a threshold for the health indexes, allowing the model to identify severe, unseen faults with high accuracy, even amidst noise.
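The thresholding step can be pictured with a toy sketch; the quantile level and the choice of health index below are assumptions, not the paper's:

```python
import numpy as np

def fit_threshold(healthy_index, q=0.99):
    # Alarm threshold set at a high quantile of the health index
    # (e.g., VAE reconstruction error or a latent-space distance)
    # observed on healthy reference data.
    return np.quantile(healthy_index, q)

def detect_faults(test_index, threshold):
    # Flag samples whose health index exceeds the healthy-data threshold.
    return np.asarray(test_index) > threshold
```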
arXiv Detail & Related papers (2024-04-08T22:20:23Z)
- Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z)
- Calibration-Aware Bayesian Learning [37.82259435084825]
This paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs).
It applies data-dependent or data-independent regularizers while optimizing over a variational distribution, as in Bayesian learning.
Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
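For reference, the standard binned estimator of the expected calibration error (ECE) reported by the entry is shown below; the bin count is a common default, not taken from the paper:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """Binned ECE: confidence-vs-accuracy gap, weighted by bin mass."""
    confidences = np.asarray(confidences)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap
    return ece
```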
arXiv Detail & Related papers (2023-05-12T14:19:15Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
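A minimal sketch of that rule, assuming the maximum softmax probability as the confidence score (the original method also admits other scores, such as negative entropy):

```python
import numpy as np

def atc_predict_accuracy(source_conf, source_correct, target_conf):
    """Average Thresholded Confidence (ATC), sketched.

    Pick the threshold t so that the share of labeled source examples
    with confidence above t matches the source accuracy, then report
    that share on the unlabeled target set as the predicted accuracy.
    """
    source_conf = np.sort(np.asarray(source_conf))
    acc = float(np.mean(source_correct))
    idx = min(int((1.0 - acc) * len(source_conf)), len(source_conf) - 1)
    threshold = source_conf[idx]
    return float(np.mean(np.asarray(target_conf) > threshold))
```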
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood based model-selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.