Context-Aware Online Conformal Anomaly Detection with Prediction-Powered Data Acquisition
- URL: http://arxiv.org/abs/2505.01783v1
- Date: Sat, 03 May 2025 10:58:05 GMT
- Title: Context-Aware Online Conformal Anomaly Detection with Prediction-Powered Data Acquisition
- Authors: Amirmohammad Farzaneh, Osvaldo Simeone
- Abstract summary: We introduce context-aware prediction-powered conformal online anomaly detection (C-PP-COAD). Our framework strategically leverages synthetic calibration data to mitigate data scarcity, while adaptively integrating real data based on contextual cues. Experiments conducted on both synthetic and real-world datasets demonstrate that C-PP-COAD significantly reduces dependency on real calibration data without compromising guaranteed false discovery rate (FDR) control.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Online anomaly detection is essential in fields such as cybersecurity, healthcare, and industrial monitoring, where promptly identifying deviations from expected behavior can avert critical failures or security breaches. While numerous anomaly scoring methods based on supervised or unsupervised learning have been proposed, current approaches typically rely on a continuous stream of real-world calibration data to provide assumption-free guarantees on the false discovery rate (FDR). To address the inherent challenges posed by limited real calibration data, we introduce context-aware prediction-powered conformal online anomaly detection (C-PP-COAD). Our framework strategically leverages synthetic calibration data to mitigate data scarcity, while adaptively integrating real data based on contextual cues. C-PP-COAD utilizes conformal p-values, active p-value statistics, and online FDR control mechanisms to maintain rigorous and reliable anomaly detection performance over time. Experiments conducted on both synthetic and real-world datasets demonstrate that C-PP-COAD significantly reduces dependency on real calibration data without compromising guaranteed FDR control.
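The abstract names two generic building blocks, conformal p-values and online FDR control, that can be illustrated on their own. The following is a minimal sketch under stated assumptions, not the C-PP-COAD procedure: it uses only real calibration scores, the standard split-conformal p-value, and a LOND-style threshold rule as one example of online FDR control. It leaves out synthetic calibration data, context-aware acquisition, and active p-values. The function names `conformal_p_value` and `lond_detect` and the choice of the sequence `gamma_t` are hypothetical, chosen for illustration.

```python
import math


def conformal_p_value(calibration_scores, test_score):
    """Split-conformal p-value; a higher score means more anomalous.

    Counts calibration scores at least as extreme as the test score,
    with the usual +1 correction in numerator and denominator.
    """
    n = len(calibration_scores)
    at_least_as_extreme = sum(1 for s in calibration_scores if s >= test_score)
    return (1 + at_least_as_extreme) / (n + 1)


def lond_detect(p_values, alpha=0.1):
    """LOND-style online testing (illustrative, not the paper's rule).

    At step t the test level is alpha * gamma_t * (discoveries so far + 1),
    where gamma_t = 6 / (pi^2 t^2) is a nonnegative sequence summing to 1.
    """
    discoveries = 0
    flags = []
    for t, p in enumerate(p_values, start=1):
        gamma_t = 6.0 / (math.pi ** 2 * t ** 2)
        alpha_t = alpha * gamma_t * (discoveries + 1)
        is_anomaly = p <= alpha_t
        flags.append(is_anomaly)
        discoveries += int(is_anomaly)
    return flags
```

In use, each incoming anomaly score would be converted to a p-value against the current calibration set and passed to the online rule in arrival order. The finite-sample validity of the p-value relies on the calibration and nominal test data being exchangeable.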
Related papers
- COIN: Uncertainty-Guarding Selective Question Answering for Foundation Models with Provable Risk Guarantees [51.5976496056012]
COIN is an uncertainty-guarding selection framework that calibrates statistically valid thresholds to filter a single generated answer per question.
COIN estimates the empirical error rate on a calibration set and applies confidence interval methods to establish a high-probability upper bound on the true error rate.
We demonstrate COIN's robustness in risk control, strong test-time power in retaining admissible answers, and predictive efficiency under limited calibration data.
arXiv Detail & Related papers (2025-06-25T07:04:49Z)
- WATCH: Adaptive Monitoring for AI Deployments via Weighted-Conformal Martingales [13.807613678989664]
Methods for nonparametric sequential testing -- especially conformal test martingales (CTMs) and anytime-valid inference -- offer promising tools for this monitoring task.
Existing approaches are restricted to monitoring limited hypothesis classes or alarm criteria.
arXiv Detail & Related papers (2025-05-07T17:53:47Z)
- Conformal Segmentation in Industrial Surface Defect Detection with Statistical Guarantees [2.0257616108612373]
In industrial settings, surface defects on steel can significantly compromise its service life and elevate potential safety risks.
Traditional defect detection methods predominantly rely on manual inspection, which suffers from low efficiency and high costs.
We develop a statistically rigorous threshold based on a user-defined risk level to identify high-probability defective pixels in test images.
We demonstrate robust and efficient control over the expected test set error rate across varying calibration-to-test ratios.
arXiv Detail & Related papers (2025-04-24T16:33:56Z)
- Robust Conformal Outlier Detection under Contaminated Reference Data [20.864605211132663]
Conformal prediction is a flexible framework for calibrating machine learning predictions.
In outlier detection, this calibration relies on a reference set of labeled inlier data to control the type-I error rate.
This paper analyzes the impact of contamination on the validity of conformal methods.
arXiv Detail & Related papers (2025-02-07T10:23:25Z)
- Noise-Adaptive Conformal Classification with Marginal Coverage [53.74125453366155]
We introduce an adaptive conformal inference method capable of efficiently handling deviations from exchangeability caused by random label noise.
We validate our method through extensive numerical experiments demonstrating its effectiveness on synthetic and real data sets.
arXiv Detail & Related papers (2025-01-29T23:55:23Z)
- Federated Learning for Efficient Condition Monitoring and Anomaly Detection in Industrial Cyber-Physical Systems [0.30723404270319693]
This paper introduces an enhanced FL framework with three key innovations: adaptive model aggregation based on sensor reliability, dynamic node selection for resource optimization, and Weibull-based checkpointing for fault tolerance.
Experiments on the NASA Bearing and Hydraulic System datasets demonstrate superior performance compared to state-of-the-art FL methods, achieving 99.5% AUC-ROC in anomaly detection and maintaining accuracy even under node failures.
arXiv Detail & Related papers (2025-01-28T03:04:47Z)
- Reshaping the Online Data Buffering and Organizing Mechanism for Continual Test-Time Adaptation [49.53202761595912]
Continual Test-Time Adaptation involves adapting a pre-trained source model to continually changing unsupervised target domains.
We analyze the challenges of this task: online environment, unsupervised nature, and the risks of error accumulation and catastrophic forgetting.
We propose an uncertainty-aware buffering approach to identify and aggregate significant samples with high certainty from the unsupervised, single-pass data stream.
arXiv Detail & Related papers (2024-07-12T15:48:40Z)
- Leave-One-Out-, Bootstrap- and Cross-Conformal Anomaly Detectors [0.0]
In this work, we formally define and evaluate leave-one-out-, bootstrap-, and cross-conformal methods for anomaly detection.
We demonstrate that derived methods for calculating resampling-conformal $p$-values strike a practical compromise between statistical efficiency (full-conformal) and computational efficiency (split-conformal) as they make more efficient use of available data.
arXiv Detail & Related papers (2024-02-26T08:22:40Z)
- PAC-Based Formal Verification for Out-of-Distribution Data Detection [4.406331747636832]
This study places probably approximately correct (PAC) based guarantees on OOD detection using the encoding process within VAEs.
It is used to bound the detection error on unfamiliar instances with user-defined confidence.
arXiv Detail & Related papers (2023-04-04T07:33:02Z)
- Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class.
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
- Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training [55.824641135682725]
Domain adaptation experiments using WSJ as a source domain and TED-LIUM 3 as well as SWITCHBOARD show that up to 80% of the performance of a system trained on ground-truth data can be recovered.
arXiv Detail & Related papers (2020-11-26T18:51:26Z)
- Provably Efficient Causal Reinforcement Learning with Confounded Observational Data [135.64775986546505]
We study how to incorporate the dataset (observational data) collected offline, which is often abundantly available in practice, to improve the sample efficiency in the online setting.
We propose the deconfounded optimistic value iteration (DOVI) algorithm, which incorporates the confounded observational data in a provably efficient manner.
arXiv Detail & Related papers (2020-06-22T14:49:33Z)