Calibrated Prediction Set in Fault Detection with Risk Guarantees via Significance Tests
- URL: http://arxiv.org/abs/2508.01208v1
- Date: Sat, 02 Aug 2025 05:49:02 GMT
- Title: Calibrated Prediction Set in Fault Detection with Risk Guarantees via Significance Tests
- Authors: Mingchen Mei, Yi Li, YiYao Qian, Zijun Jia
- Abstract summary: This paper proposes a novel fault detection method that integrates significance testing with the conformal prediction framework to provide formal risk guarantees. The proposed method consistently achieves an empirical coverage rate at or above the nominal level ($1-\alpha$). The results reveal a controllable trade-off between the user-defined risk level ($\alpha$) and efficiency, where higher risk tolerance leads to smaller average prediction set sizes.
- Score: 3.500936878570599
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Fault detection is crucial for ensuring the safety and reliability of modern industrial systems. However, a significant scientific challenge is the lack of rigorous risk control and reliable uncertainty quantification in existing diagnostic models, particularly when facing complex scenarios such as distributional shifts. To address this issue, this paper proposes a novel fault detection method that integrates significance testing with the conformal prediction framework to provide formal risk guarantees. The method transforms fault detection into a hypothesis testing task by defining a nonconformity measure based on model residuals. It then leverages a calibration dataset to compute p-values for new samples, which are used to construct prediction sets mathematically guaranteed to contain the true label with a user-specified probability, $1-\alpha$. Fault classification is subsequently performed by analyzing the intersection of the constructed prediction set with predefined normal and fault label sets. Experimental results on cross-domain fault diagnosis tasks validate the theoretical properties of our approach. The proposed method consistently achieves an empirical coverage rate at or above the nominal level ($1-\alpha$), demonstrating robustness even when the underlying point-prediction models perform poorly. Furthermore, the results reveal a controllable trade-off between the user-defined risk level ($\alpha$) and efficiency, where higher risk tolerance leads to smaller average prediction set sizes. This research contributes a theoretically grounded framework for fault detection that enables explicit risk control, enhancing the trustworthiness of diagnostic systems in safety-critical applications and advancing the field from simple point predictions to informative, uncertainty-aware outputs.
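As a rough illustration of the pipeline the abstract describes, the sketch below constructs a conformal prediction set from calibration nonconformity scores and classifies by intersecting that set with the normal and fault label sets. The helper names (`nonconformity`, `cal_scores_by_label`) and the handling of ambiguous intersections are assumptions for illustration, not the paper's exact formulation:

```python
import numpy as np

def conformal_prediction_set(nonconformity, x, cal_scores_by_label, alpha, labels):
    """Keep every label whose conformal p-value exceeds alpha.

    Under exchangeability of calibration and test data, the returned set
    contains the true label with probability at least 1 - alpha.
    """
    pred_set = set()
    for y in labels:
        cal = cal_scores_by_label[y]                 # calibration scores for label y
        s = nonconformity(x, y)                      # e.g., a residual-based score
        p = (np.sum(cal >= s) + 1) / (len(cal) + 1)  # conformal p-value
        if p > alpha:
            pred_set.add(y)
    return pred_set

def classify(pred_set, normal_labels, fault_labels):
    """Decide by intersecting the prediction set with the label groups."""
    if pred_set & fault_labels and not pred_set & normal_labels:
        return "fault"
    if pred_set & normal_labels and not pred_set & fault_labels:
        return "normal"
    return "ambiguous"  # overlaps both groups (or neither): defer to inspection
```

Sweeping `alpha` in this sketch reproduces the trade-off the abstract reports: a larger risk tolerance admits fewer labels per set, at the cost of weaker coverage.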
Related papers
- COIN: Uncertainty-Guarding Selective Question Answering for Foundation Models with Provable Risk Guarantees [51.5976496056012]
COIN is an uncertainty-guarding selection framework that calibrates statistically valid thresholds to filter a single generated answer per question. COIN estimates the empirical error rate on a calibration set and applies confidence interval methods to establish a high-probability upper bound on the true error rate. We demonstrate COIN's robustness in risk control, strong test-time power in retaining admissible answers, and predictive efficiency under limited calibration data.
arXiv Detail & Related papers (2025-06-25T07:04:49Z)
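The COIN summary mentions bounding the true error rate with a confidence interval over calibration errors. The blurb does not name the interval construction; a Clopper-Pearson bound is one standard "confidence interval method" for a binomial error rate, sketched here under that assumption:

```python
from scipy.stats import beta

def clopper_pearson_upper(k, n, delta=0.05):
    """One-sided (1 - delta) upper confidence bound on the true error rate,
    given k observed errors among n calibration samples."""
    if k >= n:
        return 1.0
    return beta.ppf(1 - delta, k + 1, n - k)

# Example: bound the error rate after observing 12 errors on 400 answers.
# ub = clopper_pearson_upper(12, 400)  # one-sided 95% upper bound
```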
- Conformal Segmentation in Industrial Surface Defect Detection with Statistical Guarantees [2.0257616108612373]
In industrial settings, surface defects on steel can significantly compromise its service life and elevate potential safety risks. Traditional defect detection methods predominantly rely on manual inspection, which suffers from low efficiency and high costs. We develop a statistically rigorous threshold based on a user-defined risk level to identify high-probability defective pixels in test images. We demonstrate robust and efficient control over the expected test set error rate across varying calibration-to-test ratios.
arXiv Detail & Related papers (2025-04-24T16:33:56Z)
- Data-Driven Calibration of Prediction Sets in Large Vision-Language Models Based on Inductive Conformal Prediction [0.0]
We propose a model-agnostic uncertainty quantification method that integrates dynamic threshold calibration and cross-modal consistency verification. We show that the framework achieves stable performance across varying calibration-to-test split ratios, underscoring its robustness for real-world deployment in healthcare, autonomous systems, and other safety-sensitive domains. This work bridges the gap between theoretical reliability and practical applicability in multi-modal AI systems, offering a scalable solution for hallucination detection and uncertainty-aware decision-making.
arXiv Detail & Related papers (2025-04-24T15:39:46Z)
- TrustLoRA: Low-Rank Adaptation for Failure Detection under Out-of-distribution Data [62.22804234013273]
We propose a simple failure detection framework to unify and facilitate classification with rejection under both covariate and semantic shifts. Our key insight is that by separating and consolidating failure-specific reliability knowledge with low-rank adapters, we can enhance the failure detection ability effectively and flexibly.
arXiv Detail & Related papers (2025-04-20T09:20:55Z)
- SConU: Selective Conformal Uncertainty in Large Language Models [59.25881667640868]
We propose a novel approach termed Selective Conformal Uncertainty (SConU). We develop two conformal p-values that are instrumental in determining whether a given sample deviates from the uncertainty distribution of the calibration set at a specific manageable risk level. Our approach not only facilitates rigorous management of miscoverage rates across both single-domain and interdisciplinary contexts, but also enhances the efficiency of predictions.
arXiv Detail & Related papers (2025-04-19T03:01:45Z)
- Coverage-Guaranteed Speech Emotion Recognition via Calibrated Uncertainty-Adaptive Prediction Sets [0.0]
Road rage, often triggered by emotional suppression and sudden outbursts, significantly threatens road safety by causing collisions and aggressive behavior. Speech emotion recognition technologies can mitigate this risk by identifying negative emotions early and issuing timely alerts. We propose a novel risk-controlled prediction framework providing statistically rigorous guarantees on prediction accuracy.
arXiv Detail & Related papers (2025-03-24T12:26:28Z)
- Conformal Generative Modeling with Improved Sample Efficiency through Sequential Greedy Filtering [55.15192437680943]
Generative models lack rigorous statistical guarantees for their outputs. We propose a sequential conformal prediction method producing prediction sets that satisfy a rigorous statistical guarantee. This guarantee states that with high probability, the prediction sets contain at least one admissible (or valid) example.
arXiv Detail & Related papers (2024-10-02T15:26:52Z)
- Leave-One-Out-, Bootstrap- and Cross-Conformal Anomaly Detectors [0.0]
In this work, we formally define and evaluate leave-one-out-, bootstrap-, and cross-conformal methods for anomaly detection. We demonstrate that the derived methods for calculating resampling-conformal $p$-values strike a practical compromise between statistical efficiency (full-conformal) and computational efficiency (split-conformal), as they make more efficient use of available data.
arXiv Detail & Related papers (2024-02-26T08:22:40Z)
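For the resampling-conformal $p$-values mentioned above, a minimal cross-conformal sketch follows; `fit_score` (a factory returning a nonconformity scorer fit on the training folds) is a hypothetical interface, and the aggregation uses the usual pooled-rank form:

```python
import numpy as np
from sklearn.model_selection import KFold

def cross_conformal_pvalue(fit_score, X, x_test, n_splits=5, seed=0):
    """Cross-conformal p-value: each fold's detector is fit on the remaining
    folds and scores both the held-out fold and the test point, so every
    training sample contributes a calibration score."""
    num, den = 0, 0
    for train_idx, cal_idx in KFold(n_splits, shuffle=True, random_state=seed).split(X):
        score = fit_score(X[train_idx])       # scorer fit without the held-out fold
        cal_scores = score(X[cal_idx])        # nonconformity of held-out samples
        s_test = score(x_test[None, :])[0]    # nonconformity of the test point
        num += np.sum(cal_scores >= s_test)
        den += len(cal_scores)
    return (num + 1) / (den + 1)
```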
- B-BACN: Bayesian Boundary-Aware Convolutional Network for Crack Characterization [4.447467536572625]
Quantifying the uncertainty of crack detection is challenging due to various factors, such as measurement noise, signal processing, and model simplifications.
A machine learning-based approach is proposed to quantify both epistemic and aleatoric uncertainties concurrently.
We introduce a Boundary-Aware Convolutional Network (B-BACN) that emphasizes uncertainty-aware boundary refinement to generate precise and reliable crack boundary detections.
arXiv Detail & Related papers (2023-02-14T04:50:42Z)
- A Review of Uncertainty Calibration in Pretrained Object Detectors [5.440028715314566]
We investigate the uncertainty calibration properties of different pretrained object detection architectures in a multi-class setting.
We propose a framework to ensure a fair, unbiased, and repeatable evaluation.
We deliver novel insights into why poor detector calibration emerges.
arXiv Detail & Related papers (2022-10-06T14:06:36Z)
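The review above studies detector calibration; as background, the expected calibration error (ECE) commonly reported in such evaluations can be computed with a short binning routine. This is a generic sketch, not the paper's specific evaluation framework:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Standard ECE: bin detections by confidence, then average the gap
    between mean confidence and empirical accuracy, weighted by bin size."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap
    return ece
```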
- Bayesian autoencoders with uncertainty quantification: Towards trustworthy anomaly detection [78.24964622317634]
In this work, the formulation of Bayesian autoencoders (BAEs) is adopted to quantify the total anomaly uncertainty.
To evaluate the quality of uncertainty, we consider the task of classifying anomalies with the additional option of rejecting predictions of high uncertainty.
Our experiments demonstrate the effectiveness of the BAE and total anomaly uncertainty on a set of benchmark datasets and two real datasets for manufacturing.
arXiv Detail & Related papers (2022-02-25T12:20:04Z)
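The BAE paper above evaluates uncertainty quality by classifying anomalies with the option of rejecting high-uncertainty predictions. A minimal accuracy-rejection sketch, with all array names assumed, looks like:

```python
import numpy as np

def accuracy_with_rejection(uncertainty, y_pred, y_true, reject_frac=0.1):
    """Reject the most uncertain predictions and report accuracy on the rest.
    Sweeping reject_frac traces an accuracy-rejection curve: accuracy should
    rise as more uncertain cases are rejected if the uncertainty is informative."""
    n_keep = len(y_true) - int(len(y_true) * reject_frac)
    keep = np.argsort(uncertainty)[:n_keep]  # indices of most confident predictions
    return np.mean(y_pred[keep] == y_true[keep])
```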
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)