Related papers: Domain-Shift-Aware Conformal Prediction for Large Language Models

Domain-Shift-Aware Conformal Prediction for Large Language Models

URL: http://arxiv.org/abs/2510.05566v1
Date: Tue, 07 Oct 2025 04:22:06 GMT
Title: Domain-Shift-Aware Conformal Prediction for Large Language Models
Authors: Zhexiao Lin, Yuanyuan Li, Neeraj Sarna, Yuanyuan Gao, Michael von Gablenz,
Abstract summary: We propose a new framework called Domain-Shift-Aware Conformal Prediction (DS-CP)<n>Our framework adapts conformal prediction to large language models under domain shift, by systematically reweighting calibration samples.<n>Our theoretical analysis and experiments on the MMLU benchmark demonstrate that the proposed method delivers more reliable coverage than standard conformal prediction.
Score: 8.620363085499243
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models have achieved impressive performance across diverse tasks. However, their tendency to produce overconfident and factually incorrect outputs, known as hallucinations, poses risks in real world applications. Conformal prediction provides finite-sample, distribution-free coverage guarantees, but standard conformal prediction breaks down under domain shift, often leading to under-coverage and unreliable prediction sets. We propose a new framework called Domain-Shift-Aware Conformal Prediction (DS-CP). Our framework adapts conformal prediction to large language models under domain shift, by systematically reweighting calibration samples based on their proximity to the test prompt, thereby preserving validity while enhancing adaptivity. Our theoretical analysis and experiments on the MMLU benchmark demonstrate that the proposed method delivers more reliable coverage than standard conformal prediction, especially under substantial distribution shifts, while maintaining efficiency. This provides a practical step toward trustworthy uncertainty quantification for large language models in real-world deployment.

Related papers

Uncertainty Quantification for Named Entity Recognition via Full-Sequence and Subsequence Conformal Prediction [0.0]
We introduce a general framework for adapting sequence-labeling-based NER models to produce uncertainty-aware prediction sets.<n>Prediction sets are collections of full-sentence labelings guaranteed to contain the correct labeling with a user-specified confidence level.
arXiv Detail & Related papers (2026-01-13T18:00:08Z)
Distribution-informed Online Conformal Prediction [53.674678995825666]
We propose Conformal Optimistic Prediction (COP), an online conformal prediction algorithm incorporating underlying data pattern into the update rule.<n>COP produces tighter prediction sets when predictable pattern exists, while retaining valid coverage guarantees even when estimates are inaccurate.<n>We prove that COP can achieve valid coverage and construct shorter prediction intervals than other baselines.
arXiv Detail & Related papers (2025-12-08T17:51:49Z)
Uncalibrated Reasoning: GRPO Induces Overconfidence for Stochastic Outcomes [55.2480439325792]
Reinforcement learning (RL) has proven remarkably effective at improving the accuracy of language models in verifiable and deterministic domains like mathematics.<n>Here, we examine if current RL methods are also effective at optimizing language models in verifiable domains with outcomes, like scientific experiments.
arXiv Detail & Related papers (2025-08-15T20:50:53Z)
Conformal Prediction Adaptive to Unknown Subpopulation Shifts [11.046912341345294]
Conformal prediction is widely used to equip black-box machine learning models with uncertainty quantification enjoying formal coverage guarantees.<n>In this work, we address subpopulation shifts, where the test environment exhibits an unknown and differing mixture of subpopulations compared to the calibration data.<n>We propose new methods that provably adapt conformal prediction to such shifts, ensuring valid coverage without requiring explicit knowledge of subpopulation structure.
arXiv Detail & Related papers (2025-06-05T20:58:39Z)
JAPAN: Joint Adaptive Prediction Areas with Normalising-Flows [7.200880964149064]
Conformal prediction provides a model-agnostic framework for uncertainty quantification with finite-sample validity guarantees.<n>Existing approaches commonly rely on residual-based conformity scores, which impose geometric constraints.<n>We introduce JAPAN (Joint Adaptive Prediction Areas with Normalising-Flows), a conformal prediction framework that uses density-based conformity scores.
arXiv Detail & Related papers (2025-05-29T07:34:51Z)
SConU: Selective Conformal Uncertainty in Large Language Models [59.25881667640868]
We propose a novel approach termed Selective Conformal Uncertainty (SConU)<n>We develop two conformal p-values that are instrumental in determining whether a given sample deviates from the uncertainty distribution of the calibration set at a specific manageable risk level.<n>Our approach not only facilitates rigorous management of miscoverage rates across both single-domain and interdisciplinary contexts, but also enhances the efficiency of predictions.
arXiv Detail & Related papers (2025-04-19T03:01:45Z)
Beyond Conformal Predictors: Adaptive Conformal Inference with Confidence Predictors [1.3812010983144802]
This study shows that the desirable properties of Adaptive Conformal Inference (ACI) do not require the use of Conformal Predictors (CP)<n>We empirically investigate the performance of Non-Conformal Confidence Predictors (NCCP) against CP when used with ACI on non-exchangeable data.
arXiv Detail & Related papers (2024-09-23T21:02:33Z)
Identifying and Mitigating Social Bias Knowledge in Language Models [52.52955281662332]
We propose a novel debiasing approach, Fairness Stamp (FAST), which enables fine-grained calibration of individual social biases.<n>FAST surpasses state-of-the-art baselines with superior debiasing performance.<n>This highlights the potential of fine-grained debiasing strategies to achieve fairness in large language models.
arXiv Detail & Related papers (2024-08-07T17:14:58Z)
Probabilistic Conformal Prediction with Approximate Conditional Validity [81.30551968980143]
We develop a new method for generating prediction sets that combines the flexibility of conformal methods with an estimate of the conditional distribution. Our method consistently outperforms existing approaches in terms of conditional coverage.
arXiv Detail & Related papers (2024-07-01T20:44:48Z)
Non-Exchangeable Conformal Language Generation with Nearest Neighbors [12.790082627386482]
Non-exchangeable conformal nucleus sampling is a novel extension of the conformal prediction framework to generation based on nearest neighbors. Our method can be used post-hoc for an arbitrary model without extra training and supplies token-level, calibrated prediction sets equipped with statistical guarantees.
arXiv Detail & Related papers (2024-02-01T16:04:04Z)
Multiclass Alignment of Confidence and Certainty for Network Calibration [10.15706847741555]
Recent studies reveal that deep neural networks (DNNs) are prone to making overconfident predictions. We propose a new train-time calibration method, which features a simple, plug-and-play auxiliary loss known as multi-class alignment of predictive mean confidence and predictive certainty (MACC) Our method achieves state-of-the-art calibration performance for both in-domain and out-domain predictions.
arXiv Detail & Related papers (2023-09-06T00:56:24Z)
When Does Confidence-Based Cascade Deferral Suffice? [69.28314307469381]
Cascades are a classical strategy to enable inference cost to vary adaptively across samples. A deferral rule determines whether to invoke the next classifier in the sequence, or to terminate prediction. Despite being oblivious to the structure of the cascade, confidence-based deferral often works remarkably well in practice.
arXiv Detail & Related papers (2023-07-06T04:13:57Z)
Federated Conformal Predictors for Distributed Uncertainty Quantification [83.50609351513886]
Conformal prediction is emerging as a popular paradigm for providing rigorous uncertainty quantification in machine learning. In this paper, we extend conformal prediction to the federated learning setting. We propose a weaker notion of partial exchangeability, better suited to the FL setting, and use it to develop the Federated Conformal Prediction framework.
arXiv Detail & Related papers (2023-05-27T19:57:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.