When Models Know When They Do Not Know: Calibration, Cascading, and Cleaning
- URL: http://arxiv.org/abs/2601.07965v1
- Date: Mon, 12 Jan 2026 19:59:03 GMT
- Title: When Models Know When They Do Not Know: Calibration, Cascading, and Cleaning
- Authors: Chenjie Hao, Weyl Lu, Yuko Ishiwaka, Zengyi Li, Weier Wan, Yubei Chen
- Abstract summary: A promising approach is to use confidence, computed from the model's internal signals, to reflect its ignorance. We propose a simple, effective, and universal training-free method that applies to both vision and language models. Our results demonstrate that enabling models to recognize when they do not know is a practical step toward more efficient, reliable, and trustworthy AI.
- Score: 10.585100830578934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When a model knows when it does not know, many possibilities emerge. The first question is how to enable a model to recognize that it does not know. A promising approach is to use confidence, computed from the model's internal signals, to reflect its ignorance. Prior work in specific domains has shown that calibration can provide reliable confidence estimates. In this work, we propose a simple, effective, and universal training-free method that applies to both vision and language models, performing model calibration, cascading, and data cleaning to better exploit a model's ability to recognize when it does not know. We first highlight two key empirical observations: higher confidence corresponds to higher accuracy within a single model, and models calibrated on the validation set remain calibrated on a held-out test set. These findings empirically establish the reliability and comparability of calibrated confidence. Building on this, we introduce two applications: (1) model cascading with calibrated advantage routing and (2) data cleaning based on model ensemble. Using the routing signal derived from the comparability of calibrated confidences, we cascade large and small models to improve efficiency with almost no compromise in accuracy, and we further cascade two models of comparable scale to achieve performance beyond either model alone. Leveraging multiple experts and their calibrated confidences, we design a simple yet effective data-cleaning method that balances precision and detection rate to identify mislabeled samples in ImageNet and Massive Multitask Language Understanding (MMLU) datasets. Our results demonstrate that enabling models to recognize when they do not know is a practical step toward more efficient, reliable, and trustworthy AI.
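The abstract only sketches the pipeline, so the snippet below gives a minimal, hypothetical Python rendering of the three ingredients it names: calibration on a validation set (here via temperature scaling as a stand-in calibrator), cascading a small and a large model (here via a plain confidence threshold rather than the paper's calibrated advantage routing), and ensemble-based label cleaning (here via an agreement-plus-confidence rule). All function names, the threshold `tau`, and the cleaning rule are illustrative assumptions, not the authors' actual method.

```python
# Minimal, hypothetical sketch (not the paper's implementation):
# temperature-scaling calibration, confidence-threshold cascading,
# and ensemble-agreement label cleaning.
import numpy as np
from scipy.optimize import minimize_scalar


def fit_temperature(val_logits, val_labels):
    """Fit a single temperature on validation logits by minimizing NLL.

    Temperature scaling is a common training-free calibration recipe;
    the paper's actual calibration procedure may differ.
    """
    def nll(log_t):
        z = val_logits / np.exp(log_t)
        z -= z.max(axis=1, keepdims=True)
        log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
        return -log_probs[np.arange(len(val_labels)), val_labels].mean()

    res = minimize_scalar(nll, bounds=(-3.0, 3.0), method="bounded")
    return float(np.exp(res.x))


def calibrated_confidence(logits, temperature):
    """Return (max softmax probability, predicted class) after scaling."""
    z = logits / temperature
    z -= z.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return probs.max(axis=1), probs.argmax(axis=1)


def cascade_predict(small_logits, large_logits, t_small, tau):
    """Keep the small model's answer when its calibrated confidence exceeds
    the (hypothetical) threshold `tau`; otherwise defer to the large model.
    This is a plain confidence threshold, not the paper's calibrated
    advantage routing signal.
    """
    conf_s, pred_s = calibrated_confidence(small_logits, t_small)
    pred_l = large_logits.argmax(axis=1)
    deferred = conf_s < tau
    return np.where(deferred, pred_l, pred_s), deferred.mean()


def flag_mislabeled(expert_probs, given_labels, conf_thresh=0.9):
    """Flag samples where all calibrated experts agree on a label that
    differs from the dataset label, with high mean confidence.  A simplified
    stand-in for the paper's ensemble-based cleaning rule.
    """
    preds = np.stack([p.argmax(axis=1) for p in expert_probs])
    agree = (preds == preds[0]).all(axis=0)
    mean_conf = np.mean([p.max(axis=1) for p in expert_probs], axis=0)
    return agree & (preds[0] != given_labels) & (mean_conf > conf_thresh)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    val_logits = rng.normal(size=(200, 10))
    val_labels = rng.integers(0, 10, size=200)
    t = fit_temperature(val_logits, val_labels)
    preds, defer_rate = cascade_predict(rng.normal(size=(50, 10)),
                                        rng.normal(size=(50, 10)),
                                        t_small=t, tau=0.6)
    print(f"temperature={t:.2f}, deferred {defer_rate:.0%} of queries")
```

In this toy version the only free quantities are the fitted temperature and the routing threshold, which keeps the cascade training-free in the same spirit as the paper, but the exact routing and cleaning criteria should be taken from the paper itself.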
Related papers
- On Calibration of Large Language Models: From Response To Capability [66.59139960234326]
Large language models (LLMs) are widely deployed as general-purpose problem solvers. We introduce capability calibration, which targets the model's expected accuracy on a query. Our results demonstrate that capability-calibrated confidence improves pass@$k$ prediction and inference budget allocation.
arXiv Detail & Related papers (2026-02-14T01:07:45Z)
- Calibrating Large Language Models with Sample Consistency [76.23956851098598]
We explore the potential of deriving confidence from the distribution of multiple randomly sampled model generations, via three measures of consistency.
Results show that consistency-based calibration methods outperform existing post-hoc approaches.
We offer practical guidance on choosing suitable consistency metrics for calibration, tailored to the characteristics of various LMs.
arXiv Detail & Related papers (2024-02-21T16:15:20Z)
- Selective Learning: Towards Robust Calibration with Dynamic Regularization [79.92633587914659]
Miscalibration in deep learning refers to a discrepancy between a model's predicted confidence and its actual performance.
We introduce Dynamic Regularization (DReg), which aims to learn what should be learned during training, thereby circumventing the confidence-adjustment trade-off.
arXiv Detail & Related papers (2024-02-13T11:25:20Z)
- Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision language models.
We show that both OOD classification and OOD calibration errors share an upper bound consisting of two terms computed on ID data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z)
- Calibrated Interpretation: Confidence Estimation in Semantic Parsing [37.28245521206576]
We investigate the calibration of popular generation models across four popular semantic parsing datasets.
We analyze factors associated with calibration error and release new confidence-based challenge splits of two parsing datasets.
arXiv Detail & Related papers (2022-11-14T15:17:55Z)
- Calibration Meets Explanation: A Simple and Effective Approach for Model Confidence Estimates [21.017890579840145]
We propose a method named CME that leverages model explanations to make the model less confident with non-inductive attributions.
We conduct extensive experiments on six datasets with two popular pre-trained language models.
Our findings highlight that model explanations can help calibrate posterior estimates.
arXiv Detail & Related papers (2022-11-06T06:17:21Z)
- Reliability-Aware Prediction via Uncertainty Learning for Person Image Retrieval [51.83967175585896]
UAL aims at providing reliability-aware predictions by considering data uncertainty and model uncertainty simultaneously.
Data uncertainty captures the "noise" inherent in the sample, while model uncertainty depicts the model's confidence in the sample's prediction.
arXiv Detail & Related papers (2022-10-24T17:53:20Z)
- Is my Driver Observation Model Overconfident? Input-guided Calibration Networks for Reliable and Interpretable Confidence Estimates [23.449073032842076]
Driver observation models are rarely deployed under perfect conditions.
We show that raw neural network-based approaches tend to significantly overestimate their prediction quality.
We introduce Calibrated Action Recognition with Input Guidance (CARING), a novel approach that uses an additional neural network to learn to scale the confidences depending on the video representation.
arXiv Detail & Related papers (2022-04-10T12:43:58Z)
- Uncertainty-sensitive Activity Recognition: a Reliability Benchmark and the CARING Models [37.60817779613977]
We present the first study of how well the confidence values of modern action recognition architectures reflect the probability of a correct outcome.
We introduce a new approach which learns to transform the model output into realistic confidence estimates through an additional calibration network.
arXiv Detail & Related papers (2021-01-02T15:41:21Z)
- How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering [80.82194311274694]
We examine the question "how can we know when language models know, with confidence, the answer to a particular query?"
We examine three strong generative models -- T5, BART, and GPT-2 -- and study whether their probabilities on QA tasks are well calibrated.
We then examine methods to calibrate such models to make their confidence scores correlate better with the likelihood of correctness.
arXiv Detail & Related papers (2020-12-02T03:53:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.