The Calibration Gap between Model and Human Confidence in Large Language
Models
- URL: http://arxiv.org/abs/2401.13835v1
- Date: Wed, 24 Jan 2024 22:21:04 GMT
- Title: The Calibration Gap between Model and Human Confidence in Large Language
Models
- Authors: Mark Steyvers, Heliodoro Tejeda, Aakriti Kumar, Catarina Belem, Sheer
Karny, Xinyue Hu, Lukas Mayer, Padhraic Smyth
- Abstract summary: Large language models (LLMs) need to be well-calibrated in the sense that they can accurately assess and communicate how likely it is that their predictions are correct.
Recent work has focused on the quality of internal LLM confidence assessments.
This paper explores the disparity between external human confidence in an LLM's responses and the internal confidence of the model.
- Score: 14.539888672603743
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: For large language models (LLMs) to be trusted by humans they need to be
well-calibrated in the sense that they can accurately assess and communicate
how likely it is that their predictions are correct. Recent work has focused on
the quality of internal LLM confidence assessments, but the question remains of
how well LLMs can communicate this internal model confidence to human users.
This paper explores the disparity between external human confidence in an LLM's
responses and the internal confidence of the model. Through experiments
involving multiple-choice questions, we systematically examine human users'
ability to discern the reliability of LLM outputs. Our study focuses on two key
areas: (1) assessing users' perception of true LLM confidence and (2)
investigating the impact of tailored explanations on this perception. The
research highlights that default explanations from LLMs often lead to user
overestimation of both the model's confidence and its' accuracy. By modifying
the explanations to more accurately reflect the LLM's internal confidence, we
observe a significant shift in user perception, aligning it more closely with
the model's actual confidence levels. This adjustment in explanatory approach
demonstrates potential for enhancing user trust and accuracy in assessing LLM
outputs. The findings underscore the importance of transparent communication of
confidence levels in LLMs, particularly in high-stakes applications where
understanding the reliability of AI-generated information is essential.
Related papers
- On Verbalized Confidence Scores for LLMs [25.160810008907397]
Uncertainty quantification for large language models (LLMs) can establish more human trust into their responses.
This work focuses on asking the LLM itself to verbalize its uncertainty with a confidence score as part of its output tokens.
We assess the reliability of verbalized confidence scores with respect to different datasets, models, and prompt methods.
arXiv Detail & Related papers (2024-12-19T11:10:36Z) - Learning to Route LLMs with Confidence Tokens [43.63392143501436]
We study the extent to which large language models can reliably indicate confidence in their answers.
We propose Self-REF, a lightweight training strategy to teach LLMs to express confidence in a reliable manner.
Compared to conventional approaches such as verbalizing confidence and examining token probabilities, we demonstrate empirically that confidence tokens show significant improvements in downstream routing and rejection learning tasks.
arXiv Detail & Related papers (2024-10-17T07:28:18Z) - Understanding Knowledge Drift in LLMs through Misinformation [11.605377799885238]
Large Language Models (LLMs) have revolutionized numerous applications, making them an integral part of our digital ecosystem.
We analyze the susceptibility of state-of-the-art LLMs to factual inaccuracies when they encounter false information in a QnA scenario.
Our experiments reveal that an LLM's uncertainty can increase up to 56.6% when the question is answered incorrectly.
arXiv Detail & Related papers (2024-09-11T08:11:16Z) - Understanding the Relationship between Prompts and Response Uncertainty in Large Language Models [55.332004960574004]
Large language models (LLMs) are widely used in decision-making, but their reliability, especially in critical tasks like healthcare, is not well-established.
This paper investigates how the uncertainty of responses generated by LLMs relates to the information provided in the input prompt.
We propose a prompt-response concept model that explains how LLMs generate responses and helps understand the relationship between prompts and response uncertainty.
arXiv Detail & Related papers (2024-07-20T11:19:58Z) - LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models [69.68379406317682]
We introduce a listener-aware finetuning method (LACIE) to calibrate implicit and explicit confidence markers.
We show that LACIE models the listener, considering not only whether an answer is right, but whether it will be accepted by a listener.
We find that training with LACIE results in 47% fewer incorrect answers being accepted while maintaining the same level of acceptance for correct answers.
arXiv Detail & Related papers (2024-05-31T17:16:38Z) - SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales [29.33581578047835]
SaySelf is a training framework that teaches large language models to express more accurate fine-grained confidence estimates.
In addition, SaySelf directs LLMs to produce self-reflective rationales that clearly identify gaps in their parametric knowledge.
We show that the generated self-reflective rationales are reasonable and can further contribute to the calibration.
arXiv Detail & Related papers (2024-05-31T16:21:16Z) - "I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust [51.542856739181474]
We show how different natural language expressions of uncertainty impact participants' reliance, trust, and overall task performance.
We find that first-person expressions decrease participants' confidence in the system and tendency to agree with the system's answers, while increasing participants' accuracy.
Our findings suggest that using natural language expressions of uncertainty may be an effective approach for reducing overreliance on LLMs, but that the precise language used matters.
arXiv Detail & Related papers (2024-05-01T16:43:55Z) - Calibrating Large Language Models Using Their Generations Only [44.26441565763495]
APRICOT is a method to set confidence targets and train an additional model that predicts an LLM's confidence based on its textual input and output alone.
It is conceptually simple, does not require access to the target model beyond its output, does not interfere with the language generation, and has a multitude of potential usages.
We show how our approach performs competitively in terms of calibration error for white-box and black-box LLMs on closed-book question-answering to detect incorrect LLM answers.
arXiv Detail & Related papers (2024-03-09T17:46:24Z) - Fact-and-Reflection (FaR) Improves Confidence Calibration of Large Language Models [84.94220787791389]
We propose Fact-and-Reflection (FaR) prompting, which improves the LLM calibration in two steps.
Experiments show that FaR achieves significantly better calibration; it lowers the Expected Error by 23.5%.
FaR even elicits the capability of verbally expressing concerns in less confident scenarios.
arXiv Detail & Related papers (2024-02-27T01:37:23Z) - Improving the Reliability of Large Language Models by Leveraging
Uncertainty-Aware In-Context Learning [76.98542249776257]
Large-scale language models often face the challenge of "hallucination"
We introduce an uncertainty-aware in-context learning framework to empower the model to enhance or reject its output in response to uncertainty.
arXiv Detail & Related papers (2023-10-07T12:06:53Z) - Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs [60.61002524947733]
Previous confidence elicitation methods rely on white-box access to internal model information or model fine-tuning.
This leads to a growing need to explore the untapped area of black-box approaches for uncertainty estimation.
We define a systematic framework with three components: prompting strategies for eliciting verbalized confidence, sampling methods for generating multiple responses, and aggregation techniques for computing consistency.
arXiv Detail & Related papers (2023-06-22T17:31:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.