Confidence Calibration and Rationalization for LLMs via Multi-Agent Deliberation
- URL: http://arxiv.org/abs/2404.09127v3
- Date: Fri, 10 May 2024 16:38:23 GMT
- Title: Confidence Calibration and Rationalization for LLMs via Multi-Agent Deliberation
- Authors: Ruixin Yang, Dheeraj Rajagopal, Shirley Anugrah Hayati, Bin Hu, Dongyeop Kang
- Abstract summary: Existing calibration methods for large language models (LLMs) focus on estimating or eliciting individual confidence without taking full advantage of the "Collective Wisdom".
We propose Collaborative Calibration, a post-hoc training-free calibration strategy that leverages the collaborative and expressive capabilities of multiple tool-augmented LLM agents in a simulated group deliberation process.
- Score: 18.815226646364476
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Uncertainty estimation is a significant issue for current large language models (LLMs) that are generally poorly calibrated and over-confident, especially with reinforcement learning from human feedback (RLHF). Unlike humans, whose decisions and confidences not only stem from intrinsic beliefs but can also be adjusted through daily observations, existing calibration methods for LLMs focus on estimating or eliciting individual confidence without taking full advantage of the "Collective Wisdom": the interaction among multiple LLMs that can collectively improve both accuracy and calibration. In this work, we propose Collaborative Calibration, a post-hoc training-free calibration strategy that leverages the collaborative and expressive capabilities of multiple tool-augmented LLM agents in a simulated group deliberation process. We demonstrate the effectiveness of Collaborative Calibration on generative QA tasks across various domains, showing its potential in harnessing the rationalization of collectively calibrated confidence assessments and improving the reliability of model predictions.
Related papers
- Enhancing Language Model Factuality via Activation-Based Confidence Calibration and Guided Decoding [7.855485779946983]
Calibrating language models (LMs) aligns their generation confidence with the actual likelihood of answer correctness.
We propose an activation-based calibration method, ActCab, which trains a linear layer on top of the LM's last-layer activations.
We also propose CoDec, a confidence-guided decoding strategy, to elicit truthful answers with high confidence from LMs.
arXiv Detail & Related papers (2024-06-19T05:33:34Z) - Multicalibration for Confidence Scoring in LLMs [6.948522445499497]
This paper proposes the use of "multicalibration" to yield interpretable and reliable confidence scores for outputs generated by large language models (LLMs).
We show how to form groupings for prompt/completion pairs that are correlated with the probability of correctness via two techniques: clustering within an embedding space, and "self-annotation".
We show how our techniques can yield confidence scores that provide substantial improvements in fine-grained measures of both calibration and accuracy compared to existing methods.
arXiv Detail & Related papers (2024-04-06T17:33:37Z) - Calibrating Large Language Models with Sample Consistency [76.23956851098598]
We explore the potential of deriving confidence from the distribution of multiple randomly sampled model generations, via three measures of consistency.
Results show that consistency-based calibration methods outperform existing post-hoc approaches.
We offer practical guidance on choosing suitable consistency metrics for calibration, tailored to the characteristics of various LMs.
arXiv Detail & Related papers (2024-02-21T16:15:20Z) - Calibrating Long-form Generations from Large Language Models [37.2496541665881]
Large Language Models' (LLMs) confidence scores should align with the actual likelihood of their responses being correct.
Current confidence elicitation methods and calibration metrics rely on a binary true/false assessment of response correctness.
We introduce a unified calibration framework, in which both the correctness of the LLMs' responses and their associated confidence levels are treated as distributions across a range of scores.
arXiv Detail & Related papers (2024-02-09T17:00:32Z) - On Task Performance and Model Calibration with Supervised and Self-Ensembled In-Context Learning [71.44986275228747]
In-context learning (ICL) has become an efficient approach propelled by the recent advancements in large language models (LLMs).
However, both paradigms are prone to suffer from the critical problem of overconfidence (i.e., miscalibration).
arXiv Detail & Related papers (2023-12-21T11:55:10Z) - On Diversified Preferences of Large Language Model Alignment [51.26149027399505]
We investigate the impact of diversified preferences on reward modeling.
We find that diversified preference data negatively affect the calibration performance of reward models.
We propose a novel Multi-Objective Reward learning method to enhance the calibration performance of RMs on shared preferences.
arXiv Detail & Related papers (2023-12-12T16:17:15Z) - A Study on the Calibration of In-context Learning [27.533223818505682]
We study in-context learning (ICL), a prevalent method for adapting static language models through tailored prompts.
We observe that, with an increasing number of ICL examples, models initially exhibit increased miscalibration before achieving better calibration.
We explore recalibration techniques and find that a scaling-binning calibrator can reduce calibration errors consistently.
arXiv Detail & Related papers (2023-12-07T03:37:39Z) - On the Calibration of Large Language Models and Alignment [63.605099174744865]
Confidence calibration serves as a crucial tool for gauging the reliability of deep models.
We conduct a systematic examination of the calibration of aligned language models throughout the entire construction process.
Our work sheds light on whether popular LLMs are well-calibrated and how the training process influences model calibration.
arXiv Detail & Related papers (2023-11-22T08:57:55Z) - Calibrating Multimodal Learning [94.65232214643436]
We propose a novel regularization technique, i.e., Calibrating Multimodal Learning (CML) regularization, to calibrate the predictive confidence of previous methods.
This technique can be flexibly integrated into existing models and improves their performance in terms of confidence calibration, classification accuracy, and model robustness.
arXiv Detail & Related papers (2023-06-02T04:29:57Z) - Just Ask for Calibration: Strategies for Eliciting Calibrated Confidence Scores from Language Models Fine-Tuned with Human Feedback [91.22679548111127]
A trustworthy real-world prediction system should produce well-calibrated confidence scores.
We show that verbalized confidences emitted as output tokens are typically better-calibrated than the model's conditional probabilities.
arXiv Detail & Related papers (2023-05-24T10:12:33Z)