ConU: Conformal Uncertainty in Large Language Models with Correctness Coverage Guarantees
- URL: http://arxiv.org/abs/2407.00499v3
- Date: Mon, 18 Nov 2024 08:33:35 GMT
- Title: ConU: Conformal Uncertainty in Large Language Models with Correctness Coverage Guarantees
- Authors: Zhiyuan Wang, Jinhao Duan, Lu Cheng, Yue Zhang, Qingni Wang, Xiaoshuang Shi, Kaidi Xu, Hengtao Shen, Xiaofeng Zhu,
- Abstract summary: We introduce a novel uncertainty measure based on self-consistency theory.
We then develop a conformal uncertainty criterion by integrating the uncertainty condition aligned with correctness into the CP algorithm.
Empirical evaluations indicate that our uncertainty measure outperforms prior state-of-the-art methods.
- Score: 68.33498595506941
- License:
- Abstract: Uncertainty quantification (UQ) in natural language generation (NLG) tasks remains an open challenge, exacerbated by the closed-source nature of the latest large language models (LLMs). This study investigates applying conformal prediction (CP), which can transform any heuristic uncertainty notion into rigorous prediction sets, to black-box LLMs in open-ended NLG tasks. We introduce a novel uncertainty measure based on self-consistency theory, and then develop a conformal uncertainty criterion by integrating the uncertainty condition aligned with correctness into the CP algorithm. Empirical evaluations indicate that our uncertainty measure outperforms prior state-of-the-art methods. Furthermore, we achieve strict control over the correctness coverage rate utilizing 7 popular LLMs on 4 free-form NLG datasets, spanning general-purpose and medical scenarios. Additionally, the calibrated prediction sets with small size further highlights the efficiency of our method in providing trustworthy guarantees for practical open-ended NLG applications.
Related papers
- Addressing Uncertainty in LLMs to Enhance Reliability in Generative AI [47.64301863399763]
We present a dynamic semantic clustering approach inspired by the Chinese Restaurant Process.
We quantify uncertainty of Large Language Models (LLMs) on a given query by calculating entropy of the generated semantic clusters.
We propose leveraging the (negative) likelihood of these clusters as the (non)conformity score within Conformal Prediction framework.
arXiv Detail & Related papers (2024-11-04T18:49:46Z) - Kernel Language Entropy: Fine-grained Uncertainty Quantification for LLMs from Semantic Similarities [79.9629927171974]
Uncertainty in Large Language Models (LLMs) is crucial for applications where safety and reliability are important.
We propose Kernel Language Entropy (KLE), a novel method for uncertainty estimation in white- and black-box LLMs.
arXiv Detail & Related papers (2024-05-30T12:42:05Z) - Conformal Prediction for Natural Language Processing: A Survey [23.638214012459425]
Conformal prediction is emerging as a theoretically sound and practically useful framework.
Its model-agnostic and distribution-free nature makes it particularly promising to address the current shortcomings of NLP systems.
This paper provides a comprehensive survey of conformal prediction techniques, their guarantees, and existing applications in NLP.
arXiv Detail & Related papers (2024-05-03T10:00:45Z) - Word-Sequence Entropy: Towards Uncertainty Estimation in Free-Form Medical Question Answering Applications and Beyond [52.246494389096654]
This paper introduces Word-Sequence Entropy (WSE), a method that calibrates uncertainty at both the word and sequence levels.
We compare WSE with six baseline methods on five free-form medical QA datasets, utilizing seven popular large language models (LLMs)
arXiv Detail & Related papers (2024-02-22T03:46:08Z) - Language Models with Conformal Factuality Guarantees [44.767328168194815]
Conformal factuality is a framework that can ensure high probability correctness guarantees for language model (LM) outputs.
We show that conformal prediction in language models corresponds to a back-off algorithm that provides high probability correctness guarantees.
arXiv Detail & Related papers (2024-02-15T18:31:53Z) - Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling [69.83976050879318]
In large language models (LLMs), identifying sources of uncertainty is an important step toward improving reliability, trustworthiness, and interpretability.
In this paper, we introduce an uncertainty decomposition framework for LLMs, called input clarification ensembling.
Our approach generates a set of clarifications for the input, feeds them into an LLM, and ensembles the corresponding predictions.
arXiv Detail & Related papers (2023-11-15T05:58:35Z) - Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models [37.63939774027709]
Large language models (LLMs) specializing in natural language generation (NLG) have recently started exhibiting promising capabilities.
We propose and compare several confidence/uncertainty measures, applying them to *selective NLG* where unreliable results could either be ignored or yielded for further assessment.
Results reveal that a simple measure for the semantic dispersion can be a reliable predictor of the quality of LLM responses.
arXiv Detail & Related papers (2023-05-30T16:31:26Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of
Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.