Efficient Non-Parametric Uncertainty Quantification for Black-Box Large
Language Models and Decision Planning
- URL: http://arxiv.org/abs/2402.00251v1
- Date: Thu, 1 Feb 2024 00:23:31 GMT
- Title: Efficient Non-Parametric Uncertainty Quantification for Black-Box Large
Language Models and Decision Planning
- Authors: Yao-Hung Hubert Tsai, Walter Talbott, Jian Zhang
- Abstract summary: This paper focuses on decision planning with uncertainty estimation to address the hallucination problem in language models.
Our uncertainty estimation and decision-making agent design offer a cost-efficient approach for AI agent development.
- Score: 17.752461521448236
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Step-by-step decision planning with large language models (LLMs) is gaining
attention in AI agent development. This paper focuses on decision planning with
uncertainty estimation to address the hallucination problem in language models.
Existing approaches are either white-box or computationally demanding, which
limits the use of black-box proprietary LLMs within a budget. The paper's first
contribution is a non-parametric uncertainty quantification method for LLMs,
efficiently estimating point-wise dependencies between the input and the decision
on the fly with a single inference, without access to token logits. This estimator
informs the statistical interpretation of decision trustworthiness. The second
contribution outlines a systematic design for a decision-making agent,
generating actions like "turn on the bathroom light" based on user prompts
such as "take a bath". Users will be asked to provide preferences when more
than one action has high estimated point-wise dependencies. In conclusion, our
uncertainty estimation and decision-making agent design offer a cost-efficient
approach for AI agent development.
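
To make the two contributions concrete, here is a minimal Python sketch of how such a pipeline might be wired: a non-parametric kernel-ratio stand-in for the point-wise dependency estimator, plus the preference-asking control flow described above. The `embed` placeholder, the reference-pair corpus, the kernel choice, and the threshold are all hypothetical illustration choices, not the authors' actual estimator or prompts.

```python
# Sketch of a black-box decision-planning loop gated by a point-wise dependency
# score. NOT the paper's estimator: embed(), the reference pairs, and the
# threshold are placeholders used only to illustrate the control flow.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder encoder so the sketch runs; swap in a real sentence encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=64)

def rbf(u: np.ndarray, v: np.ndarray, gamma: float = 0.05) -> float:
    """Gaussian kernel similarity between two embeddings."""
    return float(np.exp(-gamma * np.sum((u - v) ** 2)))

def pointwise_dependency(prompt: str, action: str,
                         reference_pairs: list[tuple[str, str]]) -> float:
    """Crude kernel estimate of p(prompt, action) / (p(prompt) * p(action)).

    The joint density is approximated by kernel similarity to reference
    (prompt, action) pairs; the marginals by averaging each side separately.
    """
    x, y = embed(prompt), embed(action)
    joint = np.mean([rbf(x, embed(p)) * rbf(y, embed(a)) for p, a in reference_pairs])
    p_x = np.mean([rbf(x, embed(p)) for p, _ in reference_pairs])
    p_y = np.mean([rbf(y, embed(a)) for _, a in reference_pairs])
    return float(joint / (p_x * p_y + 1e-12))

def plan(prompt: str, candidate_actions: list[str],
         reference_pairs: list[tuple[str, str]], threshold: float = 1.0) -> str:
    """Execute the single trustworthy action, ask the user if several qualify,
    and abstain if none do."""
    scores = {a: pointwise_dependency(prompt, a, reference_pairs)
              for a in candidate_actions}
    confident = [a for a, s in scores.items() if s >= threshold]
    if len(confident) == 1:
        return f"execute: {confident[0]}"
    if confident:
        return "ask user to choose among: " + ", ".join(confident)
    return "abstain: no action passed the dependency threshold"

if __name__ == "__main__":
    # Candidate actions are assumed to come from ONE black-box LLM call,
    # matching the single-inference constraint stated in the abstract.
    refs = [("take a bath", "turn on the bathroom light"),
            ("make coffee", "start the coffee machine")]
    actions = ["turn on the bathroom light", "start the coffee machine"]
    print(plan("take a bath", actions, refs))
```

In this sketch the kernel-ratio score plays the role of a point-wise dependency between prompt and action; only the routing logic (execute / ask the user / abstain) is taken directly from the abstract.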
Related papers
- MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs [55.20845457594977]
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making.
We present a process-based benchmark MR-Ben that demands a meta-reasoning skill.
Our meta-reasoning paradigm is especially suited for system-2 slow thinking.
arXiv Detail & Related papers (2024-06-20T03:50:23Z) - Modeling Boundedly Rational Agents with Latent Inference Budgets [56.24971011281947]
We introduce a latent inference budget model (L-IBM) that models agents' computational constraints explicitly.
L-IBMs make it possible to learn agent models using data from diverse populations of suboptimal actors.
We show that L-IBMs match or outperform Boltzmann models of decision-making under uncertainty.
arXiv Detail & Related papers (2023-12-07T03:55:51Z) - Uncertainty-aware Language Modeling for Selective Question Answering [107.47864420630923]
We present an automatic large language model (LLM) conversion approach that produces uncertainty-aware LLMs.
Our approach is model- and data-agnostic, is computationally-efficient, and does not rely on external models or systems.
arXiv Detail & Related papers (2023-11-26T22:47:54Z) - Rational Decision-Making Agent with Internalized Utility Judgment [91.80700126895927]
Large language models (LLMs) have demonstrated remarkable advancements and have attracted significant efforts to develop LLMs into agents capable of executing intricate multi-step decision-making tasks beyond traditional NLP applications.
This paper proposes RadAgent, which fosters the development of its rationality through an iterative framework involving Experience Exploration and Utility Learning.
Experimental results on the ToolBench dataset demonstrate RadAgent's superiority over baselines, achieving over 10% improvement in Pass Rate on diverse tasks.
arXiv Detail & Related papers (2023-08-24T03:11:45Z) - A Meta-heuristic Approach to Estimate and Explain Classifier Uncertainty [0.4264192013842096]
This work proposes a set of class-independent meta-heuristics that can characterize the complexity of an instance in terms of factors that are mutually relevant to both human and machine learning decision-making.
The proposed measures and framework hold promise for improving model development for more complex instances, as well as providing a new means of model abstention and explanation.
arXiv Detail & Related papers (2023-04-20T13:09:28Z) - Double Fuzzy Probabilistic Interval Linguistic Term Set and a Dynamic
Fuzzy Decision Making Model based on Markov Process with Its Application in
Multiple Criteria Group Decision Making [0.0]
Probabilistic linguistic term sets have been proposed to deal with probability distributions over provided linguistic evaluations.
Weight information plays a significant role in dynamic information fusion and decision making process.
The paper proposes the concept of the double fuzzy probability interval linguistic term set (DFPILTS).
arXiv Detail & Related papers (2021-11-30T10:17:08Z) - Ensemble Quantile Networks: Uncertainty-Aware Reinforcement Learning
with Applications in Autonomous Driving [1.6758573326215689]
Reinforcement learning can be used to create a decision-making agent for autonomous driving.
Previous approaches provide only black-box solutions, which do not offer information on how confident the agent is about its decisions.
This paper introduces the Ensemble Quantile Networks (EQN) method, which combines distributional RL with an ensemble approach to obtain a complete uncertainty estimate (a toy sketch of this decomposition appears after this list).
arXiv Detail & Related papers (2021-05-21T10:36:16Z) - Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z) - Uncertainty as a Form of Transparency: Measuring, Communicating, and
Using Uncertainty [66.17147341354577]
We argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions.
We describe how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems.
This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness.
arXiv Detail & Related papers (2020-11-15T17:26:14Z) - Value of Information Analysis via Active Learning and Knowledge Sharing
in Error-Controlled Adaptive Kriging [7.148732567427574]
This paper proposes the first surrogate-based framework for value of information (VoI) analysis.
It affords sharing equality-type information from observations among surrogate models to update likelihoods of multiple events of interest.
The proposed VoI analysis framework is applied for an optimal decision-making problem involving load testing of a truss bridge.
arXiv Detail & Related papers (2020-02-06T16:58:27Z) - Dirichlet uncertainty wrappers for actionable algorithm accuracy
accountability and auditability [0.5156484100374058]
We propose a wrapper that enriches a black-box model's output prediction with a measure of uncertainty.
Based on the resulting uncertainty measure, we advocate for a rejection system that selects the more confident predictions (a minimal sketch of such a rule appears after this list).
Results demonstrate the effectiveness of the uncertainty computed by the wrapper.
arXiv Detail & Related papers (2019-12-29T11:05:47Z)
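
The EQN entry above splits uncertainty between an ensemble (epistemic) and the learned return distribution (aleatoric). The toy sketch below, using fabricated quantile outputs in place of trained quantile networks, shows one plausible way to compute that decomposition; it illustrates the idea rather than the paper's implementation.

```python
# Toy ensemble-plus-quantile uncertainty decomposition in the spirit of EQN.
# Quantile values are fabricated; a real system would train K quantile
# networks per action.
import numpy as np

def decompose_uncertainty(quantiles: np.ndarray) -> tuple[float, float, float]:
    """quantiles: shape (K_ensemble, N_quantiles) for a single action.

    Returns (mean return, aleatoric spread, epistemic spread):
    - aleatoric: average spread of the return distribution within a member,
    - epistemic: disagreement between ensemble members' mean estimates.
    """
    member_means = quantiles.mean(axis=1)            # (K,)
    mean_return = float(member_means.mean())
    aleatoric = float(quantiles.std(axis=1).mean())  # within-member spread
    epistemic = float(member_means.std())            # across-member spread
    return mean_return, aleatoric, epistemic

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Fabricated quantile outputs for two candidate driving actions.
    actions = {
        "keep_lane":   rng.normal(1.0, 0.2, size=(5, 32)),
        "change_lane": rng.normal(1.2, 0.8, size=(5, 32)),
    }
    for name, q in actions.items():
        m, alea, epi = decompose_uncertainty(q)
        print(f"{name}: mean={m:.2f} aleatoric={alea:.2f} epistemic={epi:.2f}")
    # A safety rule in the spirit of EQN: fall back to a safe action when the
    # greedy action's epistemic estimate exceeds a chosen threshold.
```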
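
The Dirichlet-wrapper entry above pairs an uncertainty measure with a rejection rule. Below is a minimal sketch of that pattern: the black-box probability vector is treated as the mean of a Dirichlet with a fixed, hand-picked concentration (the paper learns its wrapper, so this scaling is only a stand-in), and the least confident fraction of predictions is rejected.

```python
# Minimal abstention rule driven by a Dirichlet-style uncertainty score
# computed from a black-box classifier's probability outputs.
import numpy as np

def dirichlet_uncertainty(probs: np.ndarray, precision: float = 10.0) -> float:
    """Treat `probs` as the mean of a Dirichlet with total concentration
    `precision`; return the standard deviation of the predicted class's
    probability (higher = less trustworthy)."""
    alpha = probs * precision
    a0 = alpha.sum()
    k = int(np.argmax(probs))
    var_k = alpha[k] * (a0 - alpha[k]) / (a0**2 * (a0 + 1.0))
    return float(np.sqrt(var_k))

def reject_least_confident(prob_batch: np.ndarray, keep_fraction: float = 0.8):
    """Keep the most confident predictions; flag the rest for human review."""
    scores = np.array([dirichlet_uncertainty(p) for p in prob_batch])
    cutoff = np.quantile(scores, keep_fraction)
    keep = scores <= cutoff
    return keep, scores

if __name__ == "__main__":
    batch = np.array([[0.9, 0.05, 0.05],
                      [0.4, 0.35, 0.25],
                      [0.7, 0.20, 0.10]])
    keep, scores = reject_least_confident(batch, keep_fraction=0.67)
    print("uncertainty:", np.round(scores, 3), "kept:", keep)
```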
This list is automatically generated from the titles and abstracts of the papers on this site.