Rational Decision-Making Agent with Internalized Utility Judgment
- URL: http://arxiv.org/abs/2308.12519v2
- Date: Wed, 17 Jan 2024 13:09:30 GMT
- Title: Rational Decision-Making Agent with Internalized Utility Judgment
- Authors: Yining Ye, Xin Cong, Shizuo Tian, Yujia Qin, Chong Liu, Yankai Lin, Zhiyuan Liu, Maosong Sun
- Abstract summary: Large language models (LLMs) have demonstrated remarkable advancements and have attracted significant efforts to develop LLMs into agents capable of executing intricate multi-step decision-making tasks beyond traditional NLP applications.
This paper proposes RadAgent, which fosters the development of its rationality through an iterative framework involving Experience Exploration and Utility Learning.
Experimental results on the ToolBench dataset demonstrate RadAgent's superiority over baselines, achieving over 10% improvement in Pass Rate on diverse tasks.
- Score: 91.80700126895927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large language models (LLMs) have demonstrated remarkable advancements and
have attracted significant efforts to develop LLMs into agents capable of
executing intricate multi-step decision-making tasks beyond traditional NLP
applications. Existing approaches to LLM-based decision-making predominantly
build upon the manually-designed external performance metrics to guide the
decision-making process. However, reliance on the external performance metrics
as prior is problematic in real-world scenarios, where such prior may be
unavailable, flawed, or even erroneous. For genuine autonomous decision making,
it is imperative for the agent to develop its rationality from its posterior
experiences to judge decisions independently. Central to the development of
rationality is the construction of an internalized utility judgment, capable of
assigning numerical utilities to each decision. This paper proposes RadAgent
(Rational Decision-Making Agent), which fosters the development of its
rationality through an iterative framework involving Experience Exploration and
Utility Learning. Within this framework, Elo-based Utility Construction is
devised to assign Elo scores to individual decision steps to judge their
utilities via pairwise comparisons. Consequently, these Elo scores guide the
decision-making process to derive optimal outcomes. Experimental results on the
ToolBench dataset demonstrate RadAgent's superiority over baselines, achieving
over 10% improvement in Pass Rate on diverse tasks. It offers higher-quality
solutions and reduces costs (ChatGPT API calls), highlighting its effectiveness
and efficiency.
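As a rough illustration of how Elo scores can be assigned to decisions via pairwise comparisons, the sketch below applies the standard Elo rating update to a set of candidate decisions. It is not RadAgent's implementation: the abstract does not give the update rule or its constants, so the K-factor of 32, the scale of 400, the initial rating of 1000, and the names `expected_score`, `elo_update`, `rate_decisions`, and `prefer` are all assumptions made for this example. In the paper's setting the pairwise judgment would presumably come from the LLM; here `prefer` is a hypothetical stand-in callback.

```python
from itertools import combinations
from typing import Callable, Dict, Hashable, Sequence, Tuple


def expected_score(r_a: float, r_b: float, scale: float = 400.0) -> float:
    """Standard Elo expected score of A against B (scale=400 is an assumption)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / scale))


def elo_update(r_a: float, r_b: float, outcome_a: float,
               k: float = 32.0) -> Tuple[float, float]:
    """Return updated ratings after one comparison.

    outcome_a is 1.0 if A is preferred, 0.0 if B is preferred, 0.5 for a tie.
    The K-factor k=32 is an assumed value, not taken from the paper.
    """
    delta = k * (outcome_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


def rate_decisions(decisions: Sequence[Hashable],
                   prefer: Callable[[Hashable, Hashable], float],
                   initial: float = 1000.0,
                   k: float = 32.0) -> Dict[Hashable, float]:
    """Assign an Elo score to each decision by comparing every pair once.

    `prefer(a, b)` is a hypothetical judge returning the outcome for `a`.
    """
    ratings = {d: initial for d in decisions}
    for a, b in combinations(decisions, 2):
        ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], prefer(a, b), k)
    return ratings


if __name__ == "__main__":
    # Toy judge (assumption): shorter candidate strings are preferred.
    toy_prefer = lambda a, b: 1.0 if len(a) < len(b) else 0.0
    scores = rate_decisions(["a", "bb", "ccc"], toy_prefer)
    for decision, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(f"{decision}: {score:.1f}")
```

Under these assumptions, a decision that wins more comparisons ends up with a higher score, and the total rating mass is conserved because each update adds to one rating exactly what it subtracts from the other.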
Related papers
- Unveiling and Consulting Core Experts in Retrieval-Augmented MoE-based LLMs [64.9693406713216]
Internal mechanisms that contribute to the effectiveness of RAG systems remain underexplored.
Our experiments reveal that several core groups of experts are primarily responsible for RAG-related behaviors.
We propose several strategies to enhance RAG's efficiency and effectiveness through expert activation.
arXiv Detail & Related papers (2024-10-20T16:08:54Z)
- Cognitive LLMs: Towards Integrating Cognitive Architectures and Large Language Models for Manufacturing Decision-making [51.737762570776006]
LLM-ACTR is a novel neuro-symbolic architecture that provides human-aligned and versatile decision-making.
Our framework extracts and embeds knowledge of ACT-R's internal decision-making process as latent neural representations.
Our experiments on novel Design for Manufacturing tasks show both improved task performance as well as improved grounded decision-making capability.
arXiv Detail & Related papers (2024-08-17T11:49:53Z)
- Balancing Rigor and Utility: Mitigating Cognitive Biases in Large Language Models for Multiple-Choice Questions [0.46873264197900916]
This paper examines the role of cognitive biases in the decision-making processes of large language models (LLMs).
We show that certain cognitive biases, when properly balanced, can enhance decision-making efficiency through rational deviations and shortcuts.
arXiv Detail & Related papers (2024-06-16T16:25:22Z)
- DeLLMa: Decision Making Under Uncertainty with Large Language Models [31.77731889916652]
DeLLMa is a framework designed to enhance decision-making accuracy in uncertain environments.
We show that DeLLMa can consistently enhance the decision-making performance of leading language models, and achieve up to a 40% increase in accuracy over competing methods.
arXiv Detail & Related papers (2024-02-04T08:11:45Z)
- AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents [76.95062553043607]
Evaluating large language models (LLMs) is essential for understanding their capabilities and facilitating their integration into practical applications.
We introduce AgentBoard, a pioneering comprehensive benchmark and accompanying open-source evaluation framework tailored to analytical evaluation of LLM agents.
arXiv Detail & Related papers (2024-01-24T01:51:00Z)
- Make Your Decision Convincing! A Unified Two-Stage Framework: Self-Attribution and Decision-Making [24.906886146275127]
We propose a unified two-stage framework known as Self-Attribution and Decision-Making (SADM).
We demonstrate that our framework not only establishes a more reliable link between the generated rationale and model decision but also achieves competitive results in task performance and the quality of rationale.
arXiv Detail & Related papers (2023-10-20T15:59:57Z)
- Explainability's Gain is Optimality's Loss? -- How Explanations Bias Decision-making [0.0]
Explanations help to facilitate communication between the algorithm and the human decision-maker.
Feature-based explanations' semantics of causal models induce leakage from the decision-maker's prior beliefs.
Such differences can lead to sub-optimal and biased decision outcomes.
arXiv Detail & Related papers (2022-06-17T11:43:42Z)
- Inverse Online Learning: Understanding Non-Stationary and Reactionary Policies [79.60322329952453]
We show how to develop interpretable representations of how agents make decisions.
By understanding the decision-making processes underlying a set of observed trajectories, we cast the policy inference problem as the inverse to this online learning problem.
We introduce a practical algorithm for retrospectively estimating such perceived effects, alongside the process through which agents update them.
Through application to the analysis of UNOS organ donation acceptance decisions, we demonstrate that our approach can bring valuable insights into the factors that govern decision processes and how they change over time.
arXiv Detail & Related papers (2022-03-14T17:40:42Z)
- Leveraging Expert Consistency to Improve Algorithmic Decision Support [62.61153549123407]
We explore the use of historical expert decisions as a rich source of information that can be combined with observed outcomes to narrow the construct gap.
We propose an influence function-based methodology to estimate expert consistency indirectly when each case in the data is assessed by a single expert.
Our empirical evaluation, using simulations in a clinical setting and real-world data from the child welfare domain, indicates that the proposed approach successfully narrows the construct gap.
arXiv Detail & Related papers (2021-01-24T05:40:29Z)
- Causal Strategic Linear Regression [5.672132510411465]
In many predictive decision-making scenarios, such as credit scoring and academic testing, a decision-maker must construct a model that accounts for agents' propensity to "game" the decision rule.
We join concurrent work in modeling agents' outcomes as a function of their changeable attributes.
We provide efficient algorithms for learning decision rules that optimize three distinct decision-maker objectives.
arXiv Detail & Related papers (2020-02-24T03:57:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.