LLM economicus? Mapping the Behavioral Biases of LLMs via Utility Theory
- URL: http://arxiv.org/abs/2408.02784v1
- Date: Mon, 5 Aug 2024 19:00:43 GMT
- Title: LLM economicus? Mapping the Behavioral Biases of LLMs via Utility Theory
- Authors: Jillian Ross, Yoon Kim, Andrew W. Lo
- Abstract summary: Utility theory is proposed as an approach to evaluating the economic biases of large language models (LLMs).
We find that the economic behavior of current LLMs is neither entirely human-like nor entirely economicus-like.
- Score: 20.79199807796242
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Humans are not homo economicus (i.e., rational economic beings). As humans, we exhibit systematic behavioral biases such as loss aversion, anchoring, framing, etc., which lead us to make suboptimal economic decisions. Insofar as such biases may be embedded in text data on which large language models (LLMs) are trained, to what extent are LLMs prone to the same behavioral biases? Understanding these biases in LLMs is crucial for deploying LLMs to support human decision-making. We propose utility theory, a paradigm at the core of modern economic theory, as an approach to evaluate the economic biases of LLMs. Utility theory enables the quantification and comparison of economic behavior against benchmarks such as perfect rationality or human behavior. To demonstrate our approach, we quantify and compare the economic behavior of a variety of open- and closed-source LLMs. We find that the economic behavior of current LLMs is neither entirely human-like nor entirely economicus-like. We also find that most current LLMs struggle to maintain consistent economic behavior across settings. Finally, we illustrate how our approach can measure the effect of interventions such as prompting on economic biases.
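To make the utility-theoretic approach concrete, the sketch below shows one common way such a bias is quantified: fit a prospect-theory loss-aversion coefficient lambda to a model's accept/reject decisions on 50/50 gambles, then compare it to benchmarks (lambda = 1 for homo economicus, lambda of roughly 2.25 for humans per Tversky and Kahneman, 1992). This is a minimal illustration, not the paper's actual elicitation protocol; `ask_llm` is a hypothetical stand-in for any chat-completion API.

```python
# Minimal sketch (not the paper's actual protocol) of utility-theoretic bias
# measurement: elicit accept/reject decisions on 50/50 gambles and fit the
# loss-aversion coefficient lambda that best explains them.
import numpy as np

def ask_llm(prompt: str) -> str:
    """Hypothetical LLM call; swap in a real chat-completion client here."""
    raise NotImplementedError

def accepts_gamble(gain: float, loss: float) -> bool:
    """Ask the model whether it accepts a 50/50 gamble of +gain / -loss."""
    prompt = (f"You are offered a coin flip: heads you win ${gain}, "
              f"tails you lose ${loss}. Do you accept? Answer yes or no.")
    return ask_llm(prompt).strip().lower().startswith("yes")

def estimate_lambda(gains, losses, decisions):
    """Grid-search the lambda that best matches the observed decisions,
    assuming the piecewise-linear value function v(x) = x for gains and
    v(x) = -lambda * |x| for losses. Under this assumption the gamble is
    accepted iff 0.5 * gain - 0.5 * lambda * loss > 0."""
    grid = np.linspace(0.5, 4.0, 351)  # lambda candidates in steps of 0.01
    def agreement(lam):
        predicted = [0.5 * g - 0.5 * lam * l > 0 for g, l in zip(gains, losses)]
        return np.mean([p == d for p, d in zip(predicted, decisions)])
    return grid[int(np.argmax([agreement(lam) for lam in grid]))]

if __name__ == "__main__":
    # Synthetic demo: choices consistent with lambda = 2.25, the classic human
    # estimate; a purely rational agent would accept whenever gain > loss.
    gains = [10, 15, 20, 25, 30, 35]
    losses = [10] * 6
    decisions = [g / l > 2.25 for g, l in zip(gains, losses)]
    print(estimate_lambda(gains, losses, decisions))  # ~2.0, smallest consistent lambda
```

With real model responses in place of the synthetic choices, the fitted lambda can be compared across models, prompts, and framings, which is the kind of intervention measurement the abstract describes.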
Related papers
- Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina [7.155982875107922]
Studies suggest large language models (LLMs) can exhibit human-like reasoning, aligning with human behavior in economic experiments, surveys, and political discourse.
This has led many to propose that LLMs can be used as surrogates or simulations for humans in social science research.
We assess the reasoning depth of LLMs using the 11-20 money request game.
arXiv Detail & Related papers (2024-10-25T14:46:07Z)
- Gender Bias of LLM in Economics: An Existentialism Perspective [1.024113475677323]
This paper investigates gender bias in large language models (LLMs).
LLMs reinforce gender stereotypes even without explicit gender markers.
We argue that bias in LLMs is not an unintended flaw but a systematic result of their rational processing.
arXiv Detail & Related papers (2024-10-14T01:42:01Z)
- GLEE: A Unified Framework and Benchmark for Language-based Economic Environments [19.366120861935105]
Large Language Models (LLMs) show significant potential in economic and strategic interactions.
How these models behave in such settings becomes a crucial question given the economic and societal implications of integrating LLM-based agents into real-world data-driven systems.
We introduce a benchmark for standardizing research on two-player, sequential, language-based games.
arXiv Detail & Related papers (2024-10-07T17:55:35Z)
- EconNLI: Evaluating Large Language Models on Economics Reasoning [22.754757518792395]
Large Language Models (LLMs) are widely used for writing economic analysis reports or providing financial advice.
We propose a new dataset, natural language inference on economic events (EconNLI), to evaluate LLMs' knowledge and reasoning abilities in the economic domain.
Our experiments reveal that LLMs are not sophisticated in economic reasoning and may generate wrong or hallucinated answers.
arXiv Detail & Related papers (2024-07-01T11:58:24Z)
- A Survey on Human Preference Learning for Large Language Models [81.41868485811625]
The recent surge of versatile large language models (LLMs) largely depends on aligning increasingly capable foundation models with human intentions by preference learning.
This survey covers the sources and formats of preference feedback, the modeling and usage of preference signals, as well as the evaluation of the aligned LLMs.
arXiv Detail & Related papers (2024-06-17T03:52:51Z)
- Exploring Value Biases: How LLMs Deviate Towards the Ideal [57.99044181599786]
Large language models (LLMs) are deployed in a wide range of applications, and their responses have an increasing social impact.
We show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
arXiv Detail & Related papers (2024-02-16T18:28:43Z)
- Do LLM Agents Exhibit Social Behavior? [5.094340963261968]
State-Understanding-Value-Action (SUVA) is a framework to systematically analyze LLM responses in social contexts.
It assesses social behavior through both the models' final decisions and the response generation processes leading to those decisions.
We demonstrate that utterance-based reasoning reliably predicts LLMs' final actions.
arXiv Detail & Related papers (2023-12-23T08:46:53Z)
- CLOMO: Counterfactual Logical Modification with Large Language Models [109.60793869938534]
We introduce a novel task, Counterfactual Logical Modification (CLOMO), and a high-quality human-annotated benchmark.
In this task, LLMs must adeptly alter a given argumentative text to uphold a predetermined logical relationship.
We propose an innovative evaluation metric, the Self-Evaluation Score (SES), to directly evaluate the natural language output of LLMs.
arXiv Detail & Related papers (2023-11-29T08:29:54Z)
- Exploring the Jungle of Bias: Political Bias Attribution in Language Models via Dependency Analysis [86.49858739347412]
Large Language Models (LLMs) have sparked intense debate regarding the prevalence of bias in these models and its mitigation.
We propose a prompt-based method for the extraction of confounding and mediating attributes which contribute to the decision process.
We find that the observed disparate treatment can, at least in part, be attributed to confounding and mediating attributes and model misalignment.
arXiv Detail & Related papers (2023-11-15T00:02:25Z)
- Heterogeneous Value Alignment Evaluation for Large Language Models [91.96728871418]
The emergence of Large Language Models (LLMs) has made it crucial to align their values with those of humans.
We propose a Heterogeneous Value Alignment Evaluation (HVAE) system to assess the success of aligning LLMs with heterogeneous values.
arXiv Detail & Related papers (2023-05-26T02:34:20Z)
- The AI Economist: Optimal Economic Policy Design via Two-level Deep Reinforcement Learning [126.37520136341094]
We show that machine-learning-based economic simulation is a powerful policy and mechanism design framework.
The AI Economist is a two-level, deep RL framework that trains both agents and a social planner who co-adapt (see the sketch after this list).
In simple one-step economies, the AI Economist recovers the optimal tax policy of economic theory.
arXiv Detail & Related papers (2021-08-05T17:42:35Z)
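The two-level structure described in the last entry can be illustrated with a deliberately simplified sketch: an inner loop in which agents adapt their labor effort to the current tax policy, and an outer loop in which a planner searches for the welfare-maximizing tax rate. The flat tax, quadratic effort cost, and log-utility welfare objective here are illustrative assumptions only; the actual AI Economist trains neural-network policies with deep RL in a richer economy where agents gather resources, trade, and build.

```python
# Toy two-level loop: an outer "planner" adapts a flat tax rate while inner
# "agents" adapt their labor effort to the current taxes. A stylized stand-in
# for the AI Economist's two-level deep-RL setup, not its implementation.
import numpy as np

SKILLS = np.array([1.0, 2.0, 4.0])  # heterogeneous agent productivities

def agent_utility(effort, skill, tax_rate, transfer):
    """After-tax income plus lump-sum transfer, minus quadratic effort cost."""
    return (1 - tax_rate) * skill * effort + transfer - 0.5 * effort ** 2

def inner_loop(tax_rate, steps=200, lr=0.05):
    """Agents co-adapt: each hill-climbs its effort against the current taxes.
    The transfer is held fixed during each gradient step (price-taking agents),
    so efforts converge to the best response (1 - tax_rate) * skill."""
    efforts = np.ones_like(SKILLS)
    eps = 1e-4
    for _ in range(steps):
        transfer = tax_rate * np.sum(SKILLS * efforts) / len(SKILLS)
        grads = (agent_utility(efforts + eps, SKILLS, tax_rate, transfer)
                 - agent_utility(efforts - eps, SKILLS, tax_rate, transfer)) / (2 * eps)
        efforts += lr * grads
    return efforts

def social_welfare(tax_rate):
    """Planner objective: sum of concavified utilities, rewarding equality."""
    efforts = inner_loop(tax_rate)
    transfer = tax_rate * np.sum(SKILLS * efforts) / len(SKILLS)
    utils = agent_utility(efforts, SKILLS, tax_rate, transfer)
    return np.sum(np.log1p(np.maximum(utils, 0)))

# Outer loop: the planner grid-searches for the welfare-maximizing flat tax,
# trading off work incentives against redistribution.
taxes = np.linspace(0.0, 0.9, 19)
best = taxes[int(np.argmax([social_welfare(t) for t in taxes]))]
print(f"welfare-maximizing flat tax in this toy economy: {best:.2f}")
```

The grid search stands in for the planner's RL policy updates, and the hill-climbing agents stand in for agent policy learning; the co-adaptation pattern (re-solve the inner loop for every outer-loop candidate) is the point of the illustration.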
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.