Strategic Behavior of Large Language Models: Game Structure vs.
Contextual Framing
- URL: http://arxiv.org/abs/2309.05898v1
- Date: Tue, 12 Sep 2023 00:54:15 GMT
- Title: Strategic Behavior of Large Language Models: Game Structure vs.
Contextual Framing
- Authors: Nunzio Lor\`e, Babak Heydari
- Abstract summary: This paper investigates the strategic decision-making capabilities of three Large Language Models (LLMs): GPT-3.5, GPT-4, and LLaMa-2.
Utilizing four canonical two-player games, we explore how these models navigate social dilemmas.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates the strategic decision-making capabilities of three
Large Language Models (LLMs): GPT-3.5, GPT-4, and LLaMa-2, within the framework
of game theory. Utilizing four canonical two-player games -- Prisoner's
Dilemma, Stag Hunt, Snowdrift, and Prisoner's Delight -- we explore how these
models navigate social dilemmas, situations where players can either cooperate
for a collective benefit or defect for individual gain. Crucially, we extend
our analysis to examine the role of contextual framing, such as diplomatic
relations or casual friendships, in shaping the models' decisions. Our findings
reveal a complex landscape: while GPT-3.5 is highly sensitive to contextual
framing, it shows limited ability to engage in abstract strategic reasoning.
Both GPT-4 and LLaMa-2 adjust their strategies based on game structure and
context, but LLaMa-2 exhibits a more nuanced understanding of the games'
underlying mechanics. These results highlight the current limitations and
varied proficiencies of LLMs in strategic decision-making, cautioning against
their unqualified use in tasks requiring complex strategic reasoning.
Related papers
- Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay [0.0]
We use games like Tic-Tac-Toe, Connect Four, and Battleship to assess strategic thinking and decision-making.
Despite their proficiency on standard benchmarks, GPT-3.5 and GPT-4's abilities to play and reason about fully observable games without pre-training is mediocre.
arXiv Detail & Related papers (2024-07-12T14:17:26Z) - Are Large Language Models Strategic Decision Makers? A Study of Performance and Bias in Two-Player Non-Zero-Sum Games [56.70628673595041]
Large Language Models (LLMs) have been increasingly used in real-world settings, yet their strategic decision-making abilities remain largely unexplored.
This work investigates the performance and merits of LLMs in canonical game-theoretic two-player non-zero-sum games, Stag Hunt and Prisoner Dilemma.
Our structured evaluation of GPT-3.5, GPT-4-Turbo, GPT-4o, and Llama-3-8B shows that these models, when making decisions in these games, are affected by at least one of the following systematic biases.
arXiv Detail & Related papers (2024-07-05T12:30:02Z) - GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents [4.209869303518743]
We introduce GameBench, a cross-domain benchmark for evaluating strategic reasoning abilities of large language models.
Our evaluations use GPT-3 and GPT-4 in their base form along with two scaffolding frameworks designed to enhance strategic reasoning ability: Chain-of-Thought (CoT) prompting and Reasoning Via Planning (RAP)
Our results show that none of the tested models match human performance, and at worst GPT-4 performs worse than random action.
arXiv Detail & Related papers (2024-06-07T00:28:43Z) - GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations [87.99872683336395]
Large Language Models (LLMs) are integrated into critical real-world applications.
This paper evaluates LLMs' reasoning abilities in competitive environments.
We first propose GTBench, a language-driven environment composing 10 widely recognized tasks.
arXiv Detail & Related papers (2024-02-19T18:23:36Z) - K-Level Reasoning: Establishing Higher Order Beliefs in Large Language Models for Strategic Reasoning [76.3114831562989]
It requires Large Language Model (LLM) agents to adapt their strategies dynamically in multi-agent environments.
We propose a novel framework: "K-Level Reasoning with Large Language Models (K-R)"
arXiv Detail & Related papers (2024-02-02T16:07:05Z) - CivRealm: A Learning and Reasoning Odyssey in Civilization for
Decision-Making Agents [63.79739920174535]
We introduce CivRealm, an environment inspired by the Civilization game.
CivRealm stands as a unique learning and reasoning challenge for decision-making agents.
arXiv Detail & Related papers (2024-01-19T09:14:11Z) - ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic
Decision-Making with AI Agents [77.34720446306419]
Alympics is a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research.
Alympics creates a versatile platform for studying complex game theory problems.
arXiv Detail & Related papers (2023-11-06T16:03:46Z) - SPRING: Studying the Paper and Reasoning to Play Games [102.5587155284795]
We propose a novel approach, SPRING, to read the game's original academic paper and use the knowledge learned to reason and play the game through a large language model (LLM)
In experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter open-world environment.
Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories.
arXiv Detail & Related papers (2023-05-24T18:14:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.