Can LLMs Play Ô Ăn Quan Game? A Study of Multi-Step Planning and Decision Making
- URL: http://arxiv.org/abs/2507.03711v3
- Date: Wed, 09 Jul 2025 02:09:05 GMT
- Title: Can LLMs Play Ô Ăn Quan Game? A Study of Multi-Step Planning and Decision Making
- Authors: Sang Quang Nguyen, Kiet Van Nguyen, Vinh-Tiep Nguyen, Thanh Duc Ngo, Ngan Luu-Thuy Nguyen, Duy-Dinh Le
- Abstract summary: We explore the ability of large language models (LLMs) to plan and make decisions through the lens of the traditional Vietnamese board game, Ô Ăn Quan. Specifically, we develop various agent personas, ranging from aggressive to defensive, and employ the Ô Ăn Quan game as a testbed for assessing LLM performance across different strategies.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we explore the ability of large language models (LLMs) to plan and make decisions through the lens of the traditional Vietnamese board game, Ô Ăn Quan. This game, which involves a series of strategic token movements and captures, offers a unique environment for evaluating the decision-making and strategic capabilities of LLMs. Specifically, we develop various agent personas, ranging from aggressive to defensive, and employ the Ô Ăn Quan game as a testbed for assessing LLM performance across different strategies. Through experimentation with models like Llama-3.2-3B-Instruct, Llama-3.1-8B-Instruct, and Llama-3.3-70B-Instruct, we aim to understand how these models execute strategic decision-making, plan moves, and manage dynamic game states. The results will offer insights into the strengths and weaknesses of LLMs in terms of reasoning and strategy, contributing to a deeper understanding of their general capabilities.
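The persona-driven evaluation the abstract describes can be sketched as a simple prompt-and-parse loop. This is a minimal illustration only, not the paper's actual harness: `query_llm` is a hypothetical stand-in for a call to one of the evaluated models (e.g. Llama-3.1-8B-Instruct), stubbed here with a deterministic reply so the loop runs without a model, and the persona texts are invented examples.

```python
import re

# Illustrative persona instructions (assumed, not taken from the paper).
PERSONAS = {
    "aggressive": "Prioritize immediate captures, even at the cost of position.",
    "defensive": "Protect your own squares and avoid exposing pieces to capture.",
}

def build_prompt(persona, board, player, legal_moves):
    """Serialize the game state and persona into a single text prompt."""
    return (
        f"You play Ô Ăn Quan with a {persona} style. {PERSONAS[persona]}\n"
        f"Player {player} to move. Pebbles per square: {board}\n"
        f"Legal moves: {legal_moves}\n"
        "Reply with the index of the square to sow from."
    )

def query_llm(prompt):
    # Stub: echo the first legal move listed in the prompt. A real run would
    # send `prompt` to the model and return its text reply.
    moves = re.search(r"Legal moves: \[([^\]]*)\]", prompt).group(1)
    return moves.split(",")[0].strip()

def choose_move(persona, board, player, legal_moves):
    reply = query_llm(build_prompt(persona, board, player, legal_moves))
    try:
        move = int(reply)
    except ValueError:
        move = legal_moves[0]  # fall back when the reply is unparsable
    return move if move in legal_moves else legal_moves[0]
```

With a real model behind `query_llm`, the same loop replays full games and records each persona's moves and captures, the kind of trace such an evaluation would analyze.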
Related papers
- A Multi-Agent Pokemon Tournament for Evaluating Strategic Reasoning of Large Language Models [0.0]
This research presents LLM Pokemon League, a competitive tournament system that leverages Large Language Models (LLMs) as intelligent agents to simulate strategic decision-making in Pokémon battles. The platform is designed to analyze and compare the reasoning, adaptability, and tactical depth exhibited by different LLMs in a type-based, turn-based combat environment. The project enables rich exploration into comparative AI behavior, battle psychology, and meta-strategy development in constrained, rule-based game environments.
arXiv Detail & Related papers (2025-08-03T07:27:36Z)
- KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation [78.96590724864606]
We introduce the Knowledge Orthogonal Reasoning Gymnasium (KORGym), a dynamic evaluation platform inspired by KOR-Bench and Gymnasium. KORGym offers over fifty games in either textual or visual formats and supports interactive, multi-turn assessments with reinforcement learning scenarios.
arXiv Detail & Related papers (2025-05-20T16:06:32Z)
- LLM-Gomoku: A Large Language Model-Based System for Strategic Gomoku with Self-Play and Reinforcement Learning [4.22453895366234]
This study aims to develop a Gomoku AI system based on large language models (LLMs). The system is designed to understand and apply Gomoku strategies and logic to make rational decisions. After extensive self-play training, the model's Gomoku-playing capabilities have been notably enhanced.
arXiv Detail & Related papers (2025-03-27T16:52:25Z)
- Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search [32.657454056329875]
We propose a new method STRATEGIST that utilizes LLMs to acquire new skills for playing multi-agent games.
Our method gathers quality feedback through self-play simulations with Monte Carlo tree search.
We showcase how our method can be used in both action planning and dialogue generation in the context of games.
arXiv Detail & Related papers (2024-08-20T08:22:04Z)
- GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations [87.99872683336395]
Large Language Models (LLMs) are integrated into critical real-world applications.
This paper evaluates LLMs' reasoning abilities in competitive environments.
We first propose GTBench, a language-driven environment comprising 10 widely recognized tasks.
arXiv Detail & Related papers (2024-02-19T18:23:36Z)
- K-Level Reasoning: Establishing Higher Order Beliefs in Large Language Models for Strategic Reasoning [76.3114831562989]
Strategic reasoning requires Large Language Model (LLM) agents to adapt their strategies dynamically in multi-agent environments.
We propose a novel framework: "K-Level Reasoning with Large Language Models (K-R)".
arXiv Detail & Related papers (2024-02-02T16:07:05Z)
- ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic Decision-Making with AI Agents [77.34720446306419]
Alympics is a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research.
Alympics creates a versatile platform for studying complex game theory problems.
arXiv Detail & Related papers (2023-11-06T16:03:46Z)
- Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models [105.39236338147715]
The paper is inspired by the popular language game "Who is Spy".
We develop DEEP to evaluate LLMs' expression and disguising abilities.
We then introduce SpyGame, an interactive multi-agent framework.
arXiv Detail & Related papers (2023-10-31T14:37:42Z)
- Strategic Reasoning with Language Models [35.63300060111918]
Strategic reasoning enables agents to cooperate, communicate, and compete with other agents in diverse situations.
Existing approaches to solving strategic games rely on extensive training, yielding strategies that do not generalize to new scenarios or games without retraining.
This paper introduces an approach that uses pretrained Large Language Models with few-shot chain-of-thought examples to enable strategic reasoning for AI agents.
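The few-shot chain-of-thought prompting this entry describes can be illustrated as prepending a handful of worked positions with reasoning traces before the current state. This is a hedged sketch under invented assumptions: the example position and the prompt wording are placeholders, not the paper's actual prompts or games.

```python
# Illustrative few-shot exemplars: (state, reasoning-plus-move) pairs.
# The single rock-paper-scissors example below is a made-up placeholder.
FEW_SHOT = [
    (
        "State: you hold rock; the opponent tends to play scissors.",
        "Reasoning: scissors loses to rock, so keep playing rock.\nMove: rock",
    ),
]

def cot_prompt(state):
    """Assemble exemplars plus the current state, ending with an open
    'Reasoning:' cue so the model continues with its own chain of thought."""
    shots = "\n\n".join(f"{s}\n{r}" for s, r in FEW_SHOT)
    return f"{shots}\n\nState: {state}\nReasoning:"
```

The returned string would be sent to a pretrained LLM, whose continuation supplies both the reasoning trace and the chosen move.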
arXiv Detail & Related papers (2023-05-30T16:09:19Z)
- Introspective Tips: Large Language Model for In-Context Decision Making [48.96711664648164]
We employ "Introspective Tips" to facilitate large language models (LLMs) in self-optimizing their decision-making.
Our method enhances the agent's performance in both few-shot and zero-shot learning situations.
Experiments involving over 100 games in TextWorld illustrate the superior performance of our approach.
arXiv Detail & Related papers (2023-05-19T11:20:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.