Mastering Da Vinci Code: A Comparative Study of Transformer, LLM, and PPO-based Agents
- URL: http://arxiv.org/abs/2506.12801v1
- Date: Sun, 15 Jun 2025 10:33:30 GMT
- Title: Mastering Da Vinci Code: A Comparative Study of Transformer, LLM, and PPO-based Agents
- Authors: LeCheng Zhang, Yuanshi Wang, Haotian Shen, Xujie Wang
- Abstract summary: The Da Vinci Code, a game of logical deduction and imperfect information, presents unique challenges for artificial intelligence. This paper investigates the efficacy of various AI paradigms in mastering this game.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Da Vinci Code, a game of logical deduction and imperfect information, presents unique challenges for artificial intelligence, demanding nuanced reasoning beyond simple pattern recognition. This paper investigates the efficacy of various AI paradigms in mastering this game. We develop and evaluate three distinct agent architectures: a Transformer-based baseline model with limited historical context, several Large Language Model (LLM) agents (including Gemini, DeepSeek, and GPT variants) guided by structured prompts, and an agent based on Proximal Policy Optimization (PPO) employing a Transformer encoder for comprehensive game history processing. Performance is benchmarked against the baseline, with the PPO-based agent demonstrating superior win rates ($58.5\% \pm 1.0\%$), significantly outperforming the LLM counterparts. Our analysis highlights the strengths of deep reinforcement learning in policy refinement for complex deductive tasks, particularly in learning implicit strategies from self-play. We also examine the capabilities and inherent limitations of current LLMs in maintaining strict logical consistency and strategic depth over extended gameplay, despite sophisticated prompting. This study contributes to the broader understanding of AI in recreational games involving hidden information and multi-step logical reasoning, offering insights into effective agent design and the comparative advantages of different AI approaches.
Related papers
- Agents of Change: Self-Evolving LLM Agents for Strategic Planning [17.67637003848376]
We benchmark a progression of LLM-based agents, from a simple game-playing agent to systems capable of autonomously rewriting their own prompts and their player agent's code.
Our results show that self-evolving agents, particularly when powered by models like Claude 3.7 and GPT-4o, outperform static baselines by autonomously adopting their strategies.
arXiv Detail & Related papers (2025-06-05T05:45:24Z)
- The Influence of Human-inspired Agentic Sophistication in LLM-driven Strategic Reasoners [3.5083201638203154]
We evaluate the role of agentic sophistication in shaping artificial reasoners' performance.
We benchmarked three agent designs: a simple game-theoretic model, an unstructured LLM-as-agent model, and an LLM integrated into a traditional agentic framework.
Our analysis, covering over 2000 reasoning samples across 25 agent configurations, shows that human-inspired cognitive structures can enhance LLM agents' alignment with human strategic behaviour.
arXiv Detail & Related papers (2025-05-14T13:51:24Z)
- FAIRGAME: a Framework for AI Agents Bias Recognition using Game Theory [51.96049148869987]
We present FAIRGAME, a Framework for AI Agents Bias Recognition using Game Theory.
We describe its implementation and usage, and we employ it to uncover biased outcomes in popular games among AI agents.
Overall, FAIRGAME allows users to reliably and easily simulate their desired games and scenarios.
arXiv Detail & Related papers (2025-04-19T15:29:04Z)
- A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems [93.8285345915925]
Reasoning is a fundamental cognitive process that enables logical inference, problem-solving, and decision-making.
With the rapid advancement of large language models (LLMs), reasoning has emerged as a key capability that distinguishes advanced AI systems.
We categorize existing methods along two dimensions: (1) Regimes, which define the stage at which reasoning is achieved; and (2) Architectures, which determine the components involved in the reasoning process.
arXiv Detail & Related papers (2025-04-12T01:27:49Z)
- Reinforcement Learning Environment with LLM-Controlled Adversary in D&D 5th Edition Combat [0.0]
This research employs Deep Q-Networks (DQN) for the smaller agents, creating a testbed for strategic AI development.
We successfully integrated sophisticated language models into the RL framework, enhancing strategic decision-making processes.
arXiv Detail & Related papers (2025-03-19T22:48:20Z)
- ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning [53.817538122688944]
We introduce Reinforced Meta-thinking Agents (ReMA) to elicit meta-thinking behaviors from Reasoning of Large Language Models (LLMs).
ReMA decouples the reasoning process into two hierarchical agents: a high-level meta-thinking agent responsible for generating strategic oversight and plans, and a low-level reasoning agent for detailed executions.
Empirical results from single-turn experiments demonstrate that ReMA outperforms single-agent RL baselines on complex reasoning tasks.
arXiv Detail & Related papers (2025-03-12T16:05:31Z)
- Approximating Human Strategic Reasoning with LLM-Enhanced Recursive Reasoners Leveraging Multi-agent Hypergames [3.5083201638203154]
We implement a role-based multi-agent strategic interaction framework tailored to sophisticated reasoners.
We use one-shot, 2-player beauty contests to evaluate the reasoning capabilities of the latest LLMs.
Our experiments show that artificial reasoners can outperform the baseline model in terms of both approximating human behaviour and reaching the optimal solution.
arXiv Detail & Related papers (2025-02-11T10:37:20Z)
- Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models [64.1799100754406]
Large Language Models (LLMs) demonstrate enhanced capabilities and reliability by reasoning more.
Despite various efforts to improve LLM reasoning, high-quality long-chain reasoning data and optimized training pipelines still remain inadequately explored in vision-language tasks.
We present Insight-V, an early effort to 1) scalably produce long and robust reasoning data for complex multi-modal tasks, and 2) build an effective training pipeline to enhance the reasoning capabilities of MLLMs.
arXiv Detail & Related papers (2024-11-21T18:59:55Z)
- Game-theoretic LLM: Agent Workflow for Negotiation Games [30.83905391503607]
This paper investigates the rationality of large language models (LLMs) in strategic decision-making contexts.
We design multiple game-theoretic workflows that guide the reasoning and decision-making processes of LLMs.
The findings have implications for the development of more robust and strategically sound AI agents.
arXiv Detail & Related papers (2024-11-08T22:02:22Z)
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization [53.510942601223626]
Large Language Models (LLMs) exhibit robust problem-solving capabilities for diverse tasks.
These task solvers necessitate manually crafted prompts to inform task rules and regulate behaviors.
We propose Agent-Pro: an LLM-based Agent with Policy-level Reflection and Optimization.
arXiv Detail & Related papers (2024-02-27T15:09:20Z)