Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search
- URL: http://arxiv.org/abs/2408.10635v2
- Date: Sat, 12 Oct 2024 03:16:30 GMT
- Title: Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search
- Authors: Jonathan Light, Min Cai, Weiqin Chen, Guanzhi Wang, Xiusi Chen, Wei Cheng, Yisong Yue, Ziniu Hu
- Abstract summary: We propose a new method STRATEGIST that utilizes LLMs to acquire new skills for playing multi-agent games.
Our method gathers quality feedback through self-play simulations with Monte Carlo tree search.
We showcase how our method can be used in both action planning and dialogue generation in the context of games.
- Score: 32.657454056329875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a new method STRATEGIST that utilizes LLMs to acquire new skills for playing multi-agent games through a self-improvement process. Our method gathers quality feedback through self-play simulations with Monte Carlo tree search and LLM-based reflection, which can then be used to learn high-level strategic skills such as how to evaluate states that guide the low-level execution. We showcase how our method can be used in both action planning and dialogue generation in the context of games, achieving good performance on both tasks. Specifically, we demonstrate that our method can help train agents with better performance than both traditional reinforcement learning-based approaches and other LLM-based skill learning approaches in games including the Game of Pure Strategy (GOPS) and The Resistance: Avalon. STRATEGIST helps bridge the gap between foundation models and symbolic decision-making methods through its bi-level approach, leading to more robust decision-making.
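The bi-level loop described in the abstract, in which a low-level searcher consults a high-level evaluation heuristic that an LLM keeps revising from self-play feedback, can be sketched roughly as follows. This is a minimal illustration under assumed interfaces, not the paper's code: `step`, `play_games`, and `improve_heuristic` are placeholders, and the paper uses full Monte Carlo tree search where this sketch uses a one-step bandit.

```python
import math

# Minimal sketch of the bi-level idea, not the authors' implementation.
# `state`/`step` model the game; `value_fn` is the high-level skill the LLM edits.

def select_move(state, moves, step, value_fn, n_sims=200, c=1.4):
    """Low level: UCB-style search over moves, scored by the learned heuristic.
    (The paper uses full MCTS; a one-step bandit keeps the sketch short.)"""
    visits = {m: 0 for m in moves}
    totals = {m: 0.0 for m in moves}
    for t in range(1, n_sims + 1):
        move = max(moves, key=lambda m: totals[m] / (visits[m] or 1)
                   + c * math.sqrt(math.log(t) / (visits[m] or 1)))
        visits[move] += 1
        totals[move] += value_fn(step(state, move))  # heuristic in place of rollout
    return max(moves, key=lambda m: visits[m])

def strategist_loop(play_games, improve_heuristic, value_fn, rounds=5):
    """High level: gather self-play feedback, then let the LLM revise the skill."""
    for _ in range(rounds):
        feedback = play_games(value_fn)                   # win rates, misjudged states, etc.
        value_fn = improve_heuristic(value_fn, feedback)  # LLM reflection and rewrite
    return value_fn
```

Here `play_games` would run self-play matches with the current heuristic, and `improve_heuristic` would prompt the LLM with the collected feedback; the same pattern covers dialogue generation when the "moves" are candidate utterances.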
Related papers
- LLM-PySC2: Starcraft II learning environment for Large Language Models [16.918044347226104]
This paper introduces a new environment for developing Large Language Model (LLM)-based decision-making methodologies.
This environment is the first to offer the complete StarCraft II action space, multi-modal observation interfaces, and a structured game knowledge database.
arXiv Detail & Related papers (2024-11-08T06:04:22Z)
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which uses step-wise rewards to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
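The appeal of step-wise rewards, as in StepAgent above, is credit assignment: each action receives its own signal instead of one delayed episode outcome. Below is a generic illustration (discounted returns computed from per-step rewards); StepAgent's implicit-reward and inverse-RL machinery is more involved and is not reproduced here.

```python
def stepwise_returns(step_rewards, gamma=0.99):
    """Discounted return for each step, computed from per-step rewards."""
    returns, g = [], 0.0
    for r in reversed(step_rewards):
        g = r + gamma * g
        returns.append(g)
    return list(reversed(returns))

# Episode-level feedback gives every step the same blunt, delayed signal;
# step-wise rewards differentiate good and bad intermediate actions.
sparse = stepwise_returns([0.0, 0.0, 0.0, 1.0])
dense  = stepwise_returns([0.2, -0.1, 0.3, 1.0])
```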
- Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning [70.16816087320585]
Monte Carlo Tree Search (MCTS) has emerged as a powerful technique for enhancing the reasoning capabilities of LLMs.
Existing distillation methods underutilize the rich trajectory information generated by MCTS.
We propose AlphaLLM-CPL, a novel pairwise training framework that enables LLMs to self-improve through MCTS behavior distillation.
arXiv Detail & Related papers (2024-10-09T03:20:02Z)
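The pairwise training idea in AlphaLLM-CPL can be made concrete with a DPO-style objective over trajectory pairs drawn from the same search tree. This is a hedged sketch in that spirit; the paper's exact loss and its curriculum schedule are assumptions here, not reproduced from the source.

```python
import math

def pairwise_preference_loss(logp_w, logp_l, ref_w, ref_l, beta=0.1):
    """-log sigmoid of the policy-vs-reference log-prob margin between the
    preferred (w) and dispreferred (l) MCTS trajectory."""
    margin = beta * ((logp_w - ref_w) - (logp_l - ref_l))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A pair: the higher-value trajectory from a search tree vs. a lower-value one.
# A curriculum would order pairs from large preference gaps to small ones.
loss = pairwise_preference_loss(logp_w=-12.3, logp_l=-15.8,
                                ref_w=-12.9, ref_l=-15.1)
```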
- Learning Strategy Representation for Imitation Learning in Multi-Agent Games [15.209555810145549]
We introduce the Strategy Representation for Learning (STRIL) framework, which effectively learns strategy representations in multi-agent games.
STRIL is a plug-in method that can be integrated into existing IL algorithms.
We demonstrate the effectiveness of STRIL across competitive multi-agent scenarios, including Two-player Pong, Limit Texas Hold'em, and Connect Four.
arXiv Detail & Related papers (2024-09-28T14:30:17Z)
- Using Advanced LLMs to Enhance Smaller LLMs: An Interpretable Knowledge Distillation Approach [6.154304269581415]
Advanced large language models (LLMs) provide superior performance in complex, human-like interactions.
However, they are costly to run, too large for edge devices such as smartphones, and harder to self-host, which raises security and privacy concerns.
This paper introduces a novel interpretable knowledge distillation approach to enhance the performance of smaller, more economical LLMs.
arXiv Detail & Related papers (2024-08-13T23:59:36Z)
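For background on the distillation entry above, the standard soft-label objective looks like the sketch below. The paper's interpretable variant presumably departs from this, so treat it as a reference point rather than their method.

```python
import math

def soft_label_distill_loss(teacher_probs, student_probs, temperature=2.0):
    """KL(teacher || student) on temperature-softened token distributions."""
    def soften(probs):
        logits = [math.log(max(p, 1e-12)) / temperature for p in probs]
        z = sum(math.exp(x) for x in logits)
        return [math.exp(x) / z for x in logits]
    t, s = soften(teacher_probs), soften(student_probs)
    return sum(ti * math.log(ti / max(si, 1e-12)) for ti, si in zip(t, s))

# Teacher and student next-token distributions over a 3-token vocabulary:
loss = soft_label_distill_loss([0.7, 0.2, 0.1], [0.5, 0.3, 0.2])
```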
- Large Language Models as Agents in Two-Player Games [12.303405412105187]
This paper delineates the parallels between the training methods of large language models (LLMs) and the strategies employed for the development of agents in two-player games.
We propose a re-conceptualization of LLM learning processes in terms of agent learning in language-based games.
arXiv Detail & Related papers (2024-02-12T21:44:32Z)
- K-Level Reasoning: Establishing Higher Order Beliefs in Large Language Models for Strategic Reasoning [76.3114831562989]
Strategic reasoning requires Large Language Model (LLM) agents to adapt their strategies dynamically in multi-agent environments.
We propose a novel framework: "K-Level Reasoning with Large Language Models" (K-R).
arXiv Detail & Related papers (2024-02-02T16:07:05Z)
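K-level reasoning is the classic recursion of "I think that you think that I think...": a level-k agent best-responds to a level-(k-1) model of its opponent. A toy sketch, with rock-paper-scissors standing in for the paper's LLM-driven setting:

```python
def best_response(payoff, actions, opp_action):
    return max(actions, key=lambda a: payoff(a, opp_action))

def k_level_action(payoff, actions, k, level0_action):
    """Level 0 plays a fixed baseline; level k best-responds to level k-1."""
    if k == 0:
        return level0_action
    opp_action = k_level_action(payoff, actions, k - 1, level0_action)
    return best_response(payoff, actions, opp_action)

BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}
def rps_payoff(mine, theirs):
    return 1 if BEATS[mine] == theirs else (0 if mine == theirs else -1)

# Level 0 plays rock, so level 1 plays paper and level 2 plays scissors.
assert k_level_action(rps_payoff, list(BEATS), 2, "rock") == "scissors"
```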
- ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic Decision-Making with AI Agents [77.34720446306419]
Alympics is a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research.
Alympics creates a versatile platform for studying complex game theory problems.
arXiv Detail & Related papers (2023-11-06T16:03:46Z)
- Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models [105.39236338147715]
The paper is inspired by the popular language game "Who is Spy".
We develop DEEP to evaluate LLMs' expression and disguising abilities.
We then introduce SpyGame, an interactive multi-agent framework.
arXiv Detail & Related papers (2023-10-31T14:37:42Z)
- Strategic Reasoning with Language Models [35.63300060111918]
Strategic reasoning enables agents to cooperate, communicate, and compete with other agents in diverse situations.
Existing approaches to solving strategic games rely on extensive training, yielding strategies that do not generalize to new scenarios or games without retraining.
This paper introduces an approach that uses pretrained Large Language Models with few-shot chain-of-thought examples to enable strategic reasoning for AI agents.
arXiv Detail & Related papers (2023-05-30T16:09:19Z)
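The few-shot chain-of-thought pattern used by the entry above is simple to sketch: a few worked exemplars with explicit reasoning, followed by the new state. The exemplar text and the downstream LLM call are placeholders, not the paper's prompts.

```python
EXEMPLAR = (
    "State: you hold cards {2, 5}; the prize is 4; the opponent holds {3, 4}.\n"
    "Reasoning: bidding 5 wins this prize but wastes my high card; bidding 2\n"
    "concedes a mid-value prize cheaply and saves 5 for the prize worth 5.\n"
    "Action: bid 2."
)

def strategic_prompt(state_description, exemplars=(EXEMPLAR,)):
    """Few-shot chain-of-thought prompt: worked examples, then the new state."""
    shots = "\n\n".join(exemplars)
    return (f"{shots}\n\nState: {state_description}\n"
            "Reasoning: consider what the opponent is likely to do, step by step.\n"
            "Action:")

# The string would be sent to a pretrained LLM; the completion after
# "Action:" is parsed as the agent's move.
prompt = strategic_prompt("you hold cards {1, 3}; the prize is 5; the opponent holds {2, 4}.")
```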
- Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems [58.724629408229205]
We demonstrate how traditional supervised learning and a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods.
Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
arXiv Detail & Related papers (2020-09-21T12:04:18Z)