Large Language Models as Agents in Two-Player Games
- URL: http://arxiv.org/abs/2402.08078v1
- Date: Mon, 12 Feb 2024 21:44:32 GMT
- Title: Large Language Models as Agents in Two-Player Games
- Authors: Yang Liu, Peng Sun, Hang Li
- Abstract summary: This paper delineates the parallels between the training methods of large language models (LLMs) and the strategies employed for the development of agents in two-player games.
We propose a re-conceptualization of LLM learning processes in terms of agent learning in language-based games.
- Score: 12.303405412105187
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: By formally defining the training processes of large language models (LLMs),
which usually encompass pre-training, supervised fine-tuning, and
reinforcement learning from human feedback, within a single, unified machine
learning paradigm, we can glean pivotal insights for advancing LLM
technologies. This position paper delineates the parallels between the training
methods of LLMs and the strategies employed for the development of agents in
two-player games, as studied in game theory, reinforcement learning, and
multi-agent systems. We propose a re-conceptualization of LLM learning
processes in terms of agent learning in language-based games. This framework
unveils innovative perspectives on the successes and challenges in LLM
development, offering a fresh understanding of how to address alignment issues,
among other strategic considerations. Furthermore, our two-player game approach
sheds light on novel data preparation and machine learning techniques for
training LLMs.
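To make the two-player framing concrete, the following toy sketch casts one round of feedback-driven training as a language game between a speaker (standing in for the LLM) and a critic (standing in for the human annotator or reward model). All names here (`Speaker`, `Critic`, `play_round`) and the tabular "policy" are illustrative assumptions, not an implementation from the paper.

```python
import random

class Speaker:
    """First player: stands in for the LLM being trained."""
    def __init__(self, candidate_replies):
        # Toy "policy": a preference weight per candidate reply.
        self.weights = {reply: 1.0 for reply in candidate_replies}

    def respond(self, prompt):
        replies = list(self.weights)
        return random.choices(replies, weights=[self.weights[r] for r in replies])[0]

    def update(self, reply, reward, lr=0.5):
        # Crude policy improvement: reinforce replies the critic rewards.
        self.weights[reply] = max(0.01, self.weights[reply] + lr * reward)

class Critic:
    """Second player: stands in for the human / reward model."""
    def __init__(self, preferred):
        self.preferred = preferred

    def score(self, prompt, reply):
        return 1.0 if reply == self.preferred else -0.1

def play_round(speaker, critic, prompt):
    reply = speaker.respond(prompt)          # speaker's move
    reward = critic.score(prompt, reply)     # critic's counter-move
    speaker.update(reply, reward)            # learning from the game
    return reply, reward

speaker = Speaker(["helpful answer", "rambling answer", "refusal"])
critic = Critic(preferred="helpful answer")
for _ in range(200):
    play_round(speaker, critic, "user question")
print(max(speaker.weights, key=speaker.weights.get))  # -> usually "helpful answer"
```

On the paper's view, pre-training, supervised fine-tuning, and RLHF can all be read as variants of such a game, differing mainly in who plays the second player and how its feedback is delivered.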
Related papers
- Atari-GPT: Investigating the Capabilities of Multimodal Large Language Models as Low-Level Policies for Atari Games [2.2648566044372416]
This paper explores the application of multimodal large language models (LLMs) as low-level controllers in the domain of Atari video games.
Unlike traditional reinforcement learning (RL) and imitation learning (IL) methods, these LLMs utilize pre-existing multimodal knowledge to directly engage with game environments.
arXiv Detail & Related papers (2024-08-28T17:08:56Z)
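As a hedged illustration of the entry above, here is a minimal perceive-prompt-act loop for an LLM used as a low-level controller; `query_multimodal_llm`, `ToyEnv`, and the action set are placeholders, not Atari-GPT's actual setup.

```python
ACTIONS = ["NOOP", "FIRE", "LEFT", "RIGHT"]

def query_multimodal_llm(frame: bytes, instruction: str) -> str:
    # Placeholder: a real system would send the frame and a prompt to a
    # multimodal LLM and parse the action name out of its reply.
    return "FIRE"

class ToyEnv:
    """Trivial stand-in environment with a Gym-like reset/step interface."""
    def __init__(self):
        self.t = 0
    def reset(self) -> bytes:
        self.t = 0
        return b"frame"
    def step(self, action: str):
        self.t += 1
        return b"frame", (1.0 if action == "FIRE" else 0.0), self.t >= 3

def control_loop(env, instruction: str, max_steps: int = 100) -> float:
    frame, total_reward = env.reset(), 0.0
    for _ in range(max_steps):
        action = query_multimodal_llm(frame, instruction)
        if action not in ACTIONS:          # guard against malformed output
            action = "NOOP"
        frame, reward, done = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward

print(control_loop(ToyEnv(), "play to maximize score"))  # -> 3.0
```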
- Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search [32.657454056329875]
We propose STRATEGIST, a new method that uses LLMs to acquire new skills for playing multi-agent games.
Our method gathers quality feedback through self-play simulations with Monte Carlo tree search.
We showcase how our method can be used in both action planning and dialogue generation in the context of games.
arXiv Detail & Related papers (2024-08-20T08:22:04Z)
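The following toy loop shows only the feedback-gathering idea from the entry above: candidate skills are scored by self-play simulation. STRATEGIST's bi-level search and Monte Carlo tree search are replaced here by plain random rollouts over hand-coded strategies, purely for illustration.

```python
import random

# Hand-coded "skills" with hidden strengths; in the paper these would be
# LLM-generated strategies rather than fixed numbers.
STRATEGIES = {"aggressive": 0.6, "defensive": 0.5, "balanced": 0.55}

def self_play(strategy_a: str, strategy_b: str) -> int:
    """Return 1 if strategy_a wins a simulated game, else 0."""
    p_win = STRATEGIES[strategy_a] / (STRATEGIES[strategy_a] + STRATEGIES[strategy_b])
    return int(random.random() < p_win)

def evaluate(candidate: str, pool: list, games: int = 200) -> float:
    # Quality feedback via repeated self-play rollouts.
    wins = sum(self_play(candidate, random.choice(pool)) for _ in range(games))
    return wins / games

pool = list(STRATEGIES)
feedback = {s: evaluate(s, pool) for s in pool}
best = max(feedback, key=feedback.get)   # feedback an LLM could then refine
print(best, round(feedback[best], 2))
```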
- From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems [59.40480894948944]
Large language model (LLM)-empowered agents can solve decision-making problems in the physical world.
In this framework, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting.
We prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning.
arXiv Detail & Related papers (2024-05-30T09:42:54Z)
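A minimal sketch of the subgoal loop described above: a planner (an LLM in the paper, a hard-coded stub here) sees only the language history of a partially observable task and proposes the next subgoal. Both `propose_subgoal` and the toy `execute` environment are assumptions made for illustration.

```python
from typing import List

def propose_subgoal(history: List[str]) -> str:
    # Hypothetical stand-in for prompting an LLM planner with the
    # observation history; a real system would call a model here.
    return "open the door" if "key acquired" in history else "pick up the key"

def execute(subgoal: str) -> str:
    # Toy low-level actor + environment: returns the next observation.
    return {"pick up the key": "key acquired",
            "open the door": "door open, task done"}.get(subgoal, "nothing happened")

def planner_loop(max_steps: int = 5) -> List[str]:
    history: List[str] = ["start"]
    for _ in range(max_steps):
        subgoal = propose_subgoal(history)      # prompting step
        observation = execute(subgoal)          # act, observe partially
        history += [subgoal, observation]
        if "done" in observation:
            break
    return history

print(planner_loop())
```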
- Exploring the landscape of large language models: Foundations, techniques, and challenges [8.042562891309414]
The article sheds light on the mechanics of in-context learning and a spectrum of fine-tuning approaches.
It explores how LLMs can be more closely aligned with human preferences through innovative reinforcement learning frameworks.
The ethical dimensions of LLM deployment are discussed, underscoring the need for mindful and responsible application.
arXiv Detail & Related papers (2024-04-18T08:01:20Z)
- Continual Learning for Large Language Models: A Survey [95.79977915131145]
Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale.
This paper surveys recent works on continual learning for LLMs.
arXiv Detail & Related papers (2024-02-02T12:34:09Z)
- Understanding LLMs: A Comprehensive Overview from Training to Inference [52.70748499554532]
Low-cost training and deployment represent a key trend in the future development of large language models.
The discussion of training covers data preprocessing, training architecture, pre-training tasks, parallel training, and model fine-tuning.
On the inference side, the paper covers topics such as model compression, parallel computation, memory scheduling, and structural optimization.
arXiv Detail & Related papers (2024-01-04T02:43:57Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emergent in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs by: 1) generalizing to out-of-distribution data, 2) elucidating how LLMs benefit from discriminative models, and 3) minimizing hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs).
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z)
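To show what one multi-turn language task can look like, here is a toy guessing-game environment with a Gym-style reset/step interface; it is an illustrative assumption, not the actual LMRL-Gym API.

```python
class ToyGuessingEnv:
    """Toy multi-turn text game: guess the secret word within a turn
    budget. Illustrative only; not the actual LMRL-Gym interface."""
    def __init__(self, secret: str = "lighthouse", max_turns: int = 20):
        self.secret, self.max_turns, self.turn = secret, max_turns, 0

    def reset(self) -> str:
        self.turn = 0
        return "I am thinking of a word. Ask questions or guess."

    def step(self, utterance: str):
        self.turn += 1
        hit = self.secret in utterance.lower()
        done = hit or self.turn >= self.max_turns
        reward = 1.0 if hit else 0.0
        observation = "Correct!" if hit else "No. Keep trying."
        return observation, reward, done   # one multi-turn RL transition

env = ToyGuessingEnv()
obs = env.reset()
obs, reward, done = env.step("Is it a lighthouse?")
print(obs, reward, done)   # Correct! 1.0 True
```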
- Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models [105.39236338147715]
The paper is inspired by the popular language game "Who is Spy".
We develop DEEP to evaluate LLMs' expression and disguising abilities.
We then introduce SpyGame, an interactive multi-agent framework.
arXiv Detail & Related papers (2023-10-31T14:37:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.