Large Language Models as Agents in Two-Player Games
- URL: http://arxiv.org/abs/2402.08078v1
- Date: Mon, 12 Feb 2024 21:44:32 GMT
- Title: Large Language Models as Agents in Two-Player Games
- Authors: Yang Liu, Peng Sun, Hang Li
- Abstract summary: This paper delineates the parallels between the training methods of large language models (LLMs) and the strategies employed for the development of agents in two-player games.
We propose a re-conceptualization of LLM learning processes in terms of agent learning in language-based games.
- Score: 12.303405412105187
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: By formally defining the training processes of large language models (LLMs),
which usually encompass pre-training, supervised fine-tuning, and
reinforcement learning with human feedback, within a single and unified machine
learning paradigm, we can glean pivotal insights for advancing LLM
technologies. This position paper delineates the parallels between the training
methods of LLMs and the strategies employed for the development of agents in
two-player games, as studied in game theory, reinforcement learning, and
multi-agent systems. We propose a re-conceptualization of LLM learning
processes in terms of agent learning in language-based games. This framework
unveils innovative perspectives on the successes and challenges in LLM
development, offering a fresh understanding of addressing alignment issues
among other strategic considerations. Furthermore, our two-player game approach
sheds light on novel data preparation and machine learning techniques for
training LLMs.
Related papers
- LLM-PySC2: Starcraft II learning environment for Large Language Models [16.918044347226104]
This paper introduces a new environment for developing decision-making methodologies based on Large Language Models (LLMs).
This environment is the first to offer the complete StarCraft II action space, multi-modal observation interfaces, and a structured game knowledge database.
arXiv Detail & Related papers (2024-11-08T06:04:22Z)
- Strategist: Learning Strategic Skills by LLMs via Bi-Level Tree Search [32.657454056329875]
We propose a new method, STRATEGIST, that utilizes LLMs to acquire new skills for playing multi-agent games.
Our method gathers quality feedback through self-play simulations with Monte Carlo tree search.
We showcase how our method can be used in both action planning and dialogue generation in the context of games.
arXiv Detail & Related papers (2024-08-20T08:22:04Z)
- From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems [59.40480894948944]
Large language model (LLM) empowered agents are able to solve decision-making problems in the physical world.
Under this model, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting.
We prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning.
arXiv Detail & Related papers (2024-05-30T09:42:54Z)
- NoteLLM-2: Multimodal Large Representation Models for Recommendation [60.17448025069594]
We investigate the potential of Large Language Models to enhance multimodal representation in multimodal item-to-item recommendations.
One feasible method is the transfer of Multimodal Large Language Models (MLLMs) for representation tasks.
We propose a novel training framework, NoteLLM-2, specifically designed for multimodal representation.
arXiv Detail & Related papers (2024-05-27T03:24:01Z)
- Exploring the landscape of large language models: Foundations, techniques, and challenges [8.042562891309414]
The article sheds light on the mechanics of in-context learning and a spectrum of fine-tuning approaches.
It explores how LLMs can be more closely aligned with human preferences through innovative reinforcement learning frameworks.
The ethical dimensions of LLM deployment are discussed, underscoring the need for mindful and responsible application.
arXiv Detail & Related papers (2024-04-18T08:01:20Z)
- Continual Learning for Large Language Models: A Survey [95.79977915131145]
Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale.
This paper surveys recent works on continual learning for LLMs.
arXiv Detail & Related papers (2024-02-02T12:34:09Z)
- Understanding LLMs: A Comprehensive Overview from Training to Inference [52.70748499554532]
Low-cost training and deployment of large language models represent the future development trend.
The discussion of training covers data preprocessing, training architecture, pre-training tasks, parallel training, and model fine-tuning.
On the inference side, the paper covers topics such as model compression, parallel computation, memory scheduling, and structural optimization.
arXiv Detail & Related papers (2024-01-04T02:43:57Z)
- Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes to out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
- LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs).
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z)