Multi-agent KTO: Reinforcing Strategic Interactions of Large Language Model in Language Game
- URL: http://arxiv.org/abs/2501.14225v1
- Date: Fri, 24 Jan 2025 04:09:03 GMT
- Title: Multi-agent KTO: Reinforcing Strategic Interactions of Large Language Model in Language Game
- Authors: Rong Ye, Yongxin Zhang, Yikai Zhang, Haoyu Kuang, Zhongyu Wei, Peng Sun
- Abstract summary: Werewolf is a social deduction game that tests language understanding.
We develop Multi-agent Kahneman & Tversky's Optimization (MaKTO).
MaKTO achieves a 61% average win rate across various models.
- Score: 32.791648070823776
- Abstract: Achieving Artificial General Intelligence (AGI) requires AI agents that can not only make strategic decisions but also engage in flexible and meaningful communication. Inspired by Wittgenstein's language game theory in Philosophical Investigations, we propose that language agents can learn through in-context interaction rather than traditional multi-stage frameworks that separate decision-making from language expression. Using Werewolf, a social deduction game that tests language understanding, strategic interaction, and adaptability, we develop the Multi-agent Kahneman & Tversky's Optimization (MaKTO). MaKTO engages diverse models in extensive gameplay to generate unpaired desirable and unacceptable responses, then employs KTO to refine the model's decision-making process. In 9-player Werewolf games, MaKTO achieves a 61% average win rate across various models, outperforming GPT-4o and two-stage RL agents by relative improvements of 23.0% and 10.9%, respectively. Notably, MaKTO also demonstrates human-like performance, winning 60% against expert players and showing only 49% detectability in Turing-style blind tests. These results showcase MaKTO's superior decision-making, strategic adaptation, and natural language generation in complex social deduction games.
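To make the training signal concrete, here is a minimal sketch of a KTO-style loss on unpaired data, written under stated assumptions: per-response log-probabilities from the policy and a frozen reference model are given, and the KL reference point is approximated by a clamped batch mean. The function and variable names are illustrative, not the authors' implementation.

```python
import torch

def kto_loss(policy_logps, ref_logps, is_desirable, beta=0.1,
             lambda_d=1.0, lambda_u=1.0):
    """KTO-style loss on unpaired responses (illustrative sketch).

    policy_logps / ref_logps: per-response log-probabilities under the
    policy and a frozen reference model (one scalar per example).
    is_desirable: boolean mask, True for desirable responses.
    """
    # Implicit "reward": how much more likely the policy makes the
    # response than the reference model does.
    rewards = policy_logps - ref_logps

    # Reference point z0, here crudely approximated by a clamped batch
    # mean of the detached rewards (the standard KTO estimate of the
    # policy-reference KL is more involved).
    z0 = torch.clamp(rewards.detach().mean(), min=0)

    # Desirable responses are pushed above the reference point,
    # undesirable ones below it, with asymmetric weights.
    loss_d = lambda_d * (1 - torch.sigmoid(beta * (rewards - z0)))
    loss_u = lambda_u * (1 - torch.sigmoid(beta * (z0 - rewards)))
    return torch.where(is_desirable, loss_d, loss_u).mean()
```

Because this loss only needs a per-response desirable/undesirable label rather than preference pairs, it fits the unpaired data that multi-agent gameplay naturally produces.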
Related papers
- Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization [13.496120603859701]
Large language model (LLM)-based agents have recently shown impressive progress in a variety of domains.
Applying these agents to social deduction games such as Werewolf, which requires both strategic decision-making and free-form language interaction, remains non-trivial.
We propose Latent Space Policy Optimization (LSPO), an iterative framework that addresses these challenges by first mapping free-form text to a discrete latent space.
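One way to picture the discrete-latent-space step, offered only as a hedged sketch: embed each free-form utterance and cluster the embeddings, so that policy optimization can operate over cluster indices rather than raw text. The names and the k-means choice are assumptions, not the paper's actual construction.

```python
import numpy as np
from sklearn.cluster import KMeans

def discretize_utterances(embeddings: np.ndarray, n_actions: int = 32):
    """Map utterance embeddings to a discrete action space by
    clustering; downstream policy optimization then works over the
    integer cluster labels instead of free-form text.
    """
    kmeans = KMeans(n_clusters=n_actions, n_init=10).fit(embeddings)
    return kmeans.labels_, kmeans.cluster_centers_
```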
arXiv Detail & Related papers (2025-02-07T06:19:55Z)
- GAMA: Generative Agents for Multi-Agent Autoformalization [3.5083201638203154]
We present a framework that enables the autoformalization of interaction scenarios using agents augmented by large language models (LLMs).
The agents translate natural language descriptions of interactions into executable logic programs that define the rules of each game.
A tournament simulation then tests the functionality of the generated game rules and strategies.
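GAMA targets executable logic programs; as a rough stand-in written in Python rather than a logic-programming language, the translation step might turn a natural-language rules description into a payoff table plus an apply function that a tournament loop can call. Everything below is a hypothetical illustration.

```python
# Natural-language input (hypothetical): "Both players pick Cooperate
# or Defect. Mutual cooperation pays 3 each; mutual defection 1 each;
# a lone defector gets 5 and the cooperator 0."
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def play_round(move_a: str, move_b: str) -> tuple[int, int]:
    """Executable form of the generated game rules for one round."""
    return PAYOFFS[(move_a, move_b)]

# A tournament simulation would repeatedly call play_round with the
# agents' chosen strategies to test that the generated rules behave
# sensibly.
```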
arXiv Detail & Related papers (2024-12-11T22:37:45Z)
- Policy Learning with a Language Bottleneck [65.99843627646018]
Policy Learning with a Language Bottleneck (PLLB) is a framework enabling AI agents to generate linguistic rules.
PLLB alternates between a rule-generation step guided by language models and an update step where agents learn new policies guided by the rules.
In a two-player communication game, a maze-solving task, and two image reconstruction tasks, we show that PLLB agents not only learn more interpretable and generalizable behaviors, but can also share the learned rules with human users.
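The alternation is simple to sketch. Every interface below (collect_episodes, summarize_as_rules, update_policy) is hypothetical, standing in for whatever the framework actually uses.

```python
def language_bottleneck_loop(agent, language_model, env, n_iterations=10):
    """Alternate between LM-guided rule generation and rule-guided
    policy updates (illustrative sketch, not the paper's API)."""
    rules = []
    for _ in range(n_iterations):
        # Rule generation: ask a language model to compress recent
        # behavior into a small set of natural-language rules.
        episodes = agent.collect_episodes(env)
        rules = language_model.summarize_as_rules(episodes)

        # Update step: learn a new policy while conditioning on the
        # generated rules.
        agent.update_policy(env, guidance=rules)
    return agent, rules
```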
arXiv Detail & Related papers (2024-05-07T08:40:21Z)
- Steering Language Models with Game-Theoretic Solvers [43.023261136434876]
We introduce a framework that allows equilibrium solvers to work over the space of natural language dialogue generated by large language models (LLMs).
Specifically, by modelling the players, strategies and payoffs in a "game" of dialogue, we create a binding from natural language interactions to the conventional symbolic logic of game theory.
We focus on three domains that require different negotiation strategies (scheduling meetings, trading fruit, and debate) and evaluate an LLM's generated language when guided by solvers.
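To illustrate the binding from dialogue to symbolic game theory, suppose each player's dialogue strategies have already been scored into a payoff matrix; a standard solver can then return equilibrium strategies to steer generation toward. The payoff numbers below are made up, and nashpy is one off-the-shelf solver, not necessarily the one the paper uses.

```python
import numpy as np
import nashpy as nash  # pip install nashpy

# Hypothetical payoffs for two dialogue strategies per player in a
# negotiation (rows/columns: "concede" vs. "hold firm"); in practice
# the entries would come from scoring LLM-generated dialogue outcomes.
payoffs_a = np.array([[3, 0], [5, 1]])
payoffs_b = np.array([[3, 5], [0, 1]])

game = nash.Game(payoffs_a, payoffs_b)
for sigma_a, sigma_b in game.support_enumeration():
    print("equilibrium mix for each player:", sigma_a, sigma_b)
```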
arXiv Detail & Related papers (2024-01-24T22:22:00Z)
- ALYMPICS: LLM Agents Meet Game Theory -- Exploring Strategic Decision-Making with AI Agents [77.34720446306419]
Alympics is a systematic simulation framework utilizing Large Language Model (LLM) agents for game theory research.
Alympics creates a versatile platform for studying complex game theory problems.
arXiv Detail & Related papers (2023-11-06T16:03:46Z)
- Breaking Language Barriers in Multilingual Mathematical Reasoning: Insights and Observations [59.056367787688146]
This paper pioneers the exploration and training of powerful Multilingual Math Reasoning (xMR) LLMs.
By utilizing translation, we construct the first multilingual math reasoning instruction dataset, MGSM8KInstruct, encompassing ten distinct languages.
arXiv Detail & Related papers (2023-10-31T08:09:20Z)
- Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game [40.438765131992525]
We develop strategic language agents that generate flexible language actions and possess strong decision-making abilities.
To mitigate the intrinsic bias in language actions, our agents use an LLM to perform deductive reasoning and generate a diverse set of action candidates.
Experiments show that our agents overcome the intrinsic bias and outperform existing LLM-based agents in the Werewolf game.
arXiv Detail & Related papers (2023-10-29T09:02:57Z)
- Human Choice Prediction in Language-based Persuasion Games: Simulation-based Off-Policy Evaluation [24.05034588588407]
This paper addresses a key aspect of the design of such agents: predicting human decisions in off-policy evaluation.
We collected a dataset of 87K decisions from humans playing a repeated decision-making game with artificial agents.
Our approach involves training a model on human interactions with one subset of agents to predict decisions when interacting with another.
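The evaluation setup amounts to a grouped split: train on decisions made against one subset of agents, then test on decisions made against held-out agents. A minimal sketch with scikit-learn follows; the data arrays are dummy stand-ins for the dataset's features, decisions, and agent identities.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Dummy stand-ins: features of each decision, the human's choice, and
# the id of the artificial agent the human was playing against.
X = np.random.rand(1000, 8)
y = np.random.randint(0, 2, size=1000)
agent_ids = np.random.randint(0, 12, size=1000)

# Hold out entire agents, so the model is evaluated on human decisions
# made against agents it never saw during training.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.3, random_state=0)
train_idx, test_idx = next(splitter.split(X, y, groups=agent_ids))
```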
arXiv Detail & Related papers (2023-05-17T16:38:11Z)
- Computational Language Acquisition with Theory of Mind [84.2267302901888]
We build language-learning agents equipped with Theory of Mind (ToM) and measure its effects on the learning process.
We find that training speakers with a highly weighted ToM listener component leads to performance gains in our image referential game setting.
arXiv Detail & Related papers (2023-03-02T18:59:46Z)
- Quality Assurance of Generative Dialog Models in an Evolving Conversational Agent Used for Swedish Language Practice [59.705062519344]
One proposed solution involves AI-enabled conversational agents for person-centered interactive language practice.
We present results from ongoing action research targeting quality assurance of proprietary generative dialog models trained for virtual job interviews.
arXiv Detail & Related papers (2022-03-29T10:25:13Z)