Language-Driven Coordination and Learning in Multi-Agent Simulation Environments
- URL: http://arxiv.org/abs/2506.04251v4
- Date: Mon, 03 Nov 2025 07:37:51 GMT
- Title: Language-Driven Coordination and Learning in Multi-Agent Simulation Environments
- Authors: Zhengyang Li, Sawyer Campos, Nana Wang,
- Abstract summary: This paper introduces a unified framework that incorporates large language models (LLMs) into multi-agent reinforcement learning (MARL)<n>The framework features three modular components of Coordinator, Communicator, and Memory, which dynamically generate subgoals.<n>It is evaluated in Google Research Football, MAgent Battle, and StarCraft II.
- Score: 5.139001420948807
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces LLM-MARL, a unified framework that incorporates large language models (LLMs) into multi-agent reinforcement learning (MARL) to enhance coordination, communication, and generalization in simulated game environments. The framework features three modular components of Coordinator, Communicator, and Memory, which dynamically generate subgoals, facilitate symbolic inter-agent messaging, and support episodic recall. Training combines PPO with a language-conditioned loss and LLM query gating. LLM-MARL is evaluated in Google Research Football, MAgent Battle, and StarCraft II. Results show consistent improvements over MAPPO and QMIX in win rate, coordination score, and zero-shot generalization. Ablation studies demonstrate that subgoal generation and language-based messaging each contribute significantly to performance gains. Qualitative analysis reveals emergent behaviors such as role specialization and communication-driven tactics. By bridging language modeling and policy learning, this work contributes to the design of intelligent, cooperative agents in interactive simulations. It offers a path forward for leveraging LLMs in multi-agent systems used for training, games, and human-AI collaboration.
Related papers
- GameTalk: Training LLMs for Strategic Conversation [51.29670609281524]
We introduce textbfGameTalk, a framework for training LLMs to make strategic decisions via multi-turn interactions.<n>Unlike prior work that focuses on single-turn objectives or static action prediction, we train LLMs to optimize a global objective across full conversations.<n>We evaluate this approach on a suite of increasingly complex games, designed to stress different aspects of reasoning, coordination, and opponent modeling.
arXiv Detail & Related papers (2026-01-22T19:18:39Z) - Bridging the Knowledge Void: Inference-time Acquisition of Unfamiliar Programming Languages for Coding Tasks [22.908904483320953]
Large Language Models (LLMs) in coding tasks are often a reflection of their extensive pre-training corpora.<n>We propose ILA-agent, a general ILA framework that equips LLMs with a set of behavioral primitives.<n>We instantiate ILA-agent for Cangjie and evaluate its performance across code generation, translation, and program repair tasks.
arXiv Detail & Related papers (2026-01-16T09:06:47Z) - LinguaGame: A Linguistically Grounded Game-Theoretic Paradigm for Multi-Agent Dialogue Generation [17.584631586928815]
We propose a linguistically-grounded game-theoretic paradigm for multi-agent dialogue generation.<n>Our framework relies on linguistically informed reasoning with minimal task-specific coupling.<n>We evaluate our framework in simulated courtroom proceedings and debates, with human expert assessments showing significant gains in communication efficiency.
arXiv Detail & Related papers (2026-01-08T02:30:43Z) - Sample-Efficient Online Learning in LM Agents via Hindsight Trajectory Rewriting [92.57796055887995]
We introduce ECHO, a prompting framework that adapts hindsight experience replay from reinforcement learning for language model agents.<n> ECHO generates optimized trajectories for alternative goals that could have been achieved during failed attempts.<n>We evaluate ECHO on stateful versions of XMiniGrid, a text-based navigation and planning benchmark, and PeopleJoinQA, a collaborative information-gathering enterprise simulation.
arXiv Detail & Related papers (2025-10-11T18:11:09Z) - Grounding Natural Language for Multi-agent Decision-Making with Multi-agentic LLMs [10.186029242664931]
We extend the capabilities of large language models (LLMs) by integrating them with advancements in multi-agent decision-making algorithms.<n>We propose a systematic framework for the design of multi-agentic large language models (LLMs), focusing on key integration practices.
arXiv Detail & Related papers (2025-08-10T19:53:23Z) - The Unreasonable Effectiveness of Model Merging for Cross-Lingual Transfer in LLMs [54.59207567677249]
Large language models (LLMs) still struggle across tasks outside of high-resource languages.<n>In this work, we investigate cross-lingual transfer to lower-resource languages where task-specific post-training data is scarce.
arXiv Detail & Related papers (2025-05-23T20:28:31Z) - Enhancing Multi-Agent Systems via Reinforcement Learning with LLM-based Planner and Graph-based Policy [31.041340552853004]
Graph Collaboration MARL (LGC-MARL) is a framework that efficiently combines Large Language Models (LLMs) and Multi-Agent Reinforcement Learning (MARL)<n>LGC-MARL decomposes complex tasks into executable subtasks and achieves efficient collaboration among multiple agents through graph-based coordination.<n> Experimental results on the AI2-THOR simulation platform demonstrate the superior performance and scalability of LGC-MARL.
arXiv Detail & Related papers (2025-03-13T05:02:49Z) - Cooperative Multi-Agent Planning with Adaptive Skill Synthesis [16.228784877899976]
We present a novel multi-agent architecture that integrates vision-language models (VLMs) with a dynamic skill library and structured communication for decentralized closed-loop decision-making.<n>The skill library, bootstrapped from demonstrations, evolves via planner-guided tasks to enable adaptive strategies.<n>We demonstrate its strong performance against state-of-the-art MARL baselines across both symmetric and asymmetric scenarios.
arXiv Detail & Related papers (2025-02-14T13:23:18Z) - ConML: A Universal Meta-Learning Framework with Task-Level Contrastive Learning [49.447777286862994]
ConML is a universal meta-learning framework that can be applied to various meta-learning algorithms.
We demonstrate that ConML integrates seamlessly with optimization-based, metric-based, and amortization-based meta-learning algorithms.
arXiv Detail & Related papers (2024-10-08T12:22:10Z) - LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments [70.91258869156353]
We introduce LangSuitE, a versatile and simulation-free testbed featuring 6 representative embodied tasks in textual embodied worlds.
Compared with previous LLM-based testbeds, LangSuitE offers adaptability to diverse environments without multiple simulation engines.
We devise a novel chain-of-thought (CoT) schema, EmMem, which summarizes embodied states w.r.t. history information.
arXiv Detail & Related papers (2024-06-24T03:36:29Z) - Verco: Learning Coordinated Verbal Communication for Multi-agent Reinforcement Learning [42.27106057372819]
We propose a novel multi-agent reinforcement learning algorithm that embeds large language models into agents.
The framework has a message module and an action module.
Experiments conducted on the Overcooked game demonstrate our method significantly enhances the learning efficiency and performance of existing methods.
arXiv Detail & Related papers (2024-04-27T05:10:33Z) - Teaching Machines to Code: Smart Contract Translation with LLMs [4.780973517287942]
We present a pioneering approach, which harnesses the synergy of two distinct large language models (LLMs) within a unified framework.
This framework is designed to grasp coding principles and apply this understanding to the translation of code into an unfamiliar language.
Our study delves into the capacity of LLMs to mimic human learning processes, offering an in-depth evaluation of our methodology for converting smart contracts written in Solidity to Move.
arXiv Detail & Related papers (2024-03-13T18:55:20Z) - If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code
Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code)
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z) - LMRL Gym: Benchmarks for Multi-Turn Reinforcement Learning with Language
Models [56.25156596019168]
This paper introduces the LMRL-Gym benchmark for evaluating multi-turn RL for large language models (LLMs)
Our benchmark consists of 8 different language tasks, which require multiple rounds of language interaction and cover a range of tasks in open-ended dialogue and text games.
arXiv Detail & Related papers (2023-11-30T03:59:31Z) - MAgIC: Investigation of Large Language Model Powered Multi-Agent in Cognition, Adaptability, Rationality and Collaboration [98.18244218156492]
Large Language Models (LLMs) have significantly advanced natural language processing.<n>As their applications expand into multi-agent environments, there arises a need for a comprehensive evaluation framework.<n>This work introduces a novel competition-based benchmark framework to assess LLMs within multi-agent settings.
arXiv Detail & Related papers (2023-11-14T21:46:27Z) - LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models [23.092480882456048]
Large Language Models (LLMs) have demonstrated emergent common-sense reasoning and Theory of Mind (ToM) capabilities.<n>This study introduces the LLM-Coordination Benchmark, a novel benchmark for analyzing LLMs in the context of Pure Coordination Settings.
arXiv Detail & Related papers (2023-10-05T21:18:15Z) - Building Cooperative Embodied Agents Modularly with Large Language
Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework.
Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.