CooT: Learning to Coordinate In-Context with Coordination Transformers
- URL: http://arxiv.org/abs/2506.23549v2
- Date: Sat, 18 Oct 2025 07:51:43 GMT
- Title: CooT: Learning to Coordinate In-Context with Coordination Transformers
- Authors: Huai-Chih Wang, Hsiang-Chun Chuang, Hsi-Chun Cheng, Dai-Jie Wu, Shao-Hua Sun
- Abstract summary: Coordination Transformers (CooT) is a novel in-context coordination framework that rapidly adapts to unseen partners. CooT consistently outperforms baselines including population-based approaches, gradient-based fine-tuning, and a Meta-RL-inspired contextual adaptation method. CooT achieves stable, rapid in-context adaptation and is consistently ranked the most effective collaborator in human evaluations.
- Score: 10.888155149916967
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective coordination among artificial agents in dynamic and uncertain environments remains a significant challenge in multi-agent systems. Existing approaches, such as self-play and population-based methods, either generalize poorly to unseen partners or require impractically extensive fine-tuning. To overcome these limitations, we propose Coordination Transformers (CooT), a novel in-context coordination framework that uses recent interaction histories to rapidly adapt to unseen partners. Unlike prior approaches that primarily aim to diversify training partners, CooT explicitly focuses on adapting to new partner behaviors by predicting actions aligned with observed interactions. Trained on trajectories collected from diverse pairs of agents with complementary preferences, CooT quickly learns effective coordination strategies without explicit supervision or parameter updates. Across diverse coordination tasks in Overcooked, CooT consistently outperforms baselines including population-based approaches, gradient-based fine-tuning, and a Meta-RL-inspired contextual adaptation method. Notably, fine-tuning proves unstable and ineffective, while Meta-RL struggles to achieve reliable coordination. By contrast, CooT achieves stable, rapid in-context adaptation and is consistently ranked the most effective collaborator in human evaluations.
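A minimal sketch of the in-context coordination idea described in the abstract, not the authors' implementation: a causal transformer consumes the recent interaction history (observations, ego actions, partner actions) and predicts the next ego action, so adaptation to a new partner happens purely through the context window, with no parameter updates at deployment. All module names, dimensions, and the behavior-cloning loss below are illustrative assumptions.
```python
# Minimal sketch (not the authors' code): an in-context coordination policy that
# conditions on the recent interaction history and predicts the next ego action
# without any parameter updates at deployment.
import torch
import torch.nn as nn

class InContextCoordinationPolicy(nn.Module):
    def __init__(self, obs_dim, num_actions, d_model=128, n_layers=4, n_heads=4, max_steps=64):
        super().__init__()
        # Each timestep is embedded from (observation, ego action, partner action).
        self.obs_proj = nn.Linear(obs_dim, d_model)
        self.ego_emb = nn.Embedding(num_actions + 1, d_model)      # +1 for "no action yet"
        self.partner_emb = nn.Embedding(num_actions + 1, d_model)
        self.pos_emb = nn.Embedding(max_steps, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads,
                                           dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.action_head = nn.Linear(d_model, num_actions)

    def forward(self, obs, ego_actions, partner_actions):
        # obs: (B, T, obs_dim); ego_actions, partner_actions: (B, T) int64
        B, T, _ = obs.shape
        pos = torch.arange(T, device=obs.device).unsqueeze(0)
        x = (self.obs_proj(obs) + self.ego_emb(ego_actions)
             + self.partner_emb(partner_actions) + self.pos_emb(pos))
        # Causal mask so each step only attends to the history before it.
        mask = nn.Transformer.generate_square_subsequent_mask(T).to(obs.device)
        h = self.encoder(x, mask=mask)
        return self.action_head(h)  # (B, T, num_actions) logits for the next ego action

# Supervised training on trajectories from diverse partner pairs (behavior cloning of
# the ego action at every step); at test time the same forward pass adapts purely in
# context as the partner's recent behavior fills the history window.
def bc_loss(policy, batch):
    logits = policy(batch["obs"], batch["prev_ego"], batch["prev_partner"])
    return nn.functional.cross_entropy(
        logits.reshape(-1, logits.shape[-1]), batch["target_ego"].reshape(-1))
```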
Related papers
- Nested Training for Mutual Adaptation in Human-AI Teaming [30.247046563601202]
Existing approaches aim to improve diversity in training partners to approximate human behavior, but these partners are static and fail to capture adaptive behavior of humans. We model the human-robot teaming scenario as an Interactive Partially Observable Markov Decision Process (I-POMDP), explicitly modeling human adaptation as part of the state. We train our method in a multi-episode, required cooperation setup in the Overcooked domain, comparing it against several baseline agents designed for human-robot teaming.
arXiv Detail & Related papers (2026-02-18T23:07:48Z) - Adaptively Coordinating with Novel Partners via Learned Latent Strategies [19.014669675808133]
We introduce a strategy-conditioned cooperator framework that learns to represent, categorize, and adapt to a broad range of potential partner strategies in real-time. Our approach encodes strategies with a variational autoencoder to learn a latent strategy space from agent trajectory data. We leverage a fixed-share regret minimization algorithm that dynamically infers and adjusts the partner's strategy estimation during interaction.
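The fixed-share rule mentioned above is a standard online-learning update; a minimal sketch follows, where the candidate strategies and their per-step losses (which the paper derives from the learned latent space) are placeholders.
```python
# Minimal sketch of fixed-share weighting over a set of candidate partner
# strategies (the "experts"); the candidate set and per-step losses are assumed.
import numpy as np

def fixed_share_update(weights, losses, eta=1.0, alpha=0.05):
    """One fixed-share step: exponential reweighting followed by mixing,
    which lets the estimate track a partner whose strategy switches."""
    v = weights * np.exp(-eta * losses)        # multiplicative-weights loss update
    v /= v.sum()
    n = len(v)
    return (1 - alpha) * v + alpha / n          # share a little mass with every expert

# Usage: maintain a distribution over K candidate strategies and condition the
# cooperator policy on the current estimate (e.g., the weighted mean latent).
K = 8
weights = np.full(K, 1.0 / K)
for step in range(100):
    losses = np.random.rand(K)                  # placeholder: prediction error of each candidate
    weights = fixed_share_update(weights, losses)
    partner_estimate = weights.argmax()         # or a weights-weighted average of latents
```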
arXiv Detail & Related papers (2025-11-16T19:45:35Z) - Deep Reinforcement Learning for Multi-Agent Coordination [8.250169938213558]
We propose a Stigmergic Multi-Agent Deep Reinforcement Learning (S-MADRL) framework that leverages virtual pheromones to model local and social interactions. We show that our framework achieves the most effective coordination of up to eight agents, where robots self-organize into asymmetric workload distributions. This emergent behavior, analogous to strategies observed in nature, demonstrates a scalable solution for decentralized multi-agent coordination in crowded environments.
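A minimal sketch of a virtual-pheromone field of the kind this summary describes, assuming a simple deposit/evaporate/read cycle; the grid size, decay rate, and how the reading enters each agent's observation are illustrative guesses, not the paper's design.
```python
# Minimal sketch of virtual-pheromone coordination: agents deposit pheromone where
# they work, the field evaporates over time, and each agent reads its local
# neighborhood as a crowdedness signal to avoid congestion.
import numpy as np

class PheromoneField:
    def __init__(self, size=32, evaporation=0.95, deposit=1.0):
        self.grid = np.zeros((size, size))
        self.evaporation = evaporation
        self.deposit = deposit

    def step(self, agent_positions):
        self.grid *= self.evaporation                      # evaporation
        for (x, y) in agent_positions:
            self.grid[x, y] += self.deposit                # deposition

    def local_reading(self, x, y, radius=2):
        lo_x, hi_x = max(0, x - radius), x + radius + 1
        lo_y, hi_y = max(0, y - radius), y + radius + 1
        return self.grid[lo_x:hi_x, lo_y:hi_y].sum()       # crowdedness signal

# Folding local_reading into each agent's observation lets policies learn to
# spread out (asymmetric workload distributions) without direct messaging.
```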
arXiv Detail & Related papers (2025-10-04T00:47:20Z) - Decentralized Dynamic Cooperation of Personalized Models for Federated Continual Learning [50.56947843548702]
We propose a decentralized dynamic cooperation framework for federated continual learning. Clients establish dynamic cooperative learning coalitions to balance the acquisition of new knowledge and the retention of prior learning. We also propose a merge-blocking algorithm and a dynamic cooperative evolution algorithm to achieve cooperative and dynamic equilibrium.
arXiv Detail & Related papers (2025-09-28T06:53:23Z) - Enhancing Multi-Agent Collaboration with Attention-Based Actor-Critic Policies [0.0]
Team-Attention-Actor-Critic (TAAC) is a learning algorithm designed to enhance multi-agent collaboration in cooperative environments. We evaluate TAAC in a simulated soccer environment against benchmark algorithms.
arXiv Detail & Related papers (2025-07-30T15:48:38Z) - Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination [37.90912492084769]
We study how reinforcement learning on a distribution of environments with a single partner enables learning general cooperative skills. We introduce two Jax-based, procedural generators that create billions of solvable coordination challenges. Our findings suggest that learning to collaborate across many unique scenarios encourages agents to develop general norms.
arXiv Detail & Related papers (2025-04-17T07:41:25Z) - ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z) - Adaptive Coordination in Social Embodied Rearrangement [49.35582108902819]
We study zero-shot coordination (ZSC) in this task, where an agent collaborates with a new partner, emulating a scenario where a robot collaborates with a new human partner.
We propose Behavior Diversity Play (BDP), a novel ZSC approach that encourages diversity through a discriminability objective.
Our results demonstrate that BDP learns adaptive agents that can tackle visual coordination, and zero-shot generalize to new partners in unseen environments, achieving 35% higher success and 32% higher efficiency compared to baselines.
arXiv Detail & Related papers (2023-05-31T18:05:51Z) - PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination [52.991211077362586]
We propose a policy ensemble method to increase the diversity of partners in the population.
We then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives.
In this way, the ego agent is able to learn more universal cooperative behaviors for collaborating with diverse partners.
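One simple way to realize the partner-identification step described above is a likelihood-based posterior over a fixed set of policy primitives; the sketch below is an illustrative assumption about the general mechanism, not PECAN's actual method.
```python
# Minimal sketch: infer which policy primitive the partner is following from its
# recent actions, then use the posterior as context for the ego policy.
import numpy as np

def primitive_posterior(primitives, partner_obs, partner_actions, prior=None):
    """primitives: list of callables mapping an observation to an action-probability vector."""
    K = len(primitives)
    log_post = np.log(np.full(K, 1.0 / K) if prior is None else prior)
    for obs, act in zip(partner_obs, partner_actions):
        for k, pi in enumerate(primitives):
            log_post[k] += np.log(pi(obs)[act] + 1e-8)     # accumulate log-likelihood
    log_post -= log_post.max()
    post = np.exp(log_post)
    return post / post.sum()

# The ego agent then conditions its policy on this posterior (the "context"),
# e.g. ego_action = ego_policy(obs, context=posterior).
```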
arXiv Detail & Related papers (2023-01-16T12:14:58Z) - Rethinking Trajectory Prediction via "Team Game" [118.59480535826094]
We present a novel formulation for multi-agent trajectory prediction, which explicitly introduces the concept of interactive group consensus.
On two multi-agent settings, i.e. team sports and pedestrians, the proposed framework consistently achieves superior performance compared to existing methods.
arXiv Detail & Related papers (2022-10-17T07:16:44Z) - Stateful active facilitator: Coordination and Environmental Heterogeneity in Cooperative Multi-Agent Reinforcement Learning [71.53769213321202]
We formalize the notions of coordination level and heterogeneity level of an environment.
We present HECOGrid, a suite of multi-agent environments that facilitates empirical evaluation of different MARL approaches.
We propose a Centralized Training Decentralized Execution learning approach that enables agents to work efficiently in high-coordination and high-heterogeneity environments.
arXiv Detail & Related papers (2022-10-04T18:17:01Z) - Cooperative guidance of multiple missiles: a hybrid co-evolutionary approach [0.9176056742068814]
Cooperative guidance of multiple missiles is a challenging task with rigorous constraints of time and space consensus.
This paper develops a novel natural co-evolutionary strategy (NCES) to address the issues of non-stationarity and continuous control faced by cooperative guidance.
A hybrid co-evolutionary cooperative guidance law (HCCGL) is proposed by integrating the highly scalable co-evolutionary mechanism and the traditional guidance strategy.
arXiv Detail & Related papers (2022-08-15T12:59:38Z) - Depthwise Convolution for Multi-Agent Communication with Enhanced Mean-Field Approximation [9.854975702211165]
We propose a new method based on local communication learning to tackle the multi-agent RL (MARL) challenge.
First, we design a new communication protocol that exploits the ability of depthwise convolution to efficiently extract local relations.
Second, we introduce the mean-field approximation into our method to reduce the scale of agent interactions.
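A minimal sketch of the two ingredients named in this summary, depthwise convolution for cheap local feature extraction and a mean-field summary of neighbor actions; shapes and hyperparameters are illustrative assumptions, not the paper's architecture.
```python
import torch
import torch.nn as nn

class LocalCommBlock(nn.Module):
    def __init__(self, channels=16, kernel_size=3):
        super().__init__()
        # groups=channels makes the convolution depthwise: each feature channel is
        # mixed only spatially, a cheap way to extract local relations between agents.
        self.depthwise = nn.Conv2d(channels, channels, kernel_size,
                                   padding=kernel_size // 2, groups=channels)

    def forward(self, agent_feature_map):
        # agent_feature_map: (B, C, H, W) grid of per-cell agent features
        return torch.relu(self.depthwise(agent_feature_map))

def mean_field_action(neighbor_actions, num_actions):
    # neighbor_actions: (N,) int64 actions of neighbors; return their empirical
    # distribution, used in place of modelling every pairwise interaction.
    one_hot = torch.nn.functional.one_hot(neighbor_actions, num_actions).float()
    return one_hot.mean(dim=0)
```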
arXiv Detail & Related papers (2022-03-06T07:42:43Z) - Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
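A minimal sketch of the low-rank-subspace idea, with made-up shapes: factor a matrix of per-partner strategy statistics, then express a new partner as a least-squares interpolation in that subspace. The feature construction and the way the ego policy consumes the coefficients are assumptions, not the paper's pipeline.
```python
import numpy as np

def fit_strategy_subspace(strategy_matrix, rank=4):
    # strategy_matrix: (num_training_partners, feature_dim), e.g. per-partner
    # state-conditioned action frequencies gathered from joint demonstrations.
    U, S, Vt = np.linalg.svd(strategy_matrix, full_matrices=False)
    return Vt[:rank]                                   # basis of the strategy subspace

def embed_new_partner(basis, observed_features):
    # Solve for coefficients so that coeffs @ basis approximates observed_features.
    coeffs, *_ = np.linalg.lstsq(basis.T, observed_features, rcond=None)
    return coeffs, coeffs @ basis                      # low-dim code and its reconstruction

# Usage: the ego policy is conditioned on the inferred coefficients, and the
# estimate can be refreshed online as more of the new partner's behavior is observed.
train_strategies = np.random.rand(20, 50)
basis = fit_strategy_subspace(train_strategies, rank=4)
coeffs, recon = embed_new_partner(basis, np.random.rand(50))
```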
arXiv Detail & Related papers (2022-01-05T04:40:13Z) - Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
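A minimal sketch of one common pattern for this setting, adapt-then-combine diffusion with coarsely quantized messages; the step size, topology, and quantizer are illustrative assumptions rather than the paper's specific strategy.
```python
import numpy as np

def quantize(x, step=0.1):
    return step * np.round(x / step)                    # coarse messages to save bandwidth

def diffusion_step(estimates, neighbors, data, mu=0.05):
    """estimates: (N, d) per-agent parameter estimates.
    neighbors: list of neighbor-index lists.  data: per-agent (features, target)."""
    N, d = estimates.shape
    # Adapt: each agent takes a local stochastic-gradient step on its streaming sample.
    adapted = np.empty_like(estimates)
    for k in range(N):
        h, y = data[k]                                  # h: (d,), y: scalar
        err = y - h @ estimates[k]
        adapted[k] = estimates[k] + mu * err * h
    # Combine: average quantized estimates received from neighbors (and self).
    combined = np.empty_like(estimates)
    for k in range(N):
        idx = neighbors[k] + [k]
        combined[k] = np.mean([quantize(adapted[j]) for j in idx], axis=0)
    return combined
```
Each call consumes one streaming sample per agent, so the network tracks the online optimum while exchanging only low-precision estimates.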
arXiv Detail & Related papers (2021-12-03T19:23:48Z) - DSDF: An approach to handle stochastic agents in collaborative multi-agent reinforcement learning [0.0]
We show how this stochasticity of agents, which could be a result of malfunction or aging of robots, can add to the uncertainty in coordination.
Our solution, DSDF, tunes the discount factor for each agent according to this uncertainty and uses the resulting values to update the utility networks of the individual agents.
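A minimal sketch of the stated mechanism, assuming a simple linear mapping from an agent's uncertainty estimate to its discount factor; the actual tuning rule in DSDF may differ.
```python
import numpy as np

def agent_gamma(uncertainty, gamma_max=0.99, gamma_min=0.5):
    # More stochastic (less reliable) agents get a shorter effective horizon.
    u = float(np.clip(uncertainty, 0.0, 1.0))
    return gamma_max - u * (gamma_max - gamma_min)

def q_target(reward, next_q_max, uncertainty, done):
    gamma = agent_gamma(uncertainty)
    return reward + (0.0 if done else gamma * next_q_max)

# Each agent's utility (Q) network is then regressed toward q_target computed with
# its own uncertainty-dependent discount, rather than a shared gamma.
```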
arXiv Detail & Related papers (2021-09-14T12:02:28Z) - On the Critical Role of Conventions in Adaptive Human-AI Collaboration [73.21967490610142]
We propose a learning framework that teases apart rule-dependent representation from convention-dependent representation.
We experimentally validate our approach on three collaborative tasks varying in complexity.
arXiv Detail & Related papers (2021-04-07T02:46:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.