Adaptively Coordinating with Novel Partners via Learned Latent Strategies
- URL: http://arxiv.org/abs/2511.12754v1
- Date: Sun, 16 Nov 2025 19:45:35 GMT
- Title: Adaptively Coordinating with Novel Partners via Learned Latent Strategies
- Authors: Benjamin Li, Shuyang Shi, Lucia Romero, Huao Li, Yaqi Xie, Woojun Kim, Stefanos Nikolaidis, Michael Lewis, Katia Sycara, Simon Stepputtis
- Abstract summary: We introduce a strategy-conditioned cooperator framework that learns to represent, categorize, and adapt to a broad range of potential partner strategies in real-time. Our approach encodes strategies with a variational autoencoder to learn a latent strategy space from agent trajectory data. We leverage a fixed-share regret minimization algorithm that dynamically infers and adjusts the partner's strategy estimation during interaction.
- Score: 19.014669675808133
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adaptation is the cornerstone of effective collaboration among heterogeneous team members. In human-agent teams, artificial agents need to adapt to their human partners in real time, as individuals often have unique preferences and policies that may change dynamically throughout interactions. This becomes particularly challenging in tasks with time pressure and complex strategic spaces, where identifying partner behaviors and selecting suitable responses is difficult. In this work, we introduce a strategy-conditioned cooperator framework that learns to represent, categorize, and adapt to a broad range of potential partner strategies in real-time. Our approach encodes strategies with a variational autoencoder to learn a latent strategy space from agent trajectory data, identifies distinct strategy types through clustering, and trains a cooperator agent conditioned on these clusters by generating partners of each strategy type. For online adaptation to novel partners, we leverage a fixed-share regret minimization algorithm that dynamically infers and adjusts the partner's strategy estimation during interaction. We evaluate our method in a modified version of the Overcooked domain, a complex collaborative cooking environment that requires effective coordination among two players with a diverse potential strategy space. Through these experiments and an online user study, we demonstrate that our proposed agent achieves state-of-the-art performance compared to existing baselines when paired with novel human and agent teammates.
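The online adaptation step mentioned in the abstract can be illustrated with a generic fixed-share update (exponential weights followed by a uniform mixing step). This is a minimal sketch, not the authors' implementation: the function name `fixed_share_update`, the parameters `eta` and `alpha`, and the per-cluster loss values are all assumptions made for illustration. The abstract does not say how losses are computed; one plausible choice would be how poorly each strategy cluster predicts the partner's observed actions.

```python
import math

def fixed_share_update(weights, losses, eta=1.0, alpha=0.1):
    """One generic fixed-share step over K candidate strategy clusters.

    weights: current belief over clusters (sums to 1)
    losses:  per-cluster loss for the latest observation (hypothetical values)
    eta:     learning rate (assumed)
    alpha:   share rate; larger values let the belief switch faster (assumed)
    """
    # Exponential-weights step: down-weight clusters that explained the
    # partner's recent behavior poorly.
    scaled = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(scaled)
    posterior = [s / total for s in scaled]
    # Share step: mix with the uniform distribution so no cluster's weight
    # collapses to zero and the belief can track a partner who changes strategy.
    k = len(weights)
    return [(1.0 - alpha) * p + alpha / k for p in posterior]

# Hypothetical run: three clusters, and cluster 1 consistently fits best.
belief = [1.0 / 3] * 3
for step_losses in [[0.9, 0.1, 0.8]] * 5:
    belief = fixed_share_update(belief, step_losses)
best_cluster = max(range(len(belief)), key=belief.__getitem__)
```

In a strategy-conditioned setup, the index `best_cluster` (or the full `belief` vector) would be what the cooperator policy is conditioned on. The share step is what distinguishes this from plain exponential weights: it keeps a floor of `alpha / k` on every cluster.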
Related papers
- Training Generalizable Collaborative Agents via Strategic Risk Aversion [14.968945672756854]
We study the concept of strategic risk aversion and interpret it as a principled inductive bias for generalizable cooperation with unseen partners. We develop a multi-agent reinforcement learning (MARL) algorithm that integrates strategic risk aversion into standard policy optimization methods.
arXiv Detail & Related papers (2026-02-25T03:06:59Z) - Modeling Latent Partner Strategies for Adaptive Zero-Shot Human-Agent Collaboration [8.829018359718086]
We introduce TALENTS, a strategy-conditioned cooperator framework that learns to represent, categorize, and adapt to a range of partner strategies. Our approach utilizes a variational autoencoder to learn a latent strategy space from trajectory data. In order to adapt to previously unseen partners, we leverage a fixed-share regret minimization algorithm.
arXiv Detail & Related papers (2025-07-07T17:53:13Z) - CooT: Learning to Coordinate In-Context with Coordination Transformers [10.888155149916967]
Coordination Transformers (CooT) is a novel in-context coordination framework that rapidly adapts to unseen partners. CooT consistently outperforms baselines including population-based approaches, gradient-based fine-tuning, and a Meta-RL-inspired contextual adaptation method. By contrast, CooT achieves stable, rapid in-context adaptation and is consistently ranked the most effective collaborator in human evaluations.
arXiv Detail & Related papers (2025-06-30T06:45:39Z) - Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning [51.52387511006586]
We propose Hierarchical Opponent modeling and Planning (HOP), a novel multi-agent decision-making algorithm.
HOP is hierarchically composed of two modules: an opponent modeling module that infers others' goals and learns corresponding goal-conditioned policies.
HOP exhibits superior few-shot adaptation capabilities when interacting with various unseen agents, and excels in self-play scenarios.
arXiv Detail & Related papers (2024-06-12T08:48:06Z) - Learning Multi-Agent Communication from Graph Modeling Perspective [62.13508281188895]
We introduce a novel approach wherein we conceptualize the communication architecture among agents as a learnable graph.
Our proposed approach, CommFormer, efficiently optimizes the communication graph and concurrently refines architectural parameters through gradient descent in an end-to-end manner.
arXiv Detail & Related papers (2024-05-14T12:40:25Z) - Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination [36.33334853998621]
We introduce the Cooperative Open-ended LEarning (COLE) framework to solve cooperative incompatibility in learning.
COLE formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy.
We show through theoretical and empirical analysis that COLE can effectively overcome cooperative incompatibility.
arXiv Detail & Related papers (2023-06-05T16:51:38Z) - PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI Coordination [52.991211077362586]
We propose a policy ensemble method to increase the diversity of partners in the population.
We then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives.
In this way, the ego agent is able to learn more universal cooperative behaviors for collaborating with diverse partners.
arXiv Detail & Related papers (2023-01-16T12:14:58Z) - Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
arXiv Detail & Related papers (2022-01-05T04:40:13Z) - Distributed Adaptive Learning Under Communication Constraints [54.22472738551687]
This work examines adaptive distributed learning strategies designed to operate under communication constraints.
We consider a network of agents that must solve an online optimization problem from continual observation of streaming data.
arXiv Detail & Related papers (2021-12-03T19:23:48Z) - On the Critical Role of Conventions in Adaptive Human-AI Collaboration [73.21967490610142]
We propose a learning framework that teases apart rule-dependent representation from convention-dependent representation.
We experimentally validate our approach on three collaborative tasks varying in complexity.
arXiv Detail & Related papers (2021-04-07T02:46:19Z)