Automatic Curriculum Design for Zero-Shot Human-AI Coordination
- URL: http://arxiv.org/abs/2503.07275v2
- Date: Thu, 21 Aug 2025 11:19:50 GMT
- Title: Automatic Curriculum Design for Zero-Shot Human-AI Coordination
- Authors: Won-Sang You, Tae-Gwan Ha, Seo-Young Lee, Kyung-Joong Kim
- Abstract summary: Zero-shot human-AI coordination is the training of an ego-agent to coordinate with humans without human data. We propose a utility function and co-player sampling for a zero-shot human-AI coordination setting. Our method achieves high performance in human-AI coordination tasks in unseen environments.
- Score: 4.634917646296438
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero-shot human-AI coordination trains an ego-agent to coordinate with humans without using human data. Most studies on zero-shot human-AI coordination have focused on enhancing the ego-agent's coordination ability in a given environment, without considering generalization to unseen environments. Real-world applications of zero-shot human-AI coordination must account for unpredictable environmental changes and for co-players whose coordination ability varies with the environment. Previously, multi-agent UED (Unsupervised Environment Design) approaches have investigated these challenges by jointly considering environmental changes and co-player policies in competitive two-player AI-AI scenarios. In this paper, we extend a multi-agent UED approach to zero-shot human-AI coordination. We propose a utility function and a co-player sampling scheme for the zero-shot human-AI coordination setting that train the ego-agent to coordinate with humans more effectively than the previous multi-agent UED approach. Zero-shot human-AI coordination performance was evaluated in the Overcooked-AI environment, using both human proxy agents and real humans. Our method outperforms the baseline models and achieves high performance on human-AI coordination tasks in unseen environments. The source code is available at https://github.com/Uwonsang/ACD_Human-AI
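The paper's exact utility function and sampling rule are not reproduced in this listing; the sketch below only illustrates the general shape of a UED-style curriculum, where environments are replayed in proportion to a regret-like utility and co-players are drawn from a population. All function names, the mid-range co-player heuristic, and the softmax temperature are illustrative assumptions, not the authors' method.

```python
import math
import random

def regret(ego_return, best_return):
    """Regret-style utility: the gap between the best co-player return
    seen on an environment and the ego-agent's return there."""
    return best_return - ego_return

def sample_environment(env_scores, temperature=1.0, rng=random):
    """Sample an environment id in proportion to exp(score / temperature),
    so high-regret (hard but still solvable) layouts are replayed more often."""
    ids = list(env_scores)
    weights = [math.exp(env_scores[i] / temperature) for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]

def sample_coplayer(population, env_id, perf):
    """Pick the co-player whose normalized performance on this environment
    is closest to mid-range: a partner that is neither trivially strong
    nor too weak for useful coordination practice (illustrative heuristic)."""
    return min(population, key=lambda p: abs(perf[(p, env_id)] - 0.5))
```

A training loop would alternate these two sampling steps with ego-agent updates, refreshing each environment's score from the latest regret estimate.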
Related papers
- Nested Training for Mutual Adaptation in Human-AI Teaming [30.247046563601202]
Existing approaches aim to improve diversity in training partners to approximate human behavior, but these partners are static and fail to capture the adaptive behavior of humans. We model the human-robot teaming scenario as an Interactive Partially Observable Markov Decision Process (I-POMDP), explicitly modeling human adaptation as part of the state. We train our method in a multi-episode, required-cooperation setup in the Overcooked domain, comparing it against several baseline agents designed for human-robot teaming.
arXiv Detail & Related papers (2026-02-18T23:07:48Z) - Autonomous Continual Learning of Computer-Use Agents for Environment Adaptation [57.65688895630163]
We introduce ACuRL, an Autonomous Curriculum Reinforcement Learning framework that continually adapts agents to specific environments with zero human data. Our method effectively enables both intra-environment and cross-environment continual learning, yielding 4-22% performance gains without forgetting existing environments.
arXiv Detail & Related papers (2026-02-10T23:06:02Z) - Improving Human-AI Coordination through Adversarial Training and Generative Models [36.54154192505703]
Generalizing to novel humans requires training on data that captures the diversity of human behaviors.
Adversarial training is one avenue for searching for such data and ensuring that agents are robust.
We propose a novel strategy for overcoming self-sabotage that leverages a pre-trained generative model to simulate valid cooperative agent policies.
arXiv Detail & Related papers (2025-04-21T21:53:00Z) - Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration [51.452664740963066]
Collaborative Gym is a framework enabling asynchronous, tripartite interaction among agents, humans, and task environments.
We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions.
Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance.
arXiv Detail & Related papers (2024-12-20T09:21:15Z) - Mixed-Initiative Human-Robot Teaming under Suboptimality with Online Bayesian Adaptation [0.6591036379613505]
We develop computational modeling and optimization techniques for enhancing the performance of suboptimal human-agent teams.
We adopt an online Bayesian approach that enables a robot to infer people's willingness to comply with its assistance in a sequential decision-making game.
Our user studies show that user preferences and team performance indeed vary with robot intervention styles.
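The entry above describes a robot inferring a person's willingness to comply via online Bayesian updating. A minimal Beta-Bernoulli sketch of that idea follows; the class name, uniform prior, and binary compliance observations are illustrative assumptions, not the paper's actual model.

```python
class ComplianceBelief:
    """Beta-Bernoulli belief over the probability that a user
    complies with the robot's assistance."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha = alpha  # pseudo-count of observed compliances
        self.beta = beta    # pseudo-count of observed refusals

    def update(self, complied):
        """Conjugate posterior update from one binary observation."""
        if complied:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self):
        """Posterior mean estimate of the compliance probability."""
        return self.alpha / (self.alpha + self.beta)
```

Starting from a uniform prior, three compliances and one refusal give a posterior mean of 4/6; the robot could condition its intervention style on this estimate at each decision step.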
arXiv Detail & Related papers (2024-03-24T14:38:18Z) - Human-AI Collaboration in Real-World Complex Environment with
Reinforcement Learning [8.465957423148657]
We show that learning from humans is effective and that human-AI collaboration outperforms human-controlled and fully autonomous AI agents.
We develop a user interface to allow humans to assist AI agents effectively.
arXiv Detail & Related papers (2023-12-23T04:27:24Z) - Efficient Human-AI Coordination via Preparatory Language-based
Convention [17.840956842806975]
Existing methods for human-AI coordination typically train an agent to coordinate with a diverse set of policies or with human models fitted from real human data.
We propose employing the large language model (LLM) to develop an action plan that effectively guides both human and AI.
Our method achieves better alignment with human preferences and an average performance improvement of 15% compared to the state-of-the-art.
arXiv Detail & Related papers (2023-11-01T10:18:23Z) - ProAgent: Building Proactive Cooperative Agents with Large Language
Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z) - Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination [36.33334853998621]
We introduce the Cooperative Open-ended LEarning (COLE) framework to solve cooperative incompatibility in learning.
COLE formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy.
Theoretical and empirical analysis shows that COLE effectively overcomes cooperative incompatibility.
arXiv Detail & Related papers (2023-06-05T16:51:38Z) - UniHCP: A Unified Model for Human-Centric Perceptions [75.38263862084641]
We propose a Unified Model for Human-Centric Perceptions (UniHCP).
UniHCP unifies a wide range of human-centric tasks in a simplified end-to-end manner with the plain vision transformer architecture.
With large-scale joint training on 33 human-centric datasets, UniHCP can outperform strong baselines by direct evaluation.
arXiv Detail & Related papers (2023-03-06T07:10:07Z) - PECAN: Leveraging Policy Ensemble for Context-Aware Zero-Shot Human-AI
Coordination [52.991211077362586]
We propose a policy ensemble method to increase the diversity of partners in the population.
We then develop a context-aware method enabling the ego agent to analyze and identify the partner's potential policy primitives.
In this way, the ego agent is able to learn more universal cooperative behaviors for collaborating with diverse partners.
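Identifying a partner's policy primitive from observed behavior, as described above, can be sketched as maximum-likelihood matching against an ensemble. The representation of primitives as action distributions and the Overcooked-flavored action names below are illustrative assumptions, not PECAN's actual context encoder.

```python
import math

def identify_primitive(primitives, observed_actions):
    """Score each partner primitive by the log-likelihood of the observed
    action sequence and return the name of the best match.

    primitives: dict mapping primitive name -> {action: probability}.
    A tiny floor probability avoids log(0) for unseen actions.
    """
    best_name, best_ll = None, -math.inf
    for name, policy in primitives.items():
        ll = sum(math.log(policy.get(a, 1e-9)) for a in observed_actions)
        if ll > best_ll:
            best_name, best_ll = name, ll
    return best_name
```

The ego agent could then condition its own policy on the identified primitive to select a matching cooperative behavior.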
arXiv Detail & Related papers (2023-01-16T12:14:58Z) - Stateful active facilitator: Coordination and Environmental
Heterogeneity in Cooperative Multi-Agent Reinforcement Learning [71.53769213321202]
We formalize the notions of coordination level and heterogeneity level of an environment.
We present HECOGrid, a suite of multi-agent environments that facilitates empirical evaluation of different MARL approaches.
We propose a Centralized Training Decentralized Execution learning approach that enables agents to work efficiently in high-coordination and high-heterogeneity environments.
arXiv Detail & Related papers (2022-10-04T18:17:01Z) - On some Foundational Aspects of Human-Centered Artificial Intelligence [52.03866242565846]
There is no clear definition of what is meant by Human Centered Artificial Intelligence.
This paper introduces the term HCAI agent to refer to any physical or software computational agent equipped with AI components.
We see the notion of HCAI agent, together with its components and functions, as a way to bridge the technical and non-technical discussions on human-centered AI.
arXiv Detail & Related papers (2021-12-29T09:58:59Z) - On the Importance of Environments in Human-Robot Coordination [17.60947307552083]
We propose a framework for procedural generation of environments that result in diverse behaviors.
Results show that the environments result in qualitatively different emerging behaviors and statistically significant differences in collaborative metrics.
arXiv Detail & Related papers (2021-06-21T04:39:55Z) - Weak Human Preference Supervision For Deep Reinforcement Learning [48.03929962249475]
Reward learning from human preferences can be used to solve complex reinforcement learning (RL) tasks without access to a reward function.
We propose a weak human preference supervision framework, for which we developed a human preference scaling model.
Our established human-demonstration estimator requires human feedback only for less than 0.01% of the agent's interactions with the environment.
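Preference-based reward learning of the kind referenced above is commonly formalized with a Bradley-Terry model: the reward model is trained so that the segment with higher predicted reward is more likely to be the one the human preferred. The sketch below shows that standard formulation, not the paper's specific scaling model; the function names are illustrative.

```python
import math

def preference_prob(r_a, r_b):
    """Bradley-Terry probability that segment A is preferred to B,
    given their predicted cumulative rewards r_a and r_b."""
    return math.exp(r_a) / (math.exp(r_a) + math.exp(r_b))

def preference_loss(pairs):
    """Mean cross-entropy over (r_preferred, r_other) reward pairs;
    minimizing it fits the reward model to the preference labels."""
    return -sum(math.log(preference_prob(rp, ro)) for rp, ro in pairs) / len(pairs)
```

Equal rewards give a 0.5 preference probability; gradient descent on this loss pushes preferred segments toward higher predicted reward, which is what lets sparse human feedback shape the learned reward.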
arXiv Detail & Related papers (2020-07-25T10:37:15Z) - Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork [54.309495231017344]
We argue that AI systems should be trained in a human-centered manner, directly optimized for team performance.
We study this proposal for a specific type of human-AI teaming, where the human overseer chooses to either accept the AI recommendation or solve the task themselves.
Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the most accurate AI may not lead to the highest team performance.
arXiv Detail & Related papers (2020-04-27T19:06:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.