Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration
- URL: http://arxiv.org/abs/2412.15701v2
- Date: Thu, 16 Jan 2025 07:01:37 GMT
- Title: Collaborative Gym: A Framework for Enabling and Evaluating Human-Agent Collaboration
- Authors: Yijia Shao, Vinay Samuel, Yucheng Jiang, John Yang, Diyi Yang
- Abstract summary: Collaborative Gym is a framework enabling asynchronous, tripartite interaction among agents, humans, and task environments. We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance.
- Score: 51.452664740963066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advancements in language models (LMs) have sparked growing interest in developing LM agents. While fully autonomous agents could excel in many scenarios, numerous use cases inherently require them to collaborate with humans due to humans' latent preferences, domain expertise, or need for control. To facilitate the study of human-agent collaboration, we present Collaborative Gym (Co-Gym), a general framework enabling asynchronous, tripartite interaction among agents, humans, and task environments. We instantiate Co-Gym with three representative tasks in both simulated and real-world conditions, and propose an evaluation framework that assesses both the collaboration outcomes and processes. Our findings reveal that collaborative agents consistently outperform their fully autonomous counterparts in task performance within their delivered cases, achieving win rates of 86% in Travel Planning, 74% in Tabular Analysis, and 66% in Related Work when evaluated by real users. However, our study also highlights significant challenges in developing collaborative agents, requiring advancements in core aspects of intelligence -- communication capabilities, situational awareness, and balancing autonomy and human control.
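To make the abstract's central design concrete: asynchronous, tripartite interaction means the agent, the human, and the task environment each proceed concurrently rather than in rigid turns. The Python sketch below illustrates only that coordination pattern; every name in it (`TaskEnv`, `agent_loop`, the message queues) is hypothetical and is not Co-Gym's actual API.

```python
# A minimal sketch of asynchronous, tripartite interaction among an agent,
# a human, and a task environment. All names here are illustrative and are
# NOT Co-Gym's actual API; they only convey the coordination pattern.
import asyncio

class TaskEnv:
    """Shared task state that both parties may read and act on."""
    def __init__(self):
        self.state = {"draft": "", "done": False}

    def apply(self, actor: str, action: str) -> None:
        # Record who acted; a real environment would validate actions,
        # update task state, and notify observers of the change.
        self.state["draft"] += f"[{actor}] {action}\n"

async def agent_loop(env: TaskEnv, inbox: asyncio.Queue, outbox: asyncio.Queue):
    """The agent works autonomously but reacts when the human messages it."""
    while not env.state["done"]:
        try:
            # Pick up human input if any arrived since the last step.
            msg = inbox.get_nowait()
            env.apply("agent", f"revised plan per human note: {msg}")
        except asyncio.QueueEmpty:
            env.apply("agent", "continued working on the task")
        await outbox.put("status update")     # keep the human informed
        await asyncio.sleep(0.1)              # simulate thinking time

async def human_loop(env: TaskEnv, inbox: asyncio.Queue, outbox: asyncio.Queue):
    """A scripted human stand-in; a real study would use a UI or simulator."""
    for note in ["prefer window seats", "budget is $2000"]:
        await inbox.get()                     # wait for an agent update
        await outbox.put(note)                # inject a preference asynchronously
    env.state["done"] = True                  # the human retains final control

async def main():
    env = TaskEnv()
    to_agent, to_human = asyncio.Queue(), asyncio.Queue()
    await asyncio.gather(
        agent_loop(env, to_agent, to_human),
        human_loop(env, to_human, to_agent),
    )
    print(env.state["draft"])

asyncio.run(main())
```

Because neither loop blocks the other, the human can inject preferences at any time while the agent keeps working; this non-blocking interplay is the kind of behavior a process-level evaluation, as the paper proposes, would examine alongside task outcomes.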
Related papers
- Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination [37.90912492084769]
We study how reinforcement learning on a distribution of environments with a single partner enables learning general cooperative skills.
We introduce two Jax-based, procedural generators that create billions of solvable coordination challenges.
Our findings suggest that learning to collaborate across many unique scenarios encourages agents to develop general norms.
arXiv Detail & Related papers (2025-04-17T07:41:25Z)
- Human-AI Collaboration: Trade-offs Between Performance and Preferences [5.172575113585139]
We show that agents who are more considerate of human actions are preferred over purely performance-maximizing agents.
We find evidence for inequality-aversion effects being a driver of human choices, suggesting that people prefer collaborative agents which allow them to meaningfully contribute to the team.
arXiv Detail & Related papers (2025-02-28T23:50:14Z)
- Who is Helping Whom? Analyzing Inter-dependencies to Evaluate Cooperation in Human-AI Teaming [14.489157453882767]
We propose the concept of interdependence to measure how much agents rely on each other's actions to achieve the shared goal.
We pair state-of-the-art agents trained through MARL for HAT with learned human models in the popular Overcooked domain, and evaluate the team performance of these human-agent teams.
arXiv Detail & Related papers (2025-02-10T19:16:20Z)
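As a loose, invented illustration of the interdependence idea in the entry above (not the paper's actual definition), one can count how often an actor's action only pays off because of the partner's simultaneous contribution:

```python
# Toy sketch of an interdependence measure: the fraction of one actor's
# decisions whose payoff depends on the partner's simultaneous action.
# This is an invented illustration, not the paper's actual metric.
from typing import Callable, Sequence, Tuple

def interdependence(
    joint_steps: Sequence[Tuple[str, str]],   # (agent_action, human_action) per step
    payoff: Callable[[str, str], float],      # team payoff for a joint action
    noop: str = "noop",                       # stand-in for "partner did nothing"
) -> float:
    """Share of steps where removing the human's action changes the payoff."""
    if not joint_steps:
        return 0.0
    dependent = sum(
        1 for a, h in joint_steps
        if payoff(a, h) != payoff(a, noop)
    )
    return dependent / len(joint_steps)

# Example in Overcooked-style terms: plating only pays off if the
# partner actually passed an ingredient.
payoff = lambda a, h: 1.0 if (a == "plate" and h == "pass_onion") else 0.0
print(interdependence([("plate", "pass_onion"), ("chop", "noop")], payoff))  # 0.5
```

A score near 1 would indicate a tightly coupled team; a score near 0, two actors effectively working alone.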
- Mutual Theory of Mind in Human-AI Collaboration: An Empirical Study with LLM-driven AI Agents in a Real-time Shared Workspace Task [56.92961847155029]
Theory of Mind (ToM) significantly impacts human collaboration and communication as a crucial capability to understand others.
Mutual Theory of Mind (MToM) arises when AI agents with ToM capability collaborate with humans.
We find that the agent's ToM capability does not significantly impact team performance but enhances human understanding of the agent.
arXiv Detail & Related papers (2024-09-13T13:19:48Z)
- Decentralized and Lifelong-Adaptive Multi-Agent Collaborative Learning [57.652899266553035]
Decentralized and lifelong-adaptive multi-agent collaborative learning aims to enhance collaboration among multiple agents without a central server.
We propose DeLAMA, a decentralized multi-agent lifelong collaborative learning algorithm with dynamic collaboration graphs.
arXiv Detail & Related papers (2024-03-11T09:21:11Z)
- Large Language Model-based Human-Agent Collaboration for Complex Task Solving [94.3914058341565]
We introduce the problem of Large Language Model (LLM)-based human-agent collaboration for complex task-solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
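The ReHAC entry above frames "when should the human step in?" as a learned decision. The sketch below shows one generic way such a decision head could look; the features, weights, and threshold are invented for illustration and are not ReHAC's actual policy model or RL training setup.

```python
# Minimal sketch of an intervention-decision policy: a scorer maps the
# current task state to the probability that human intervention is
# worthwhile at this step. All features and parameters are hypothetical.
import math

def featurize(state: dict) -> list[float]:
    # Invented signals suggesting intervention may help right now.
    return [
        state.get("agent_uncertainty", 0.0),           # low-confidence next step
        state.get("steps_since_feedback", 0) / 10.0,   # long stretch without input
        1.0 if state.get("irreversible_action_pending") else 0.0,
    ]

def intervention_prob(state: dict, weights: list[float], bias: float) -> float:
    """Logistic score: higher means asking the human now looks more valuable."""
    z = bias + sum(w * x for w, x in zip(weights, featurize(state)))
    return 1.0 / (1.0 + math.exp(-z))

# In an RL formulation, parameters like these would be optimized so that
# requesting help maximizes task reward net of the cost of interrupting
# the human.
weights, bias = [2.0, 0.5, 3.0], -2.0
state = {"agent_uncertainty": 0.8, "irreversible_action_pending": True}
if intervention_prob(state, weights, bias) > 0.5:
    print("defer to human before acting")
```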
- MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents [27.911816995891726]
We introduce collaborative generative agents, endowing LLM-based Agents with consistent behavior patterns and task-solving abilities.
We propose a novel framework that equips collaborative generative agents with human-like reasoning abilities and specialized skills.
Our work provides valuable insights into the role and evolution of Large Language Models in task-oriented social simulations.
arXiv Detail & Related papers (2023-10-10T10:17:58Z)
- LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models [23.092480882456048]
This study aims at a detailed analysis of Large Language Models (LLMs) within the context of Pure Coordination Games.
Our findings indicate that LLM agents equipped with GPT-4-turbo achieve comparable performance to state-of-the-art reinforcement learning methods.
Results on Coordination QA show a large room for improvement in the Theory of Mind reasoning and joint planning abilities of LLMs.
arXiv Detail & Related papers (2023-10-05T21:18:15Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Building Cooperative Embodied Agents Modularly with Large Language Models [104.57849816689559]
We address challenging multi-agent cooperation problems with decentralized control, raw sensory observations, costly communication, and multi-objective tasks instantiated in various embodied environments.
We harness the commonsense knowledge, reasoning ability, language comprehension, and text generation prowess of LLMs and seamlessly incorporate them into a cognitive-inspired modular framework.
Our experiments on C-WAH and TDW-MAT demonstrate that CoELA driven by GPT-4 can surpass strong planning-based methods and exhibit emergent effective communication.
arXiv Detail & Related papers (2023-07-05T17:59:27Z)