Expected Value of Communication for Planning in Ad Hoc Teamwork
- URL: http://arxiv.org/abs/2103.01171v1
- Date: Mon, 1 Mar 2021 18:09:36 GMT
- Title: Expected Value of Communication for Planning in Ad Hoc Teamwork
- Authors: William Macke, Reuth Mirsky and Peter Stone
- Abstract summary: A desirable goal for autonomous agents is to be able to coordinate on the fly with previously unknown teammates.
One of the central challenges in ad hoc teamwork is quickly recognizing the current plans of other agents and planning accordingly.
We present a novel planning algorithm for ad hoc teamwork that determines which query to ask and plans accordingly.
- Score: 44.262891197318034
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A desirable goal for autonomous agents is to be able to coordinate on the fly
with previously unknown teammates. Known as "ad hoc teamwork", enabling such a
capability has been receiving increasing attention in the research community.
One of the central challenges in ad hoc teamwork is quickly recognizing the
current plans of other agents and planning accordingly. In this paper, we focus
on the scenario in which teammates can communicate with one another, but only
at a cost. Thus, they must carefully balance plan recognition based on
observations vs. that based on communication. This paper proposes a new metric
for evaluating how similar two policies that a teammate may be following are:
the Expected Divergence Point (EDP). We then present a novel planning algorithm
for ad hoc teamwork that determines which query to ask and plans accordingly.
We demonstrate the effectiveness of this algorithm in a range of increasingly
general problems of communication in ad hoc teamwork.
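To make the EDP idea concrete, below is a minimal illustrative sketch, not the paper's actual algorithm: it assumes a small set of deterministic candidate teammate policies, a toy deterministic transition function, and a placeholder cost model for queries; the function names (first_divergence, expected_divergence_point, worth_querying) are hypothetical.

```python
from itertools import combinations


def first_divergence(pi_a, pi_b, step, state, horizon):
    """First timestep at which two deterministic policies prescribe different
    actions, rolling the state forward with the shared action while they agree."""
    for t in range(horizon):
        a, b = pi_a(state), pi_b(state)
        if a != b:
            return t
        state = step(state, a)
    return horizon  # the policies never diverge within the horizon


def expected_divergence_point(policies, belief, step, state, horizon):
    """Belief-weighted average of pairwise divergence points: a rough proxy for
    how soon the uncertainty about the teammate's policy starts to matter."""
    edp, total = 0.0, 0.0
    for (i, pi_a), (j, pi_b) in combinations(enumerate(policies), 2):
        w = belief[i] * belief[j]
        edp += w * first_divergence(pi_a, pi_b, step, state, horizon)
        total += w
    return edp / total if total > 0 else float(horizon)


def worth_querying(edp, horizon, query_cost, step_cost=1.0):
    """Placeholder decision rule: ask only if the steps potentially wasted after
    the divergence point outweigh the cost of the query."""
    return (horizon - edp) * step_cost > query_cost


if __name__ == "__main__":
    # Toy 1-D corridor: the teammate is heading either right or left.
    step = lambda s, a: s + a
    policies = [lambda s: +1, lambda s: -1]
    belief = [0.5, 0.5]  # uniform belief over which policy the teammate follows
    edp = expected_divergence_point(policies, belief, step, state=0, horizon=10)
    print("EDP:", edp, "| query the teammate?",
          worth_querying(edp, horizon=10, query_cost=3.0))
```

In the paper, the planner chooses among possible queries by trading off their cost against how much they help resolve which policy the teammate is following; the crude rule above is only meant to illustrate that trade-off, not to reproduce the algorithm.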
Related papers
- GOMA: Proactive Embodied Cooperative Communication via Goal-Oriented Mental Alignment [72.96949760114575]
We propose a novel cooperative communication framework, Goal-Oriented Mental Alignment (GOMA).
GOMA formulates verbal communication as a planning problem that minimizes the misalignment between parts of agents' mental states that are relevant to the goals.
We evaluate our approach against strong baselines in two challenging environments, Overcooked (a multiplayer game) and VirtualHome (a household simulator).
arXiv Detail & Related papers (2024-03-17T03:52:52Z) - Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game [25.823665278297057]
This study focuses on the ad hoc teamwork problem where the agent operates in an environment driven by natural language.
Our findings reveal the potential of LLM agents in team collaboration, highlighting issues related to hallucinations in communication.
To address this issue, we develop CodeAct, a general agent that equips the LLM with enhanced memory and code-driven reasoning.
arXiv Detail & Related papers (2023-12-29T08:26:54Z) - Adaptation and Communication in Human-Robot Teaming to Handle Discrepancies in Agents' Beliefs about Plans [13.637799815698559]
We provide an online execution algorithm based on Monte Carlo Tree Search for the agent to plan its action.
We show that our agent is better equipped to work in teams without the guarantee of a shared mental model.
arXiv Detail & Related papers (2023-07-07T03:05:34Z) - Inferring the Goals of Communicating Agents from Actions and Instructions [47.5816320484482]
We introduce a model of a cooperative team where one agent, the principal, may communicate natural language instructions about their shared plan to another agent, the assistant.
We show how a third person observer can infer the team's goal via multi-modal inverse planning from actions and instructions.
We evaluate this approach by comparing it with human goal inferences in a multi-agent gridworld, finding that our model's inferences closely correlate with human judgments.
arXiv Detail & Related papers (2023-06-28T13:43:46Z) - Cooperative Actor-Critic via TD Error Aggregation [12.211031907519827]
We introduce a decentralized actor-critic algorithm with TD error aggregation that does not violate the agents' privacy.
We provide a convergence analysis under diminishing step size to verify that the agents maximize the team-average objective function.
arXiv Detail & Related papers (2022-07-25T21:10:39Z) - A Survey of Ad Hoc Teamwork: Definitions, Methods, and Open Problems [38.995115886327696]
Ad hoc teamwork is the well-established research problem of designing agents that can collaborate with new teammates without prior coordination.
This survey makes a two-fold contribution. First, it provides a structured description of the different facets of the ad hoc teamwork problem.
Second, it discusses the progress that has been made in the field so far, and identifies the immediate and long-term open problems that need to be addressed in the field of ad hoc teamwork.
arXiv Detail & Related papers (2022-02-16T18:16:27Z) - Assisting Unknown Teammates in Unknown Tasks: Ad Hoc Teamwork under Partial Observability [15.995282665634097]
We present a novel online prediction algorithm for the problem setting of ad hoc teamwork under partial observability (ATPO).
ATPO accommodates partial observability, using the agent's observations to identify which task is being performed by the teammates.
Our results show that ATPO is effective and robust in identifying the teammate's task from a large library of possible tasks, efficient at solving it in near-optimal time, and scalable in adapting to increasingly larger problem sizes.
arXiv Detail & Related papers (2022-01-10T18:53:34Z) - Interpretation of Emergent Communication in Heterogeneous Collaborative Embodied Agents [83.52684405389445]
We introduce the collaborative multi-object navigation task CoMON.
In this task, an oracle agent has detailed environment information in the form of a map.
It communicates with a navigator agent that perceives the environment visually and is tasked to find a sequence of goals.
We show that the emergent communication can be grounded to the agent observations and the spatial structure of the 3D environment.
arXiv Detail & Related papers (2021-10-12T06:56:11Z) - Quasi-Equivalence Discovery for Zero-Shot Emergent Communication [63.175848843466845]
We present a novel problem setting and the Quasi-Equivalence Discovery algorithm that allows for zero-shot coordination (ZSC).
We show that these two factors lead to unique optimal ZSC policies in referential games.
QED can iteratively discover the symmetries in this setting and converges to the optimal ZSC policy.
arXiv Detail & Related papers (2021-03-14T23:42:37Z) - Model-based Reinforcement Learning for Decentralized Multiagent Rendezvous [66.6895109554163]
Underlying humans' ability to align their goals with those of other agents is their ability to predict the intentions of others and actively update their own plans.
We propose hierarchical predictive planning (HPP), a model-based reinforcement learning method for decentralized multiagent rendezvous.
arXiv Detail & Related papers (2020-03-15T19:49:20Z)