Symmetry-Breaking Augmentations for Ad Hoc Teamwork
- URL: http://arxiv.org/abs/2402.09984v1
- Date: Thu, 15 Feb 2024 14:49:28 GMT
- Title: Symmetry-Breaking Augmentations for Ad Hoc Teamwork
- Authors: Ravi Hammond, Dustin Craggs, Mingyu Guo, Jakob Foerster, Ian Reid
- Abstract summary: In many collaborative settings, artificial intelligence (AI) agents must be able to adapt to new teammates that use unknown or previously unobserved strategies.
We introduce symmetry-breaking augmentations (SBA), which increase diversity in the behaviour of training teammates by applying a symmetry-flipping operation.
We demonstrate this experimentally in two settings, and show that our approach improves upon previous ad hoc teamwork results in the challenging card game Hanabi.
- Score: 10.014956508924842
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In many collaborative settings, artificial intelligence (AI) agents must be
able to adapt to new teammates that use unknown or previously unobserved
strategies. While often simple for humans, this can be challenging for AI
agents. For example, if an AI agent learns to drive alongside others (a
training set) that only drive on one side of the road, it may struggle to adapt
this experience to coordinate with drivers on the opposite side, even if their
behaviours are simply flipped along the left-right symmetry. To address this, we
introduce symmetry-breaking augmentations (SBA), which increase diversity in
the behaviour of training teammates by applying a symmetry-flipping operation.
By learning a best-response to the augmented set of teammates, our agent is
exposed to a wider range of behavioural conventions, improving performance when
deployed with novel teammates. We demonstrate this experimentally in two
settings, and show that our approach improves upon previous ad hoc teamwork
results in the challenging card game Hanabi. We also propose a general metric
for estimating symmetry-dependency amongst a given set of policies.
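A minimal sketch of the core idea in Python (not the authors' implementation): a training teammate is wrapped by a symmetry-flipping operation, and the augmented pool is what a best-response agent would then train against. The toy driving task, the `keep_left_driver` teammate, and the flip functions are all illustrative assumptions.

```python
import random

# Hypothetical left-right symmetry for a toy driving task: mirroring the world
# negates the lateral observation and swaps the steering actions.
LEFT, RIGHT, STRAIGHT = 0, 1, 2

def flip_obs(obs):
    return -obs  # obs: signed lateral offset from the road centre

def flip_action(action):
    return {LEFT: RIGHT, RIGHT: LEFT, STRAIGHT: STRAIGHT}[action]

def symmetry_flip(policy):
    """Return a teammate whose behaviour is the mirror image of `policy`."""
    def flipped(obs):
        return flip_action(policy(flip_obs(obs)))
    return flipped

def keep_left_driver(obs):
    # Toy training teammate: steer left until fully in the left lane (obs <= -1).
    return LEFT if obs > -1.0 else STRAIGHT

# SBA: augment the training pool with symmetry-flipped copies of each teammate.
base_pool = [keep_left_driver]
augmented_pool = base_pool + [symmetry_flip(p) for p in base_pool]

# A best-response learner would sample a teammate from the augmented pool each
# episode, so it also sees the "drive on the right" convention during training.
teammate = random.choice(augmented_pool)
print(keep_left_driver(0.5), symmetry_flip(keep_left_driver)(0.5))  # LEFT vs RIGHT
```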
Related papers
- Toward Optimal LLM Alignments Using Two-Player Games [86.39338084862324]
In this paper, we investigate alignment through the lens of two-agent games, involving iterative interactions between an adversarial and a defensive agent.
We theoretically demonstrate that this iterative reinforcement learning optimization converges to a Nash Equilibrium for the game induced by the agents.
Experimental results in safety scenarios demonstrate that learning in such a competitive environment not only fully trains agents but also leads to policies with enhanced generalization capabilities for both adversarial and defensive agents.
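The paper trains LLM agents with iterative RL; the snippet below is only a toy matrix-game analogue of the alternating adversary/defender dynamic, using fictitious play on a made-up zero-sum payoff matrix to show convergence towards a Nash equilibrium.

```python
import numpy as np

# Toy zero-sum stand-in for the adversarial (row) vs. defensive (column) game.
payoff = np.array([[1.0, -1.0],
                   [-1.0, 1.0]])  # matching pennies

adv_counts = np.ones(2)  # empirical action counts of the adversary
def_counts = np.ones(2)  # empirical action counts of the defender

for _ in range(10_000):
    # Each side best-responds to the other's empirical strategy (fictitious play).
    adv_action = int(np.argmax(payoff @ (def_counts / def_counts.sum())))
    def_action = int(np.argmin((adv_counts / adv_counts.sum()) @ payoff))
    adv_counts[adv_action] += 1
    def_counts[def_action] += 1

# Both empirical strategies approach the (0.5, 0.5) equilibrium of this game.
print(adv_counts / adv_counts.sum(), def_counts / def_counts.sum())
```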
arXiv Detail & Related papers (2024-06-16T15:24:50Z) - Optimizing Risk-averse Human-AI Hybrid Teams [1.433758865948252]
We propose a manager which learns, through a standard Reinforcement Learning scheme, how to best delegate.
We demonstrate the optimality of our manager's performance in several grid environments.
Our results show our manager can successfully learn desirable delegations which result in team paths that are near-optimal or exactly optimal.
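As a rough illustration only (the paper uses a full RL manager with risk-averse objectives in grid environments), the delegation decision can be pictured as a bandit that learns which actor to hand control to; the success rates below are invented.

```python
import random

# Hypothetical success rates, unknown to the manager.
TRUE_SUCCESS = {"human": 0.7, "agent": 0.9}
q = {"human": 0.0, "agent": 0.0}     # estimated value of delegating to each actor
counts = {"human": 0, "agent": 0}

for _ in range(5_000):
    # Epsilon-greedy delegation choice.
    actor = random.choice(list(q)) if random.random() < 0.1 else max(q, key=q.get)
    reward = 1.0 if random.random() < TRUE_SUCCESS[actor] else 0.0
    counts[actor] += 1
    q[actor] += (reward - q[actor]) / counts[actor]  # incremental sample mean

print(max(q, key=q.get))  # the manager learns to delegate to the stronger actor
```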
arXiv Detail & Related papers (2024-03-13T09:49:26Z) - ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
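A skeleton of the analyze / infer-intention / plan loop described above, with `call_llm` as a hypothetical stand-in for any language-model API; none of the prompts or helper names come from ProAgent itself.

```python
def call_llm(prompt: str) -> str:
    # Placeholder for a real LLM call; returns a canned answer for this sketch.
    return "teammate is heading for the onion"

def analyze_state(observation: dict) -> str:
    return f"Current state: {observation}"

def infer_teammate_intention(state_summary: str, teammate_history: list) -> str:
    prompt = (f"{state_summary}\nTeammate's recent actions: {teammate_history}\n"
              "What is the teammate likely trying to do?")
    return call_llm(prompt)

def plan_next_action(state_summary: str, intention: str) -> str:
    return call_llm(f"{state_summary}\nTeammate intention: {intention}\nBest next action?")

obs = {"pot": "empty", "teammate_pos": (2, 3)}
summary = analyze_state(obs)
intention = infer_teammate_intention(summary, ["move_right", "move_right"])
print(plan_next_action(summary, intention))
```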
arXiv Detail & Related papers (2023-08-22T10:36:56Z) - Mastering Asymmetrical Multiplayer Game with Multi-Agent Asymmetric-Evolution Reinforcement Learning [8.628547849796615]
Asymmetrical multiplayer (AMP) games are a popular genre in which multiple types of agents compete or collaborate.
It is difficult to train powerful agents that can defeat top human players in AMP games with the typical self-play training method because of the unbalanced characteristics of their asymmetrical environments.
We propose asymmetric-evolution training (AET), a novel multi-agent reinforcement learning framework that can train multiple kinds of agents simultaneously in AMP games.
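A schematic sketch of the population structure implied by the summary, not the AET algorithm itself: one evolving pool per agent type, with each type trained against frozen snapshots sampled from the other type's pool. The role names and the `train_one_iteration` stub are assumptions.

```python
import random

class Policy:
    def __init__(self, name):
        self.name = name

def train_one_iteration(learner, frozen_opponent):
    # Stand-in for an RL update of `learner` against a fixed opponent.
    return Policy(learner.name + "+")

# One pool per agent type in the asymmetrical game (names are illustrative).
pools = {"hunter": [Policy("hunter_v0")], "survivor": [Policy("survivor_v0")]}

for _ in range(3):
    for role, other in (("hunter", "survivor"), ("survivor", "hunter")):
        learner = pools[role][-1]
        opponent = random.choice(pools[other])   # frozen snapshot of the other type
        pools[role].append(train_one_iteration(learner, opponent))

print([p.name for p in pools["hunter"]])
```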
arXiv Detail & Related papers (2023-04-20T07:14:32Z) - On-the-fly Strategy Adaptation for ad-hoc Agent Coordination [21.029009561094725]
Training agents in cooperative settings offers the promise of AI agents able to interact effectively with humans (and other agents) in the real world.
The vast majority of focus has been on the self-play paradigm.
This paper proposes to solve this problem by adapting agent strategies on the fly, using a posterior belief over the other agents' strategy.
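A minimal sketch of the posterior-belief idea under simplifying assumptions (a known, finite set of candidate teammate strategies): the belief is updated with Bayes' rule from observed teammate actions, after which the agent can switch to the best response for the most likely type.

```python
import numpy as np

# Two hypothetical teammate strategies over three actions; the true teammate is type 1.
strategies = np.array([
    [0.7, 0.2, 0.1],   # type 0
    [0.1, 0.2, 0.7],   # type 1
])
belief = np.array([0.5, 0.5])          # uniform prior over teammate types
rng = np.random.default_rng(0)
true_type = 1

for _ in range(50):
    observed = rng.choice(3, p=strategies[true_type])    # teammate acts
    belief = belief * strategies[:, observed]            # Bayes' rule: prior x likelihood
    belief /= belief.sum()

print(belief)  # the posterior mass should concentrate on the true teammate type
```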
arXiv Detail & Related papers (2022-03-08T02:18:11Z) - Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
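A toy numerical sketch of the low-rank subspace idea (dimensions and data are made up): strategy parameters of known partners are factorised with an SVD, and a new partner is expressed as coefficients in the recovered subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 10))          # hidden 2-D strategy subspace
train_coeffs = rng.normal(size=(8, 2))    # 8 training partners
partners = train_coeffs @ basis           # their (synthetic) strategy parameters

# Recover a rank-2 basis for the partner strategies from data.
_, _, vt = np.linalg.svd(partners, full_matrices=False)
learned_basis = vt[:2]                    # orthonormal rows spanning the subspace

# A new partner is inferred by projecting its parameters into the subspace;
# adapting to it amounts to interpolating within that subspace.
new_partner = np.array([0.3, -1.2]) @ basis
coeffs = learned_basis @ new_partner
reconstruction = coeffs @ learned_basis
print(np.allclose(reconstruction, new_partner, atol=1e-6))  # True
```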
arXiv Detail & Related papers (2022-01-05T04:40:13Z) - PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi\Phi$-learning.
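The sketch below shows only the successor-feature building block that $\Psi\Phi$-learning rests on, not ITD or the full algorithm: in a tiny deterministic chain, psi(s) = E[sum_t gamma^t phi(s_t)] is learned by TD, and values for any demonstrated reward weights w follow from V(s) = psi(s) . w. The environment is invented.

```python
import numpy as np

n_states, gamma = 3, 0.9
phi = np.eye(n_states)              # one-hot state features
next_state = [1, 2, 2]              # deterministic chain 0 -> 1 -> 2 (absorbing)
psi = np.zeros((n_states, n_states))

for _ in range(2000):
    for s in range(n_states):
        target = phi[s] + gamma * psi[next_state[s]]
        psi[s] += 0.1 * (target - psi[s])       # TD(0) on the feature expectations

w = np.array([0.0, 0.0, 1.0])       # reward weights implied by a demonstration
print(psi @ w)                      # state values under that reward, via V = psi @ w
```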
arXiv Detail & Related papers (2021-02-24T21:12:09Z) - Multi-Agent Collaboration via Reward Attribution Decomposition [75.36911959491228]
We propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge.
CollaQ is evaluated on various StarCraft maps and shows that it outperforms existing state-of-the-art techniques.
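A hedged sketch of the decomposition idea only, not CollaQ's actual architecture: each agent's Q-value is split into a self term that sees only the agent's own observation and an interactive term that also sees the other agents. The layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class DecomposedQ(nn.Module):
    def __init__(self, self_dim=8, full_dim=24, n_actions=5):
        super().__init__()
        self.q_self = nn.Sequential(nn.Linear(self_dim, 32), nn.ReLU(),
                                    nn.Linear(32, n_actions))
        self.q_interact = nn.Sequential(nn.Linear(full_dim, 32), nn.ReLU(),
                                        nn.Linear(32, n_actions))

    def forward(self, self_obs, full_obs):
        # Q(o_i, a) = Q_self(own obs, a) + Q_interact(full obs, a)
        return self.q_self(self_obs) + self.q_interact(full_obs)

net = DecomposedQ()
q_values = net(torch.zeros(1, 8), torch.zeros(1, 24))
print(q_values.shape)  # torch.Size([1, 5])
```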
arXiv Detail & Related papers (2020-10-16T17:42:11Z) - Moody Learners -- Explaining Competitive Behaviour of Reinforcement Learning Agents [65.2200847818153]
In a competitive scenario, the agent not only faces a dynamic environment but is also directly affected by the opponents' actions.
Observing the agent's Q-values is a common way of explaining its behaviour; however, Q-values alone do not show the temporal relation between the selected actions.
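A toy illustration of that limitation with made-up numbers: a per-step log of Q-values explains each greedy choice in isolation, while the relation between successive choices over an episode still has to be modelled separately.

```python
import numpy as np

rng = np.random.default_rng(0)
q_table = rng.normal(size=(4, 3))      # 4 states x 3 actions, invented values
trajectory = [0, 2, 1, 3]              # states visited during one episode

for t, state in enumerate(trajectory):
    action = int(np.argmax(q_table[state]))
    print({"t": t, "state": state, "q": q_table[state].round(2), "action": action})
    # Each row justifies one decision; linking rows across time (the temporal
    # relation between actions) requires an additional explanatory model.
```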
arXiv Detail & Related papers (2020-07-30T11:30:42Z) - "Other-Play" for Zero-Shot Coordination [21.607428852157273]
The other-play (OP) learning algorithm enhances self-play by searching for more robust strategies that exploit known symmetries in the underlying problem.
We study the cooperative card game Hanabi and show that OP agents achieve higher scores when paired with independently trained agents.
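A toy coordination-game sketch (not the OP training procedure): two players score only if they pick the same lever. A convention found by plain self-play breaks under a random relabelling of the levers, which is exactly the kind of arbitrary symmetry-breaking that other-play penalises during training.

```python
import random

N = 5  # number of levers in this made-up game

def score(choice_a, choice_b, relabelling):
    # Player B sees the levers through `relabelling`, a permutation of their indices.
    return 1.0 if relabelling[choice_a] == choice_b else 0.0

self_play_convention = 3                      # both agents agreed on lever 3 in training
identity = list(range(N))
random_symmetry = random.sample(range(N), N)  # an equivalent but relabelled partner

print(score(self_play_convention, self_play_convention, identity))         # 1.0
print(score(self_play_convention, self_play_convention, random_symmetry))  # usually 0.0
```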
arXiv Detail & Related papers (2020-03-06T00:39:37Z)