Reliably Re-Acting to Partner's Actions with the Social Intrinsic
Motivation of Transfer Empowerment
- URL: http://arxiv.org/abs/2203.03355v1
- Date: Mon, 7 Mar 2022 13:03:35 GMT
- Title: Reliably Re-Acting to Partner's Actions with the Social Intrinsic
Motivation of Transfer Empowerment
- Authors: Tessa van der Heiden, Herke van Hoof, Efstratios Gavves, Christoph
Salge
- Abstract summary: We consider multi-agent reinforcement learning (MARL) for cooperative communication and coordination tasks.
MARL agents can be brittle because they can overfit their training partners' policies.
Our objective is to bias the learning process towards strategies that react to other agents' behaviors.
- Score: 40.24079015603578
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider multi-agent reinforcement learning (MARL) for cooperative
communication and coordination tasks. MARL agents can be brittle because they
can overfit their training partners' policies. This overfitting can produce
agents whose policies assume that other agents will act in a certain way
rather than reacting to their actions. Our objective is to bias the learning
process towards strategies that react to other agents' behaviors. Our method,
transfer empowerment, measures the potential
influence between agents' actions. Results from three simulated cooperation
scenarios support our hypothesis that transfer empowerment improves MARL
performance. We discuss how transfer empowerment could be a useful principle to
guide multi-agent coordination by ensuring reactiveness to one's partner.
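To make the method concrete, the sketch below estimates that potential influence as the mutual information between one agent's action and its partner's next state, sampled through a hypothetical simulator hook `step_fn` (illustrative only, not the paper's code). Transfer empowerment proper is the channel capacity, i.e. this quantity maximised over action distributions, so sampling actions uniformly gives a lower bound.

```python
import numpy as np
from collections import Counter

def transfer_influence(step_fn, state, n_actions, n_samples=2000):
    """Monte Carlo estimate of I(A_self ; S'_partner | state): how much
    this agent's action choice could change its partner's next state.
    `step_fn(state, action)` must return the partner's (discretised)
    next state; it is a hypothetical hook, not an API from the paper.
    Uniform action sampling yields a lower bound on transfer
    empowerment, which maximises this MI over action distributions."""
    joint = Counter()
    for _ in range(n_samples):
        a = np.random.randint(n_actions)      # uniform action channel
        s_partner = step_fn(state, a)         # partner's resulting state
        joint[(a, s_partner)] += 1
    n = sum(joint.values())
    p_a, p_s = Counter(), Counter()
    for (a, s), c in joint.items():
        p_a[a] += c / n
        p_s[s] += c / n
    return sum((c / n) * np.log2((c / n) / (p_a[a] * p_s[s]))
               for (a, s), c in joint.items())
```

Used as an intrinsic bonus, a high value marks states in which the partner's future still depends on what this agent does, which is one way to read the reactiveness the abstract argues for.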
Related papers
- Situation-Dependent Causal Influence-Based Cooperative Multi-agent
Reinforcement Learning [18.054709749075194]
We propose a novel MARL algorithm named Situation-Dependent Causal Influence-Based Cooperative Multi-agent Reinforcement Learning (SCIC).
Our approach detects inter-agent causal influences in specific situations using a criterion based on causal intervention and conditional mutual information.
The resulting update links coordinated exploration with intrinsic reward distribution, enhancing overall collaboration and performance.
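As a rough illustration of the criterion's ingredients (my own simplification, not the paper's estimator), the conditional mutual information between one agent's action and another agent's next state, given the current situation, can be estimated from logged transitions with a plug-in count-based estimate:

```python
import numpy as np
from collections import Counter

def conditional_mi(samples):
    """Plug-in estimate of I(a_i ; s_j' | s) from (s, a_i, s_j') tuples,
    the kind of quantity a causal-influence criterion can be built on.
    All entries must be hashable (e.g. discretised states)."""
    by_s, by_sa, by_ss, joint = Counter(), Counter(), Counter(), Counter()
    for s, a, sp in samples:
        by_s[s] += 1
        by_sa[(s, a)] += 1
        by_ss[(s, sp)] += 1
        joint[(s, a, sp)] += 1
    n = len(samples)
    mi = 0.0
    for (s, a, sp), c in joint.items():
        # p(s,a,s') p(s) / (p(s,a) p(s,s')) with the sample sizes cancelled
        mi += (c / n) * np.log2((c * by_s[s]) / (by_sa[(s, a)] * by_ss[(s, sp)]))
    return mi
```

A large value in a particular situation signals that agent i's action causally matters for agent j there, which is where an intrinsic reward of this kind would concentrate.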
arXiv Detail & Related papers (2023-12-15T05:09:32Z)
- ProAgent: Building Proactive Cooperative Agents with Large Language
Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- Learning to Participate through Trading of Reward Shares [1.5484595752241124]
We propose a method inspired by the stock market, where agents have the opportunity to participate in other agents' returns by acquiring reward shares.
Intuitively, an agent may learn to act according to the common interest when it is directly affected by the other agents' rewards.
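A minimal sketch of the participation mechanism as I read the summary (in the paper the share allocation is traded and learned; here it is just a fixed input):

```python
import numpy as np

def effective_returns(env_returns, shares):
    """env_returns[j]: agent j's own environment return.
    shares[i, j]: fraction of agent j's return held by agent i, with
    each column summing to 1 (the diagonal is the stake an agent keeps
    in itself).  Agent i then optimises sum_j shares[i, j] * R_j, so it
    is directly affected by the rewards of agents it holds shares in."""
    shares = np.asarray(shares, dtype=float)
    assert np.allclose(shares.sum(axis=0), 1.0), "columns must sum to 1"
    return shares @ np.asarray(env_returns, dtype=float)

# Example: agent 0 holds 20% of agent 1's return.
print(effective_returns([1.0, 5.0], [[1.0, 0.2],
                                     [0.0, 0.8]]))  # -> [2. 4.]
```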
arXiv Detail & Related papers (2023-01-18T10:25:55Z)
- Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent
RL [107.58821842920393]
We quantify agents' behavior differences and relate them to policy performance via Role Diversity.
We find that the error bound in MARL can be decomposed into three parts that have a strong relation to the role diversity.
The decomposed factors can significantly impact policy optimization along three popular directions.
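The summary does not define Role Diversity precisely, so the following is only an assumed proxy: measuring behavior difference as the average pairwise total-variation distance between agents' action distributions over a batch of sampled states.

```python
import numpy as np
from itertools import combinations

def behavior_diversity(policies, states):
    """Average pairwise total-variation distance between agents' action
    distributions over sampled states.  `policies[i](s)` must return
    agent i's action-probability vector; this is an assumed proxy for
    behaviour-based role diversity, not the paper's definition."""
    dists = []
    for i, j in combinations(range(len(policies)), 2):
        tv = np.mean([0.5 * np.abs(policies[i](s) - policies[j](s)).sum()
                      for s in states])
        dists.append(tv)
    return float(np.mean(dists))
```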
arXiv Detail & Related papers (2022-06-01T04:58:52Z)
- Coordinating Policies Among Multiple Agents via an Intelligent
Communication Channel [81.39444892747512]
In Multi-Agent Reinforcement Learning (MARL), specialized channels are often introduced that allow agents to communicate directly with one another.
We propose an alternative approach whereby agents communicate through an intelligent facilitator that learns to sift through and interpret signals provided by all agents to improve the agents' collective performance.
arXiv Detail & Related papers (2022-05-21T14:11:33Z)
- Iterated Reasoning with Mutual Information in Cooperative and Byzantine
Decentralized Teaming [0.0]
We show that reformulating an agent's policy to be conditional on the policies of its teammates inherently maximizes a Mutual Information (MI) lower bound when optimizing under Policy Gradient (PG).
Our approach, InfoPG, outperforms baselines in learning emergent collaborative behaviors and sets the state-of-the-art in decentralized cooperative MARL tasks.
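A toy sketch of the reformulation the summary describes: the agent's action distribution is computed from its own observation concatenated with its teammates' current action distributions, so the policy is explicitly conditional on theirs (the linear weights below are illustrative, not the paper's architecture).

```python
import numpy as np

def conditional_policy(obs, teammate_action_probs, W):
    """Action distribution for one agent conditioned on its teammates'
    action distributions, the structural choice the MI lower-bound
    argument rests on.  W is an (n_actions x input_dim) weight matrix;
    a real implementation would be a learned network."""
    x = np.concatenate([obs, *teammate_action_probs])  # condition on teammates
    logits = W @ x
    z = np.exp(logits - logits.max())                  # stable softmax
    return z / z.sum()
```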
arXiv Detail & Related papers (2022-01-20T22:54:32Z)
- Promoting Resilience in Multi-Agent Reinforcement Learning via
Confusion-Based Communication [5.367993194110255]
We highlight the relationship between a group's ability to collaborate effectively and the group's resilience.
To promote resilience, we suggest facilitating collaboration via a novel confusion-based communication protocol.
We present an empirical evaluation of our approach in a variety of MARL settings.
arXiv Detail & Related papers (2021-11-12T09:03:19Z)
- Cooperative and Competitive Biases for Multi-Agent Reinforcement
Learning [12.676356746752893]
Training a multi-agent reinforcement learning (MARL) algorithm is more challenging than training a single-agent reinforcement learning algorithm.
We propose an algorithm that boosts MARL training using the biased action information of other agents based on a friend-or-foe concept.
We empirically demonstrate that our algorithm outperforms existing algorithms in various mixed cooperative-competitive environments.
arXiv Detail & Related papers (2021-01-18T05:52:22Z)
- Learning Latent Representations to Influence Multi-Agent Interaction [65.44092264843538]
We propose a reinforcement learning-based framework for learning latent representations of an agent's policy.
We show that our approach outperforms the alternatives and learns to influence the other agent.
arXiv Detail & Related papers (2020-11-12T19:04:26Z)
- Learning to Incentivize Other Learning Agents [73.03133692589532]
We show how to equip RL agents with the ability to give rewards directly to other agents, using a learned incentive function.
Such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games.
Our work points toward more opportunities and challenges along the path to ensure the common good in a multi-agent future.
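A minimal sketch of the reward flow (in the paper the incentive values come from each agent's learned incentive function; the giving cost below is an assumption of this sketch):

```python
import numpy as np

def rewards_with_incentives(env_rewards, incentives, give_cost=0.1):
    """incentives[j, i]: non-negative reward agent j gives agent i this
    step, which in the paper is the output of a learned incentive
    function.  Givers pay a cost proportional to what they hand out,
    so incentivising others is a trade-off rather than free money."""
    incentives = np.asarray(incentives, dtype=float)
    np.fill_diagonal(incentives, 0.0)             # no paying yourself
    received = incentives.sum(axis=0)             # column i: gifts to i
    paid = give_cost * incentives.sum(axis=1)     # row j: what j hands out
    return np.asarray(env_rewards, dtype=float) + received - paid
```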
arXiv Detail & Related papers (2020-06-10T20:12:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.