ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward
- URL: http://arxiv.org/abs/2210.04365v1
- Date: Sun, 9 Oct 2022 22:24:44 GMT
- Title: ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward
- Authors: Zixian Ma, Rose Wang, Li Fei-Fei, Michael Bernstein, Ranjay Krishna
- Abstract summary: We propose a self-supervised intrinsic reward ELIGN - expectation alignment.
Similar to how animals collaborate in a decentralized manner with those in their vicinity, agents trained with expectation alignment learn behaviors that match their neighbors' expectations.
We show that agent coordination improves through expectation alignment because agents learn to divide tasks amongst themselves, break coordination symmetries, and confuse adversaries.
- Score: 29.737986509769808
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern multi-agent reinforcement learning frameworks rely on centralized
training and reward shaping to perform well. However, centralized training and
dense rewards are not readily available in the real world. Current multi-agent
algorithms struggle to learn in the alternative setup of decentralized training
or sparse rewards. To address these issues, we propose a self-supervised
intrinsic reward ELIGN - expectation alignment - inspired by the
self-organization principle in zoology. Similar to how animals collaborate in a
decentralized manner with those in their vicinity, agents trained with
expectation alignment learn behaviors that match their neighbors' expectations.
This allows the agents to learn collaborative behaviors without any external
reward or centralized training. We demonstrate the efficacy of our approach
across 6 tasks in the multi-agent particle and the complex Google Research
Football environments, comparing ELIGN to sparse and curiosity-based intrinsic
rewards. When the number of agents increases, ELIGN scales well in all
multi-agent tasks except for one where agents have different capabilities. We
show that agent coordination improves through expectation alignment because
agents learn to divide tasks amongst themselves, break coordination symmetries,
and confuse adversaries. These results identify tasks where expectation
alignment is a more useful strategy than curiosity-driven exploration for
multi-agent coordination, enabling agents to do zero-shot coordination.
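To make the expectation-alignment idea above concrete, below is a minimal sketch under one plausible reading of the abstract: each agent keeps a learned forward model of its local observation, and an agent's intrinsic reward is the negative mean error between what its neighbors' models expected it to observe next and the observation it actually reached. The class and function names (ForwardModel, elign_reward), the linear model, and the neighbor bookkeeping are illustrative assumptions, not the paper's reference implementation.

```python
import numpy as np


class ForwardModel:
    """Toy linear stand-in for a learned dynamics model: o' ~ W [o; a]."""

    def __init__(self, obs_dim, act_dim, rng):
        self.W = rng.normal(scale=0.1, size=(obs_dim, obs_dim + act_dim))

    def predict(self, obs, act):
        return self.W @ np.concatenate([obs, act])


def elign_reward(agent, obs, act, next_obs, models, neighbors):
    """Intrinsic reward for `agent`: negative mean error between each neighbor's
    expectation of the agent's next observation and the observation it actually
    reached. No neighbors in range means no alignment signal (reward 0)."""
    errs = [
        np.linalg.norm(models[j].predict(obs[agent], act[agent]) - next_obs[agent])
        for j in neighbors[agent]
    ]
    return -float(np.mean(errs)) if errs else 0.0


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    obs_dim, act_dim, n_agents = 4, 2, 3
    models = {i: ForwardModel(obs_dim, act_dim, rng) for i in range(n_agents)}
    obs = {i: rng.normal(size=obs_dim) for i in range(n_agents)}
    act = {i: rng.normal(size=act_dim) for i in range(n_agents)}
    next_obs = {i: rng.normal(size=obs_dim) for i in range(n_agents)}
    neighbors = {0: [1, 2], 1: [0], 2: []}  # who is currently within each agent's view
    print([round(elign_reward(i, obs, act, next_obs, models, neighbors), 3)
           for i in range(n_agents)])
```

In this sketch the decentralized character is carried entirely by the neighbors map: an agent only receives alignment signal from agents currently in its vicinity, so no centralized critic or shared external reward is required.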
Related papers
- ProAgent: Building Proactive Cooperative Agents with Large Language
Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z) - AgentVerse: Facilitating Multi-Agent Collaboration and Exploring
Emergent Behaviors [93.38830440346783]
We propose AgentVerse, a multi-agent framework that can collaboratively adjust its composition as a greater-than-the-sum-of-its-parts system.
Our experiments demonstrate that AgentVerse can effectively deploy multi-agent groups that outperform a single agent.
In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups.
arXiv Detail & Related papers (2023-08-21T16:47:11Z) - Consensus Learning for Cooperative Multi-Agent Reinforcement Learning [12.74348597962689]
We propose consensus learning for cooperative multi-agent reinforcement learning.
We feed the inferred consensus as an explicit input to the network of agents.
Our proposed method can be extended to various multi-agent reinforcement learning algorithms.
arXiv Detail & Related papers (2022-06-06T12:43:07Z) - LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent
Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z) - Decentralized Cooperative Multi-Agent Reinforcement Learning with
Exploration [35.75029940279768]
We study multi-agent reinforcement learning in the most basic cooperative setting -- Markov teams.
We propose an algorithm in which each agent independently runs a stage-based V-learning style algorithm.
We show that the agents can learn an $\epsilon$-approximate Nash equilibrium policy in at most $\widetilde{O}(1/\epsilon^4)$ episodes.
arXiv Detail & Related papers (2021-10-12T02:45:12Z) - Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z) - Two-stage training algorithm for AI robot soccer [2.0757564643017092]
Two-stage heterogeneous centralized training is proposed to improve the learning performance of heterogeneous agents.
The proposed method is applied to 5 versus 5 AI robot soccer for validation.
arXiv Detail & Related papers (2021-04-13T04:24:13Z) - UneVEn: Universal Value Exploration for Multi-Agent Reinforcement
Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z) - Learning to Incentivize Other Learning Agents [73.03133692589532]
We show how to equip RL agents with the ability to give rewards directly to other agents, using a learned incentive function.
Such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games.
Our work points toward more opportunities and challenges along the path to ensure the common good in a multi-agent future.
arXiv Detail & Related papers (2020-06-10T20:12:38Z)
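As a rough illustration of the reward-giving mechanism summarized in "Learning to Incentivize Other Learning Agents" above, the sketch below shows only the reward bookkeeping: each agent owns an incentive function that hands extra reward to peers on top of the environment reward, at a small cost to itself. The incentive map is a fixed bilinear toy, the cost term is an assumption, and the paper's actual training step (differentiating through recipients' policy updates) is omitted; all names are illustrative.

```python
import numpy as np


def incentive(W_giver, obs_giver, recipient_action_onehot):
    """Extra reward the giver hands a recipient this step.
    In the paper this map is learned; here it is a fixed bilinear toy."""
    return float(obs_giver @ W_giver @ recipient_action_onehot)


def shaped_rewards(env_rewards, obs, actions_onehot, incentive_params, give_cost=0.1):
    """Environment reward + incentives received - a small cost for incentives given."""
    totals = list(env_rewards)
    n = len(env_rewards)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r_ij = incentive(incentive_params[i], obs[i], actions_onehot[j])
            totals[j] += r_ij                    # recipient j receives the incentive
            totals[i] -= give_cost * abs(r_ij)   # giver i pays for handing it out
    return totals


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    obs_dim, n_actions, n_agents = 3, 2, 2
    params = [rng.normal(scale=0.1, size=(obs_dim, n_actions)) for _ in range(n_agents)]
    obs = [rng.normal(size=obs_dim) for _ in range(n_agents)]
    acts = [np.eye(n_actions)[rng.integers(n_actions)] for _ in range(n_agents)]
    print(shaped_rewards([1.0, 0.0], obs, acts, params))
```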