Cooperative and Competitive Biases for Multi-Agent Reinforcement
Learning
- URL: http://arxiv.org/abs/2101.06890v1
- Date: Mon, 18 Jan 2021 05:52:22 GMT
- Title: Cooperative and Competitive Biases for Multi-Agent Reinforcement
Learning
- Authors: Heechang Ryu, Hayong Shin, Jinkyoo Park
- Abstract summary: Training a multi-agent reinforcement learning (MARL) algorithm is more challenging than training a single-agent reinforcement learning algorithm.
We propose an algorithm that boosts MARL training using the biased action information of other agents based on a friend-or-foe concept.
We empirically demonstrate that our algorithm outperforms existing algorithms in various mixed cooperative-competitive environments.
- Score: 12.676356746752893
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training a multi-agent reinforcement learning (MARL) algorithm is more
challenging than training a single-agent reinforcement learning algorithm,
because the result of a multi-agent task strongly depends on the complex
interactions among agents and their interactions with a stochastic and dynamic
environment. We propose an algorithm that boosts MARL training using the biased
action information of other agents based on a friend-or-foe concept. For a
cooperative and competitive environment, there are generally two groups of
agents: cooperative agents and competitive agents. In the proposed algorithm,
each agent updates its value function using its own action and the biased
action information of other agents in the two groups. The biased joint action
of cooperative agents is computed as the sum of their actual joint action and
the imaginary cooperative joint action, by assuming all the cooperative agents
jointly maximize the target agent's value function. The biased joint action of
competitive agents can be computed similarly. Each agent then updates its own
value function using the biased action information, resulting in a biased value
function and corresponding biased policy. Subsequently, the biased policy of
each agent inevitably recommends actions that cooperate and compete with other
agents, thereby introducing more active interactions among agents and enhancing
MARL policy learning. We empirically demonstrate that
our algorithm outperforms existing algorithms in various mixed
cooperative-competitive environments. Furthermore, the introduced biases
gradually decrease as the training proceeds and the correction based on the
imaginary assumption vanishes.
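The friend-or-foe bias described in the abstract can be sketched as follows. This is a simplified, discrete-action illustration, not the authors' implementation: the toy Q-function, the action set, and the blend weight `beta` are all hypothetical, with `beta` standing in for the bias that the paper says gradually decreases during training.

```python
# Minimal sketch (not the authors' code) of the friend-or-foe action bias.
# Cooperative agents are imagined to jointly maximize the target agent's
# value function; competitive agents to jointly minimize it. A decaying
# weight `beta` blends the imaginary joint action with the actual one, so
# the correction vanishes as beta -> 0.

import itertools

ACTIONS = [0, 1, 2]  # toy discrete action set shared by all agents

def q_value(a_self, a_friends, a_foes):
    """Toy stand-in for the target agent's learned value function."""
    return a_self + sum(a_friends) - sum(a_foes)

def imaginary_joint_actions(a_self, n_friends, n_foes):
    """Friend-or-foe assumption: friends jointly maximize the target
    agent's value; foes jointly minimize it."""
    best_friends = max(itertools.product(ACTIONS, repeat=n_friends),
                       key=lambda af: q_value(a_self, af, (0,) * n_foes))
    worst_foes = min(itertools.product(ACTIONS, repeat=n_foes),
                     key=lambda ae: q_value(a_self, best_friends, ae))
    return best_friends, worst_foes

def biased_joint_action(actual, imaginary, beta):
    """Blend the actual and imaginary joint actions; at beta = 0 the
    bias vanishes and the actual joint action is recovered."""
    return tuple((1 - beta) * a + beta * i for a, i in zip(actual, imaginary))

friends_img, foes_img = imaginary_joint_actions(a_self=1, n_friends=2, n_foes=2)
print(biased_joint_action((0, 1), friends_img, beta=0.5))  # -> (1.0, 1.5)
```

Each agent would then evaluate its value function on these biased joint actions instead of the actual ones, which nudges its policy toward more actively cooperative and competitive behavior early in training.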
Related papers
- Situation-Dependent Causal Influence-Based Cooperative Multi-agent
Reinforcement Learning [18.054709749075194]
We propose a novel MARL algorithm named Situation-Dependent Causal Influence-Based Cooperative Multi-agent Reinforcement Learning (SCIC)
Our approach aims to detect inter-agent causal influences in specific situations, based on a criterion that combines causal intervention and conditional mutual information.
The resulting update links coordinated exploration and intrinsic reward distribution, which enhance overall collaboration and performance.
arXiv Detail & Related papers (2023-12-15T05:09:32Z) - DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement
Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with those of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
arXiv Detail & Related papers (2023-12-10T06:03:57Z) - Fact-based Agent modeling for Multi-Agent Reinforcement Learning [6.431977627644292]
A Fact-based Agent modeling (FAM) method is proposed, in which a fact-based belief inference (FBI) network models other agents in a partially observable environment based only on local information.
We evaluate FAM on various Multi-agent Particle Environment (MPE) tasks and compare the results with several state-of-the-art MARL algorithms.
arXiv Detail & Related papers (2023-10-18T19:43:38Z) - ProAgent: Building Proactive Cooperative Agents with Large Language
Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state, and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z) - AgentVerse: Facilitating Multi-Agent Collaboration and Exploring
Emergent Behaviors [93.38830440346783]
We propose a multi-agent framework that can collaboratively adjust its composition as a greater-than-the-sum-of-its-parts system.
Our experiments demonstrate that the framework can effectively deploy multi-agent groups that outperform a single agent.
In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups.
arXiv Detail & Related papers (2023-08-21T16:47:11Z) - Iterated Reasoning with Mutual Information in Cooperative and Byzantine
Decentralized Teaming [0.0]
We show that reformulating an agent's policy to be conditional on the policies of its teammates inherently maximizes a lower bound on Mutual Information (MI) when optimizing under Policy Gradient (PG)
Our approach, InfoPG, outperforms baselines in learning emergent collaborative behaviors and sets the state-of-the-art in decentralized cooperative MARL tasks.
arXiv Detail & Related papers (2022-01-20T22:54:32Z) - Balancing Rational and Other-Regarding Preferences in
Cooperative-Competitive Environments [4.705291741591329]
Mixed environments are notorious for the conflicts of selfish and social interests.
We propose BAROCCO to balance individual and social incentives.
Our meta-algorithm is compatible with both Q-learning and Actor-Critic frameworks.
arXiv Detail & Related papers (2021-02-24T14:35:32Z) - Modeling the Interaction between Agents in Cooperative Multi-Agent
Reinforcement Learning [2.9360071145551068]
We propose a novel cooperative MARL algorithm named interactive actor-critic (IAC)
IAC models the interaction of agents from perspectives of policy and value function.
We extend the value decomposition methods to continuous control tasks and evaluate IAC on benchmark tasks including classic control and multi-agent particle environments.
arXiv Detail & Related papers (2021-02-10T01:58:28Z) - Multi-agent Policy Optimization with Approximatively Synchronous
Advantage Estimation [55.96893934962757]
In a multi-agent system, the policies of different agents need to be evaluated jointly.
In current methods, value functions or advantage functions use counterfactual joint actions that are evaluated asynchronously.
In this work, we propose the approximatively synchronous advantage estimation.
arXiv Detail & Related papers (2020-12-07T07:29:19Z) - UneVEn: Universal Value Exploration for Multi-Agent Reinforcement
Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn)
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z) - On Emergent Communication in Competitive Multi-Agent Teams [116.95067289206919]
We investigate whether competition for performance from an external, similar agent team could act as a social influence.
Our results show that an external competitive influence leads to improved accuracy and generalization, as well as faster emergence of communicative languages.
arXiv Detail & Related papers (2020-03-04T01:14:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site makes no guarantee about the quality of the information it presents and is not responsible for any consequences arising from its use.