Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic
- URL: http://arxiv.org/abs/2002.10525v1
- Date: Mon, 24 Feb 2020 20:30:45 GMT
- Title: Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic
- Authors: Wonseok Jeon, Paul Barde, Derek Nowrouzezahrai, Joelle Pineau
- Abstract summary: Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent approach that applies single-agent AIRL to multi-agent problems.
We propose a multi-agent inverse RL algorithm that is more sample-efficient and scalable than previous works.
- Score: 54.2180984002807
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent
approach that applies single-agent AIRL to multi-agent problems where we seek
to recover both policies for our agents and reward functions that promote
expert-like behavior. While MA-AIRL has promising results on cooperative and
competitive tasks, it is sample-inefficient and has only been validated
empirically for small numbers of agents -- its ability to scale to many agents
remains an open question. We propose a multi-agent inverse RL algorithm that is
more sample-efficient and scalable than previous works. Specifically, we employ
multi-agent actor-attention-critic (MAAC) -- an off-policy multi-agent RL
(MARL) method -- for the RL inner loop of the inverse RL procedure. In doing
so, we are able to increase sample efficiency compared to state-of-the-art
baselines, across both small- and large-scale tasks. Moreover, the RL agents
trained on the rewards recovered by our method better match the experts than
those trained on the rewards derived from the baselines. Finally, our method
requires far fewer agent-environment interactions, particularly as the number
of agents increases.
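As a rough illustration of the pipeline the abstract describes, here is a minimal sketch (not the authors' code): per-agent AIRL-style discriminators supply recovered rewards, while a MAAC-style attention critic serves the off-policy RL inner loop. All network sizes, the toy batch, and the use of PyTorch's stock multi-head attention are illustrative assumptions.

```python
# Minimal sketch of adversarial multi-agent IRL with an attention critic.
# Everything here (dimensions, architectures, stubbed batch) is illustrative,
# not the paper's actual implementation.
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM, HID = 4, 8, 2, 32

class Discriminator(nn.Module):
    """Per-agent AIRL-style discriminator over (observation, action) pairs."""
    def __init__(self):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(OBS_DIM + ACT_DIM, HID),
                               nn.ReLU(), nn.Linear(HID, 1))

    def forward(self, obs, act):
        return self.f(torch.cat([obs, act], dim=-1))  # logit: expert vs. policy

class AttentionCritic(nn.Module):
    """MAAC-style critic: each agent attends over all agents' encoded
    (observation, action) pairs when estimating its Q-value."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(OBS_DIM + ACT_DIM, HID)
        self.attn = nn.MultiheadAttention(HID, num_heads=2, batch_first=True)
        self.q = nn.Linear(2 * HID, 1)

    def forward(self, obs, act):  # obs: (B, N, OBS_DIM), act: (B, N, ACT_DIM)
        e = self.enc(torch.cat([obs, act], dim=-1))  # (B, N, HID)
        ctx, _ = self.attn(e, e, e)                  # attention over agents
        return self.q(torch.cat([e, ctx], dim=-1))   # (B, N, 1) per-agent Q

discriminators = [Discriminator() for _ in range(N_AGENTS)]
critic = AttentionCritic()
policies = [nn.Sequential(nn.Linear(OBS_DIM, HID), nn.ReLU(),
                          nn.Linear(HID, ACT_DIM), nn.Tanh())
            for _ in range(N_AGENTS)]

def recovered_reward(i, obs_i, act_i):
    # AIRL reads a reward signal off the discriminator's logit.
    return discriminators[i](obs_i, act_i).detach()

# One (heavily stubbed) iteration: sample a batch, evaluate the attention
# critic, and relabel the batch with discriminator-derived rewards. Real
# training would alternate discriminator updates with MAAC policy/critic
# updates against replay and expert data.
B = 16
obs = torch.randn(B, N_AGENTS, OBS_DIM)
act = torch.stack([policies[i](obs[:, i]) for i in range(N_AGENTS)], dim=1)
q_values = critic(obs, act)
rewards = torch.cat([recovered_reward(i, obs[:, i], act[:, i])
                     for i in range(N_AGENTS)], dim=-1)
print(q_values.shape, rewards.shape)  # torch.Size([16, 4, 1]) torch.Size([16, 4])
```

Using an off-policy learner here is what drives the sample-efficiency claim: transitions relabeled with discriminator rewards can be reused from a replay buffer rather than regenerated after every reward update.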
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which uses step-wise rewards to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Learning Emergence of Interaction Patterns across Independent RL Agents in Multi-Agent Environments [3.0284592792243794]
Bottom Up Network (BUN) treats the collective of agents as a unified entity.
Our empirical evaluations across a variety of cooperative multi-agent scenarios, including tasks such as cooperative navigation and traffic control, consistently demonstrate BUN's superiority over baseline methods with substantially reduced computational costs.
arXiv Detail & Related papers (2024-10-03T14:25:02Z)
- Selectively Sharing Experiences Improves Multi-Agent Reinforcement Learning [9.25057318925143]
We present a novel multi-agent RL approach in which each agent shares a limited number of the transitions it observes during training with other agents.
We show that our approach outperforms baseline no-sharing decentralized training and state-of-the-art multi-agent RL algorithms (a toy sketch of the sharing mechanism follows this entry).
arXiv Detail & Related papers (2023-11-01T21:35:32Z)
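Since the summary above describes the mechanism only at a high level, here is a toy sketch of what budget-limited experience sharing could look like; the agent class, the sharing budget, and the highest-reward selection rule are assumptions, not the paper's actual criterion.

```python
# Toy sketch of selective experience sharing between decentralized agents.
# The selection rule below is a placeholder assumption.
import random

class Agent:
    def __init__(self, name, share_budget=2):
        self.name = name
        self.buffer = []                  # local replay buffer of transitions
        self.share_budget = share_budget  # max transitions shared per round

    def observe(self, transition):
        self.buffer.append(transition)

    def select_for_sharing(self):
        # Placeholder criterion: share the highest-reward transitions.
        ranked = sorted(self.buffer, key=lambda t: t["reward"], reverse=True)
        return ranked[:self.share_budget]

def sharing_round(agents):
    # Every agent broadcasts its selected transitions to every peer.
    for sender in agents:
        shared = sender.select_for_sharing()
        for receiver in agents:
            if receiver is not sender:
                receiver.buffer.extend(shared)

agents = [Agent(f"agent{i}") for i in range(3)]
for agent in agents:
    for _ in range(5):
        agent.observe({"obs": random.random(), "reward": random.random()})
sharing_round(agents)
print([len(a.buffer) for a in agents])  # each buffer grew by the peers' shares
```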
- Deep Multi-Agent Reinforcement Learning for Decentralized Active Hypothesis Testing [11.639503711252663]
We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning.
We present a comprehensive set of experimental results showing that the agents learn collaborative strategies and improve performance.
arXiv Detail & Related papers (2023-09-14T01:18:04Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
Because such datasets can contain trajectories in which some agents behave randomly, an agent learned by offline MARL often inherits this random policy, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
- RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning [90.43925357575543]
We propose ranked policy memory (RPM) to collect diverse multi-agent trajectories for training MARL policies with good generalizability.
RPM enables MARL agents to interact with unseen agents in multi-agent generalization evaluation scenarios and complete the given tasks, and it boosts performance by up to 402% on average (a toy sketch of such a memory follows this entry).
arXiv Detail & Related papers (2022-10-18T07:32:43Z)
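The summary suggests a memory of policies keyed by rank; the sketch below shows one plausible reading. Binning checkpoints by score and sampling partners uniformly over bins are assumptions for illustration, not the paper's exact procedure.

```python
# Toy sketch of a ranked policy memory: store policy checkpoints keyed by a
# training score, then sample diverse partners so agents practice against
# behaviors of varying quality. The binning scheme is an assumption.
import random

class RankedPolicyMemory:
    def __init__(self, bin_width=0.1):
        self.bins = {}              # score bin -> list of policy checkpoints
        self.bin_width = bin_width

    def save(self, policy, score):
        key = int(score / self.bin_width)
        self.bins.setdefault(key, []).append(policy)

    def sample_partner(self):
        # Pick a rank bin uniformly, then a policy within it, so low- and
        # high-performing behaviors are both represented.
        key = random.choice(list(self.bins))
        return random.choice(self.bins[key])

memory = RankedPolicyMemory()
for step in range(10):
    memory.save(policy=f"checkpoint-{step}", score=random.random())
print(memory.sample_partner())
```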
- Learning Cooperative Multi-Agent Policies with Partial Reward Decoupling [13.915157044948364]
One of the preeminent obstacles to scaling multi-agent reinforcement learning is assigning credit to individual agents' actions.
In this paper, we address this credit assignment problem with an approach that we call partial reward decoupling (PRD).
PRD decomposes large cooperative multi-agent RL problems into decoupled subproblems involving subsets of agents, thereby simplifying credit assignment (a toy sketch of the decomposition follows this entry).
arXiv Detail & Related papers (2021-12-23T17:48:04Z)
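As a toy illustration of the decoupling idea in the summary, the sketch below partitions agents into subsets and assigns credit within each subset only; the fixed-size grouping and equal-split credit rule are placeholder assumptions rather than the paper's actual decomposition.

```python
# Toy sketch of decoupling a large cooperative problem into per-subset
# credit-assignment subproblems. Grouping and credit rules are assumptions.
from typing import Dict, List

def decouple(agents: List[str], group_size: int) -> List[List[str]]:
    """Partition the agents into subsets treated as independent subproblems."""
    return [agents[i:i + group_size] for i in range(0, len(agents), group_size)]

def assign_credit(group: List[str], group_reward: float) -> Dict[str, float]:
    """Credit assignment now only spans a small subset, not all agents."""
    return {agent: group_reward / len(group) for agent in group}

agents = [f"agent{i}" for i in range(6)]
for group in decouple(agents, group_size=2):
    print(assign_credit(group, group_reward=1.0))
```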
- Plan Better Amid Conservatism: Offline Multi-Agent Reinforcement Learning with Actor Rectification [74.10976684469435]
In principle, offline reinforcement learning (RL) algorithms can be transferred to multi-agent settings directly; in practice, this raises a critical challenge.
We propose a simple yet effective method, Offline Multi-Agent RL with Actor Rectification (OMAR), to tackle this critical challenge.
OMAR significantly outperforms strong baselines, achieving state-of-the-art performance on multi-agent continuous control benchmarks.
arXiv Detail & Related papers (2021-11-22T13:27:42Z)
- Learning to Incentivize Other Learning Agents [73.03133692589532]
We show how to equip RL agents with the ability to give rewards directly to other agents, using a learned incentive function.
Such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games.
Our work points toward more opportunities and challenges along the path to ensuring the common good in a multi-agent future (a toy sketch of the incentive mechanism follows this entry).
arXiv Detail & Related papers (2020-06-10T20:12:38Z)
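To make the incentive idea concrete, here is a toy sketch in which each agent owns a small network that maps its observation to non-negative reward bonuses paid to its peers. The architecture and the use of observations alone as input are assumptions; the paper additionally describes how the incentive function itself is trained, which this sketch omits.

```python
# Toy sketch of learned incentives: each agent's incentive network outputs
# bonus rewards for its peers. Shapes and inputs are illustrative assumptions.
import torch
import torch.nn as nn

OBS_DIM, N_AGENTS = 4, 3

class IncentiveFunction(nn.Module):
    """Maps the giver's observation to a bonus reward for each peer."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 16), nn.ReLU(),
                                 nn.Linear(16, N_AGENTS - 1))

    def forward(self, obs):
        return torch.relu(self.net(obs))  # non-negative bonuses for peers

incentives = [IncentiveFunction() for _ in range(N_AGENTS)]
obs = torch.randn(N_AGENTS, OBS_DIM)
env_rewards = torch.zeros(N_AGENTS)

# Each agent's total reward = environment reward + bonuses received.
total = env_rewards.clone()
for i in range(N_AGENTS):
    bonus = incentives[i](obs[i])         # bonuses agent i pays out
    peers = [j for j in range(N_AGENTS) if j != i]
    for k, j in enumerate(peers):
        total[j] = total[j] + bonus[k]
print(total)
```

In a full method, the incentive networks themselves would also be trained, for example through their downstream effect on the recipients' learning and returns.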