MAVIPER: Learning Decision Tree Policies for Interpretable Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2205.12449v1
- Date: Wed, 25 May 2022 02:38:10 GMT
- Title: MAVIPER: Learning Decision Tree Policies for Interpretable Multi-Agent
Reinforcement Learning
- Authors: Stephanie Milani and Zhicheng Zhang and Nicholay Topin and Zheyuan
Ryan Shi and Charles Kamhoua and Evangelos E. Papalexakis and Fei Fang
- Abstract summary: We propose the first set of interpretable MARL algorithms that extract decision-tree policies from neural networks trained with MARL.
The first algorithm, IVIPER, extends VIPER, a recent method for single-agent interpretable RL, to the multi-agent setting.
To better capture coordination between agents, we propose a novel centralized decision-tree training algorithm, MAVIPER.
- Score: 38.77840067555711
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many recent breakthroughs in multi-agent reinforcement learning (MARL)
require the use of deep neural networks, which are challenging for human
experts to interpret and understand. On the other hand, existing work on
interpretable RL has shown promise in extracting more interpretable decision
tree-based policies, but only in the single-agent setting. To fill this gap, we
propose the first set of interpretable MARL algorithms that extract
decision-tree policies from neural networks trained with MARL. The first
algorithm, IVIPER, extends VIPER, a recent method for single-agent
interpretable RL, to the multi-agent setting. We demonstrate that IVIPER can
learn high-quality decision-tree policies for each agent. To better capture
coordination between agents, we propose a novel centralized decision-tree
training algorithm, MAVIPER. MAVIPER jointly grows the trees of each agent by
predicting the behavior of the other agents using their anticipated trees, and
uses resampling to focus on states that are critical for its interactions with
other agents. We show that both algorithms generally outperform the baselines
and that MAVIPER-trained agents achieve better-coordinated performance than
IVIPER-trained agents on three different multi-agent particle-world
environments.
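The tree-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a VIPER-style recipe (label states with the expert's greedy action, resample states in proportion to the expert's Q-value gap, fit a tree) and applies it to each agent independently, roughly in the spirit of IVIPER. The toy Q-functions, the depth-1 tree, and the names `EXPERT_Q`, `fit_stump`, and `distill` are all hypothetical. MAVIPER's joint tree growth is not shown.

```python
import random

# Hypothetical expert Q-functions for two agents on a 1-D state in [0, 1].
# Each returns [Q(s, action 0), Q(s, action 1)]. These are toy stand-ins,
# not the MARL-trained neural networks described in the abstract.
EXPERT_Q = {
    "agent_0": lambda s: [0.5 - s, s - 0.5],  # prefers action 1 when s > 0.5
    "agent_1": lambda s: [s - 0.5, 0.5 - s],  # prefers action 1 when s < 0.5
}


def fit_stump(samples):
    """Fit a depth-1 decision tree to (state, action) pairs by exhaustive search."""
    best = None
    for thr, _ in samples:
        for left, right in ((0, 1), (1, 0)):
            # Count samples the candidate split would misclassify.
            errors = sum((left if s <= thr else right) != a for s, a in samples)
            if best is None or errors < best[0]:
                best = (errors, thr, left, right)
    _, thr, left, right = best
    return lambda s: left if s <= thr else right


def distill(expert_q, n_states=200, seed=0):
    """Distill one agent's expert into a stump using Q-gap resampling."""
    rng = random.Random(seed)
    # Assumption: uniform random states stand in for environment rollouts.
    states = [rng.random() for _ in range(n_states)]
    labelled, weights = [], []
    for s in states:
        q = expert_q(s)
        labelled.append((s, q.index(max(q))))  # expert's greedy action
        weights.append(max(q) - min(q))        # how costly a mistake is here
    # Resample so that states where the action choice matters most dominate.
    resampled = rng.choices(labelled, weights=weights, k=n_states)
    return fit_stump(resampled)


# One independently trained tree per agent.
trees = {name: distill(q) for name, q in EXPERT_Q.items()}
```

Each entry of `trees` maps a state to an action, for example `trees["agent_0"](0.9)`. A fuller version would alternate rollouts of the current trees with refitting and would use deeper trees over multi-dimensional observations.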
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Ensembling Prioritized Hybrid Policies for Multi-agent Pathfinding [18.06081009550052]
Multi-Agent Reinforcement Learning (MARL) based Multi-Agent Path Finding (MAPF) has recently gained attention due to its efficiency and scalability.
Several MARL-MAPF methods choose to use communication to enrich the information one agent can perceive.
We propose a new method, Ensembling Prioritized Hybrid Policies (EPH).
arXiv Detail & Related papers (2024-03-12T11:47:12Z)
- Decentralized Monte Carlo Tree Search for Partially Observable
Multi-agent Pathfinding [49.730902939565986]
The Multi-Agent Pathfinding (MAPF) problem involves finding a set of conflict-free paths for a group of agents confined to a graph.
In this study, we focus on the decentralized MAPF setting, where the agents may observe the other agents only locally.
We propose a decentralized multi-agent Monte Carlo Tree Search (MCTS) method for MAPF tasks.
arXiv Detail & Related papers (2023-12-26T06:57:22Z)
- Deep Multi-Agent Reinforcement Learning for Decentralized Active
Hypothesis Testing [11.639503711252663]
We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning.
We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
arXiv Detail & Related papers (2023-09-14T01:18:04Z)
- Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperform prior works that use single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z)
- Recursive Reasoning Graph for Multi-Agent Reinforcement Learning [44.890087638530524]
Multi-agent reinforcement learning (MARL) provides an efficient way for simultaneously learning policies for multiple agents interacting with each other.
Existing algorithms can suffer from an inability to accurately anticipate the influence of self-actions on other agents.
The proposed algorithm, referred to as the Recursive Reasoning Graph (R2G), shows state-of-the-art performance on multiple multi-agent particle and robotics games.
arXiv Detail & Related papers (2022-03-06T00:57:50Z)
- SA-MATD3:Self-attention-based multi-agent continuous control method in
cooperative environments [12.959163198988536]
Existing algorithms suffer from uneven degrees of learning across agents as the number of agents increases.
A new structure for a multi-agent actor critic is proposed, and the self-attention mechanism is applied in the critic network.
The proposed algorithm makes full use of the samples in the replay memory buffer to learn the behavior of a class of agents.
arXiv Detail & Related papers (2021-07-01T08:15:05Z)
- Agent-Centric Representations for Multi-Agent Reinforcement Learning [12.577354830985012]
We investigate whether object-centric representations are also beneficial in the fully cooperative multi-agent reinforcement learning setting.
Specifically, we study two ways of incorporating an agent-centric inductive bias into our RL algorithm.
We evaluate these approaches on the Google Research Football environment as well as DeepMind Lab 2D.
arXiv Detail & Related papers (2021-04-19T15:43:40Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability in large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
- Scalable Multi-Agent Inverse Reinforcement Learning via
Actor-Attention-Critic [54.2180984002807]
Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent approach that applies single-agent AIRL to multi-agent problems.
We propose a multi-agent inverse RL algorithm that is more sample-efficient and scalable than previous works.
arXiv Detail & Related papers (2020-02-24T20:30:45Z)