Deep Multi-Agent Reinforcement Learning for Decentralized Active
Hypothesis Testing
- URL: http://arxiv.org/abs/2309.08477v1
- Date: Thu, 14 Sep 2023 01:18:04 GMT
- Title: Deep Multi-Agent Reinforcement Learning for Decentralized Active
Hypothesis Testing
- Authors: Hadar Szostak and Kobi Cohen
- Abstract summary: We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning.
We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider a decentralized formulation of the active hypothesis testing
(AHT) problem, where multiple agents gather noisy observations from the
environment with the purpose of identifying the correct hypothesis. At each
time step, agents have the option to select a sampling action. These different
actions result in observations drawn from various distributions, each
associated with a specific hypothesis. The agents collaborate to accomplish the
task, where message exchanges between agents are allowed over a rate-limited
communications channel. The objective is to devise a multi-agent policy that
minimizes the Bayes risk. This risk comprises both the cost of sampling and the
joint terminal cost incurred by the agents upon making a hypothesis
declaration. Deriving optimal structured policies for AHT problems is generally
mathematically intractable, even in the context of a single agent. As a result,
recent efforts have turned to deep learning methodologies to address these
problems, which have exhibited significant success in single-agent learning
scenarios. In this paper, we tackle the multi-agent AHT formulation by
introducing a novel algorithm rooted in the framework of deep multi-agent
reinforcement learning. This algorithm, named Multi-Agent Reinforcement
Learning for AHT (MARLA), operates at each time step by having each agent map
its state to an action (sampling rule or stopping rule) using a trained deep
neural network with the goal of minimizing the Bayes risk. We present a
comprehensive set of experimental results that effectively showcase the agents'
ability to learn collaborative strategies and enhance performance using MARLA.
Furthermore, we demonstrate the superiority of MARLA over single-agent learning
approaches. Finally, we provide an open-source implementation of the MARLA
framework, for the benefit of researchers and developers in related domains.
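As a rough illustration of the problem the abstract describes (not the authors' MARLA implementation, which uses a trained deep neural network rather than a fixed rule), the sketch below runs a single agent's sample-until-confident loop for binary hypothesis testing: a Bayesian belief update after each noisy Bernoulli observation, a threshold-based stopping rule in place of the learned one, and a realized cost combining per-sample cost with a terminal misclassification cost, mirroring the Bayes risk. All names, thresholds, and likelihood values are hypothetical.

```python
import random


def bayes_update(prior, obs, likelihoods):
    # Posterior over hypotheses after one Bernoulli observation:
    # likelihoods[h] = P(obs = 1 | hypothesis h).
    post = [p * (lik if obs == 1 else 1 - lik)
            for p, lik in zip(prior, likelihoods)]
    z = sum(post)
    return [p / z for p in post]


def run_agent(true_h, likelihoods, threshold=0.95, sample_cost=0.01,
              max_steps=200, rng=None):
    """Sample until one hypothesis is sufficiently likely, then declare.

    A hand-coded stand-in for the learned sampling/stopping policy:
    the stopping rule is a simple posterior threshold.
    """
    rng = rng or random.Random(0)
    n = len(likelihoods)
    belief = [1.0 / n] * n          # uniform prior over hypotheses
    cost = 0.0
    for _ in range(max_steps):
        if max(belief) >= threshold:    # stopping rule fires
            break
        # Sampling action: draw a noisy observation under the true hypothesis.
        obs = 1 if rng.random() < likelihoods[true_h] else 0
        belief = bayes_update(belief, obs, likelihoods)
        cost += sample_cost             # accumulate sampling cost
    declared = belief.index(max(belief))
    terminal = 0.0 if declared == true_h else 1.0   # terminal declaration cost
    return declared, cost + terminal    # realized (sample + terminal) cost


declared, risk = run_agent(true_h=0, likelihoods=[0.8, 0.3],
                           rng=random.Random(7))
```

In the multi-agent setting of the paper, each agent would additionally fold rate-limited messages from its peers into its state before choosing an action, and the declaration cost is incurred jointly; neither aspect is modeled in this single-agent toy.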
Related papers
- From Novice to Expert: LLM Agent Policy Optimization via Step-wise Reinforcement Learning [62.54484062185869]
We introduce StepAgent, which utilizes step-wise reward to optimize the agent's reinforcement learning process.
We propose implicit-reward and inverse reinforcement learning techniques to facilitate agent reflection and policy adjustment.
arXiv Detail & Related papers (2024-11-06T10:35:11Z)
- Cooperative Reward Shaping for Multi-Agent Pathfinding [4.244426154524592]
The primary objective of Multi-Agent Pathfinding (MAPF) is to plan efficient and conflict-free paths for all agents.
Traditional multi-agent path planning algorithms struggle to achieve efficient distributed path planning for multiple agents.
This letter introduces a unique reward shaping technique based on Independent Q-Learning (IQL).
arXiv Detail & Related papers (2024-07-15T02:44:41Z)
- SACHA: Soft Actor-Critic with Heuristic-Based Attention for Partially Observable Multi-Agent Path Finding [3.4260993997836753]
We propose a novel multi-agent actor-critic method called Soft Actor-Critic with Heuristic-Based Attention (SACHA).
SACHA learns a neural network for each agent to selectively pay attention to the shortest path guidance from multiple agents within its field of view.
We demonstrate decent improvements over several state-of-the-art learning-based MAPF methods with respect to success rate and solution quality.
arXiv Detail & Related papers (2023-07-05T23:36:33Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Graph Exploration for Effective Multi-agent Q-Learning [46.723361065955544]
This paper proposes an exploration technique for multi-agent reinforcement learning (MARL) with graph-based communication among agents.
We assume the individual rewards received by the agents are independent of the actions by the other agents, while their policies are coupled.
In the proposed framework, neighbouring agents collaborate to estimate the uncertainty about the state-action space in order to execute more efficient explorative behaviour.
arXiv Detail & Related papers (2023-04-19T10:28:28Z)
- Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture the agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperforms prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z)
- Learning Cooperative Multi-Agent Policies with Partial Reward Decoupling [13.915157044948364]
One of the preeminent obstacles to scaling multi-agent reinforcement learning is assigning credit to individual agents' actions.
In this paper, we address this credit assignment problem with an approach that we call partial reward decoupling (PRD).
PRD decomposes large cooperative multi-agent RL problems into decoupled subproblems involving subsets of agents, thereby simplifying credit assignment.
arXiv Detail & Related papers (2021-12-23T17:48:04Z)
- Trust Region Policy Optimisation in Multi-Agent Reinforcement Learning [25.027143431992755]
Trust region methods rigorously enabled reinforcement learning (RL) agents to learn monotonically improving policies, leading to superior performance on a variety of tasks.
Unfortunately, when it comes to multi-agent reinforcement learning (MARL), the property of monotonic improvement may not simply apply.
In this paper, we extend the theory of trust region learning to MARL. Central to our findings are the multi-agent advantage decomposition lemma and the sequential policy update scheme.
Based on these, we develop Heterogeneous-Agent Trust Region Policy Optimisation (HATRPO) and Heterogeneous-Agent Proximal Policy Optimisation (HAPPO).
arXiv Detail & Related papers (2021-09-23T09:44:35Z)
- Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot [71.28884625011987]
Melting Pot is a MARL evaluation suite that uses reinforcement learning to reduce the human labor required to create novel test scenarios.
We have created over 80 unique test scenarios covering a broad range of research topics.
We apply these test scenarios to standard MARL training algorithms, and demonstrate how Melting Pot reveals weaknesses not apparent from training performance alone.
arXiv Detail & Related papers (2021-07-14T17:22:14Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
- Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic [54.2180984002807]
Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent approach that applies single-agent AIRL to multi-agent problems.
We propose a multi-agent inverse RL algorithm that is more sample-efficient and scalable than previous works.
arXiv Detail & Related papers (2020-02-24T20:30:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.