Information State Embedding in Partially Observable Cooperative
Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2004.01098v3
- Date: Mon, 17 Aug 2020 03:55:46 GMT
- Title: Information State Embedding in Partially Observable Cooperative
Multi-Agent Reinforcement Learning
- Authors: Weichao Mao, Kaiqing Zhang, Erik Miehling, Tamer Başar
- Abstract summary: We introduce the concept of an information state embedding that serves to compress agents' histories.
We quantify how the compression error influences the resulting value functions for decentralized control.
The proposed embed-then-learn pipeline opens the black-box of existing (partially observable) MARL algorithms.
- Score: 19.617644643147948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-agent reinforcement learning (MARL) under partial observability has
long been considered challenging, primarily due to the requirement for each
agent to maintain a belief over all other agents' local histories -- a domain
that generally grows exponentially over time. In this work, we investigate a
partially observable MARL problem in which agents are cooperative. To enable
the development of tractable algorithms, we introduce the concept of an
information state embedding that serves to compress agents' histories. We
quantify how the compression error influences the resulting value functions for
decentralized control. Furthermore, we propose an instance of the embedding
based on recurrent neural networks (RNNs). The embedding is then used as an
approximate information state, and can be fed into any MARL algorithm. The
proposed embed-then-learn pipeline opens the black-box of existing (partially
observable) MARL algorithms, allowing us to establish some theoretical
guarantees (error bounds of value functions) while still achieving competitive
performance with many end-to-end approaches.
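To make the embed-then-learn pipeline concrete, below is a minimal sketch of an RNN-based information state embedding for a single agent. The module name, layer sizes, and the auxiliary prediction heads are illustrative assumptions rather than the authors' exact architecture:

```python
import torch
import torch.nn as nn

class InfoStateEmbedding(nn.Module):
    """Hypothetical sketch: compress an agent's observation-action history
    into a fixed-size embedding used as an approximate information state."""

    def __init__(self, obs_dim, act_dim, embed_dim=64):
        super().__init__()
        # The GRU hidden state summarizes the growing history, replacing a
        # belief over other agents' local histories.
        self.rnn = nn.GRU(obs_dim + act_dim, embed_dim, batch_first=True)
        # Assumed auxiliary heads: a useful information state should suffice
        # to predict the next observation and the reward.
        self.next_obs_head = nn.Linear(embed_dim, obs_dim)
        self.reward_head = nn.Linear(embed_dim, 1)

    def forward(self, obs_seq, act_seq, hidden=None):
        # obs_seq: (batch, T, obs_dim); act_seq: (batch, T, act_dim)
        x = torch.cat([obs_seq, act_seq], dim=-1)
        embeddings, hidden = self.rnn(x, hidden)  # (batch, T, embed_dim)
        return embeddings, hidden

    def aux_loss(self, embeddings, next_obs, rewards):
        # Prediction error is a proxy for the compression error that the
        # paper's value-function bounds depend on.
        obs_err = (self.next_obs_head(embeddings) - next_obs).pow(2).mean()
        rew_err = (self.reward_head(embeddings) - rewards).pow(2).mean()
        return obs_err + rew_err
```

The per-step embedding then stands in for the raw history and can be fed to any MARL learner, which is what decouples the compression-error analysis from the choice of downstream algorithm.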
Related papers
- Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement
Learning [53.00683059396803]
Masked image modeling (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
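The summary above describes an RL-driven search over masking choices. As a toy stand-in (not the paper's agent), an epsilon-greedy bandit over a discrete set of mask ratios, rewarded by some measure of pretraining progress, captures the basic loop:

```python
import random

RATIOS = [0.25, 0.5, 0.75]     # assumed discretization of candidate mask ratios
q = {r: 0.0 for r in RATIOS}   # running value estimate per ratio
n = {r: 0 for r in RATIOS}     # visit counts
EPS = 0.1

def select_ratio():
    # Epsilon-greedy: mostly exploit the best-looking ratio, sometimes explore.
    if random.random() < EPS:
        return random.choice(RATIOS)
    return max(RATIOS, key=lambda r: q[r])

def update(ratio, reward):
    # Incremental mean update of the action value.
    n[ratio] += 1
    q[ratio] += (reward - q[ratio]) / n[ratio]

# Dummy loop: in practice the reward would come from one MIM pretraining step,
# e.g. the drop in reconstruction loss at the chosen ratio.
for _ in range(100):
    r = select_ratio()
    update(r, random.gauss(-abs(r - 0.6), 0.1))  # fake reward, best at 0.5
print(max(RATIOS, key=lambda r: q[r]))
```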
arXiv Detail & Related papers (2023-10-06T10:40:46Z)
- Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs [49.71319907864573]
In this paper, we propose a multi-agent skill discovery method that enables straightforward decomposition of the joint state space.
Our key idea is to approximate the joint state space as a Kronecker graph, based on which we can directly estimate its Fiedler vector.
Considering that directly computing the Laplacian spectrum is intractable for tasks with infinite-scale state spaces, we further propose a deep learning extension of our method.
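To see why a product structure makes the Fiedler vector cheap, the snippet below uses the exact identity for a Cartesian graph product (Laplacian eigenvalues add across factors) as a simplified stand-in for the paper's Kronecker-graph approximation: the joint Fiedler vector is a Kronecker product of one factor's Fiedler vector with the other factor's constant vector, so the full joint Laplacian never needs to be decomposed.

```python
import numpy as np

def path_laplacian(n):
    """Graph Laplacian of a path on n nodes (toy local state space)."""
    A = np.diag(np.ones(n - 1), 1) + np.diag(np.ones(n - 1), -1)
    return np.diag(A.sum(axis=1)) - A

def fiedler(L):
    """Second-smallest eigenpair of a small Laplacian."""
    vals, vecs = np.linalg.eigh(L)
    return vals[1], vecs[:, 1]

# Two factor graphs standing in for two agents' local state spaces.
L1, L2 = path_laplacian(4), path_laplacian(5)
(lam1, u1), (lam2, u2) = fiedler(L1), fiedler(L2)

# For the Cartesian product, L = kron(L1, I) + kron(I, L2), so the joint
# Fiedler vector comes from whichever factor has the smaller Fiedler value.
if lam1 <= lam2:
    joint_fiedler = np.kron(u1, np.ones(5) / np.sqrt(5))
else:
    joint_fiedler = np.kron(np.ones(4) / 2.0, u2)

# Sanity check against direct decomposition of the full 20x20 Laplacian.
L_joint = np.kron(L1, np.eye(5)) + np.kron(np.eye(4), L2)
_, v_direct = fiedler(L_joint)
print(abs(np.dot(joint_fiedler, v_direct)))  # ~1.0 (equal up to sign)
```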
arXiv Detail & Related papers (2023-07-21T14:53:12Z)
- An Algorithm For Adversary Aware Decentralized Networked MARL [0.0]
We identify vulnerabilities in the consensus updates of existing MARL algorithms.
We provide an algorithm that allows non-adversarial agents to reach a consensus in the presence of adversaries.
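The paper's specific update is not reproduced here; as a generic illustration of adversary-aware consensus, the sketch below uses a coordinate-wise trimmed mean, a standard way to keep an honest agent's average bounded when up to f of its neighbors report arbitrary values:

```python
import numpy as np

def trimmed_mean_consensus(own_value, neighbor_values, f):
    """One resilient consensus step for a single agent.

    own_value: this agent's current parameter vector, shape (d,).
    neighbor_values: (m, d) values received from m neighbors, of which
        up to f may be adversarial.
    f: number of extreme values trimmed from each side, per coordinate.
    """
    stacked = np.vstack([own_value, neighbor_values])
    sorted_vals = np.sort(stacked, axis=0)        # coordinate-wise sort
    kept = sorted_vals[f : stacked.shape[0] - f]  # drop f lowest/highest
    return kept.mean(axis=0)

# Example: four honest neighbors near [1, 1], one adversary sending huge values.
own = np.array([1.0, 1.0])
nbrs = np.array([[0.9, 1.1], [1.1, 0.9], [1.0, 1.0], [0.95, 1.05], [1e6, -1e6]])
print(trimmed_mean_consensus(own, nbrs, f=1))  # stays near [1, 1]
```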
arXiv Detail & Related papers (2023-05-09T16:02:31Z)
- PAC: Assisted Value Factorisation with Counterfactual Predictions in
Multi-Agent Reinforcement Learning [43.862956745961654]
Multi-agent reinforcement learning (MARL) has witnessed significant progress with the development of value function factorization methods.
In this paper, we show that in partially observable MARL problems, an agent's ordering over its own actions could impose concurrent constraints on the representable function class.
We propose PAC, a new framework leveraging information generated from Counterfactual Predictions of optimal joint action selection.
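For context on the factorization being assisted, here is a minimal VDN-style value decomposition, in which the joint action value is a sum of per-agent utilities so each agent can act greedily on its own head. This is the generic baseline such methods build on, not PAC itself:

```python
import torch
import torch.nn as nn

class PerAgentUtility(nn.Module):
    """One agent's utility network Q_i(o_i, a_i)."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs):
        return self.net(obs)

# Three agents; the joint value is the sum of individually maximized utilities,
# so decentralized greedy actions maximize the joint value by construction.
agents = [PerAgentUtility(obs_dim=10, n_actions=4) for _ in range(3)]
observations = [torch.randn(1, 10) for _ in range(3)]
per_agent_q = [agent(o) for agent, o in zip(agents, observations)]
q_tot = torch.stack([q.max(dim=-1).values for q in per_agent_q]).sum()
```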
arXiv Detail & Related papers (2022-06-22T23:34:30Z)
- RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in
Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios.
RACA takes advantage of a graph-based relation encoder to capture the topological structure between agents.
Our method outperforms baseline methods on the StarCraft II micromanagement benchmark and in ad-hoc cooperation scenarios.
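A minimal form of such an encoder is scaled dot-product attention over agent features: the attention weights act as a soft adjacency over agents, and the output is permutation-invariant and independent of team size, which is what ad-hoc cooperation needs. This is a hypothetical stand-in, not RACA's exact network:

```python
import torch
import torch.nn as nn

class RelationEncoder(nn.Module):
    """Toy relation-aware encoder over a variable number of agents."""

    def __init__(self, feat_dim, d=32):
        super().__init__()
        self.q = nn.Linear(feat_dim, d)
        self.k = nn.Linear(feat_dim, d)
        self.v = nn.Linear(feat_dim, d)

    def forward(self, agents):  # agents: (n_agents, feat_dim)
        scores = self.q(agents) @ self.k(agents).T / (self.k.out_features ** 0.5)
        relations = torch.softmax(scores, dim=-1)  # soft adjacency over agents
        return relations @ self.v(agents)          # relation-weighted features

enc = RelationEncoder(feat_dim=16)
print(enc(torch.randn(5, 16)).shape)  # the same weights handle 7 agents too
```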
arXiv Detail & Related papers (2022-06-02T03:39:27Z)
- On the Use and Misuse of Absorbing States in Multi-agent Reinforcement
Learning [55.95253619768565]
Current MARL algorithms assume that the number of agents within a group remains fixed throughout an experiment.
In many practical problems, an agent may terminate before its teammates.
We present a novel architecture for an existing state-of-the-art MARL algorithm that uses attention in place of a fully connected layer with absorbing states.
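One way to read the architectural point: rather than padding terminated agents with absorbing-state placeholders and feeding a fixed-size fully connected layer, attention can simply mask them out. A minimal sketch using PyTorch's built-in attention, with assumed sizes (not the paper's exact network):

```python
import torch
import torch.nn as nn

d = 32  # per-agent feature size (assumed)
attn = nn.MultiheadAttention(embed_dim=d, num_heads=4, batch_first=True)

# One team of five agents; agents 3 and 4 have terminated mid-episode.
agents = torch.randn(1, 5, d)
terminated = torch.tensor([[False, False, False, True, True]])

# Masked positions are ignored as keys, so terminated agents contribute
# nothing -- no absorbing-state values leak into the value estimate.
pooled, _ = attn(agents, agents, agents, key_padding_mask=terminated)
team_embedding = pooled[~terminated].mean(dim=0)  # fixed-size critic input
```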
arXiv Detail & Related papers (2021-11-10T23:45:08Z)
- Cooperative Multi-Agent Reinforcement Learning Based Distributed Dynamic
Spectrum Access in Cognitive Radio Networks [46.723006378363785]
Dynamic spectrum access (DSA) is a promising paradigm to remedy the problem of inefficient spectrum utilization.
In this paper, we investigate the distributed DSA problem for multiple users in a typical cognitive radio network.
We employ the deep recurrent Q-network (DRQN) to address the partial observability of the state for each cognitive user.
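A DRQN replaces the first dense layer of a DQN with a recurrent cell, so the hidden state aggregates the observation history into a belief over the unobserved channel state. A minimal per-step interface (sizes assumed):

```python
import torch
import torch.nn as nn

class DRQN(nn.Module):
    """Minimal deep recurrent Q-network: a GRU cell carries the history
    summary, and a linear head maps it to Q-values over access actions."""

    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.gru = nn.GRUCell(obs_dim, hidden)
        self.q_head = nn.Linear(hidden, n_actions)

    def step(self, obs, h):
        # obs: (batch, obs_dim); h: (batch, hidden) recurrent state.
        h = self.gru(obs, h)
        return self.q_head(h), h

net = DRQN(obs_dim=8, n_actions=4)
h = torch.zeros(1, 64)                 # reset at the start of each episode
q, h = net.step(torch.randn(1, 8), h)  # act greedily via q.argmax(dim=-1)
```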
arXiv Detail & Related papers (2021-06-17T06:52:21Z)
- Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Diffusion Multi-Agent MAML (Dif-MAML).
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
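Diffusion strategies follow an adapt-then-combine pattern: each agent takes a local gradient step, then averages with its neighbors through doubly-stochastic combination weights. The toy quadratic below shows that mechanism converging near the aggregate minimizer; it illustrates the diffusion step only, not Dif-MAML's full inner/outer meta-update:

```python
import numpy as np

# Combination matrix for a fully connected 3-agent network (doubly stochastic).
C = np.array([[0.50, 0.25, 0.25],
              [0.25, 0.50, 0.25],
              [0.25, 0.25, 0.50]])

targets = np.array([1.0, 2.0, 3.0])  # each agent's local quadratic minimizer
w = np.zeros(3)                      # one scalar parameter per agent
lr = 0.1

for _ in range(200):
    grad = w - targets               # gradient of 0.5 * (w_k - target_k)^2
    psi = w - lr * grad              # adapt: local gradient step
    w = C @ psi                      # combine: average over neighbors

print(w)  # all agents close to 2.0, the minimizer of the aggregate objective
```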
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
- R-MADDPG for Partially Observable Environments and Limited Communication [42.771013165298186]
This paper introduces a deep recurrent multiagent actor-critic framework (R-MADDPG) for handling multiagent coordination under partially observable settings and limited communication.
We demonstrate that the resulting framework learns time dependencies for sharing missing observations, handling resource limitations, and developing different communication patterns among agents.
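A sketch of the recurrent-plus-communication interface such a framework implies: a GRU-based actor consumes its own observation together with teammates' last messages and emits both an action and a new message. This is an illustration of the setting, not R-MADDPG's exact architecture:

```python
import torch
import torch.nn as nn

class RecurrentCommActor(nn.Module):
    """Toy recurrent actor with a message channel (hypothetical sketch)."""

    def __init__(self, obs_dim, msg_dim, act_dim, hidden=64):
        super().__init__()
        self.gru = nn.GRUCell(obs_dim + msg_dim, hidden)
        self.action_head = nn.Linear(hidden, act_dim)
        self.msg_head = nn.Linear(hidden, msg_dim)

    def step(self, obs, inbox, h):
        # inbox: aggregated teammate messages; zeros when bandwidth is
        # unavailable, so the recurrent state must fill in what is missing.
        h = self.gru(torch.cat([obs, inbox], dim=-1), h)
        return torch.tanh(self.action_head(h)), torch.tanh(self.msg_head(h)), h

actor = RecurrentCommActor(obs_dim=8, msg_dim=4, act_dim=2)
h = torch.zeros(1, 64)
act, msg, h = actor.step(torch.randn(1, 8), torch.zeros(1, 4), h)
```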
arXiv Detail & Related papers (2020-02-16T21:25:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.