Provably Efficient Multi-Agent Reinforcement Learning with Fully
Decentralized Communication
- URL: http://arxiv.org/abs/2110.07392v1
- Date: Thu, 14 Oct 2021 14:27:27 GMT
- Title: Provably Efficient Multi-Agent Reinforcement Learning with Fully
Decentralized Communication
- Authors: Justin Lidard, Udari Madhushani, Naomi Ehrich Leonard
- Abstract summary: Distributed exploration reduces sampling complexity in reinforcement learning.
We show that group performance can be significantly improved when each agent uses a decentralized message-passing protocol.
We show that incorporating more agents and more information sharing into the group learning scheme speeds up convergence to the optimal policy.
- Score: 3.5450828190071655
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A challenge in reinforcement learning (RL) is minimizing the cost of sampling
associated with exploration. Distributed exploration reduces sampling
complexity in multi-agent RL (MARL). We investigate the benefits to performance
in MARL when exploration is fully decentralized. Specifically, we consider a
class of online, episodic, tabular $Q$-learning problems under time-varying
reward and transition dynamics, in which agents can communicate in a
decentralized manner. We show that group performance, as measured by the bound
on regret, can be significantly improved through communication when each agent
uses a decentralized message-passing protocol, even when limited to sending
information up to its $\gamma$-hop neighbors. We prove regret and sample
complexity bounds that depend on the number of agents, communication network
structure and $\gamma.$ We show that incorporating more agents and more
information sharing into the group learning scheme speeds up convergence to the
optimal policy. Numerical simulations illustrate our results and validate our
theoretical claims.
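To make the flavor of such a protocol concrete, here is a minimal sketch (not the paper's algorithm; the graph, `shared_bonus` helper, and the 1/sqrt(n) bonus form are illustrative assumptions) of how pooling visit counts from γ-hop neighbors shrinks a count-based exploration bonus:

```python
import math
from collections import deque

def gamma_hop_neighbors(adj, agent, gamma):
    """BFS to collect all agents within gamma hops (including the agent itself)."""
    dist = {agent: 0}
    queue = deque([agent])
    while queue:
        u = queue.popleft()
        if dist[u] == gamma:
            continue
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)

def shared_bonus(counts, adj, agent, gamma, beta=1.0):
    """Exploration bonus shrinks as gamma-hop neighbors contribute their visit counts."""
    pooled = sum(counts[j] for j in gamma_hop_neighbors(adj, agent, gamma))
    return beta / math.sqrt(max(pooled, 1))

# Path graph 0-1-2-3; agent 0 pools more counts as gamma grows.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
counts = {0: 4, 1: 5, 2: 7, 3: 9}
print(shared_bonus(counts, adj, 0, gamma=0))  # 1/sqrt(4) = 0.5
print(shared_bonus(counts, adj, 0, gamma=2))  # 1/sqrt(16) = 0.25
```

The bonus decay mirrors the intuition behind the bounds: more agents and a larger γ mean more pooled experience, hence faster convergence.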
Related papers
- Compressed Federated Reinforcement Learning with a Generative Model [11.074080383657453]
Reinforcement learning has recently gained unprecedented popularity, yet it still grapples with sample inefficiency.
Addressing this challenge, federated reinforcement learning (FedRL) has emerged, wherein agents collaboratively learn a single policy by aggregating local estimations.
We propose CompFedRL, a communication-efficient FedRL approach incorporating both periodic aggregation and (direct/error-feedback) compression mechanisms.
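A minimal sketch of the error-feedback compression idea (not the CompFedRL algorithm; `topk_compress` and the round structure are illustrative assumptions): each agent sends a sparsified update and carries the compression residual into the next round so nothing is permanently lost.

```python
import numpy as np

def topk_compress(vec, k):
    """Keep the k largest-magnitude entries; zero the rest."""
    out = np.zeros_like(vec)
    idx = np.argsort(np.abs(vec))[-k:]
    out[idx] = vec[idx]
    return out

def error_feedback_round(local_updates, memories, k):
    """Each agent compresses (update + carried-over error); server averages."""
    sent = []
    for i, u in enumerate(local_updates):
        corrected = u + memories[i]
        c = topk_compress(corrected, k)
        memories[i] = corrected - c   # residual fed back next round
        sent.append(c)
    return np.mean(sent, axis=0), memories

updates = [np.array([1.0, 0.1, -2.0]), np.array([0.5, 3.0, 0.2])]
memories = [np.zeros(3), np.zeros(3)]
avg, memories = error_feedback_round(updates, memories, k=1)
print(avg)  # [0.  1.5 -1. ] — only the largest entry of each update survives
```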
arXiv Detail & Related papers (2024-03-26T15:36:47Z) - Deep Multi-Agent Reinforcement Learning for Decentralized Active
Hypothesis Testing [11.639503711252663]
We tackle the multi-agent active hypothesis testing (AHT) problem by introducing a novel algorithm rooted in the framework of deep multi-agent reinforcement learning.
We present a comprehensive set of experimental results that effectively showcase the agents' ability to learn collaborative strategies and enhance performance.
arXiv Detail & Related papers (2023-09-14T01:18:04Z) - On the Performance of Gradient Tracking with Local Updates [10.14746251086461]
We show that LU-GT has the same communication complexity but allows arbitrary network topologies.
Numerical examples reveal that local updates can lower communication costs in certain regimes.
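For context, plain gradient tracking (without the local updates LU-GT adds) can be sketched as follows; the quadratic objectives and mixing matrix are illustrative assumptions, not the paper's setup. Each agent maintains a tracker `y` of the network-average gradient:

```python
import numpy as np

def gradient_tracking(a, W, alpha=0.2, iters=200):
    """Decentralized gradient tracking for f_i(x) = 0.5*(x - a_i)^2.
    W is a doubly stochastic mixing matrix; the global optimum is mean(a)."""
    n = len(a)
    x = np.zeros(n)
    g = x - a              # local gradients at current x
    y = g.copy()           # tracker, initialized to local gradients
    for _ in range(iters):
        x_new = W @ x - alpha * y          # mix, then step along tracked gradient
        g_new = x_new - a
        y = W @ y + g_new - g              # update tracker with gradient increment
        x, g = x_new, g_new
    return x

a = np.array([1.0, 2.0, 6.0])
W = np.array([[0.5, 0.25, 0.25],
              [0.25, 0.5, 0.25],
              [0.25, 0.25, 0.5]])
print(gradient_tracking(a, W))  # all entries ≈ 3.0, the global minimizer mean(a)
```

The tracker update preserves the invariant that the average of `y` equals the average gradient, which is what lets every agent converge to the global minimizer rather than its local one.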
arXiv Detail & Related papers (2022-10-10T15:13:23Z) - Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure
of Costs [65.23158435596518]
We solve the multi-vehicle routing problem as a team Markov game with partially observable costs.
Our multi-agent reinforcement learning approach, the so-called multi-agent Neural Rewriter, builds on the single-agent Neural Rewriter to solve the problem by iteratively rewriting solutions.
arXiv Detail & Related papers (2022-06-13T09:17:40Z) - Modeling Bounded Rationality in Multi-Agent Simulations Using Rationally
Inattentive Reinforcement Learning [85.86440477005523]
We study more human-like RL agents which incorporate an established model of human-irrationality, the Rational Inattention (RI) model.
RIRL models the cost of cognitive information processing using mutual information.
We show that using RIRL yields a rich spectrum of new equilibrium behaviors that differ from those found under rational assumptions.
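The mutual-information cost mentioned above can be computed directly from a joint distribution; this small sketch (an illustrative helper, not the RIRL implementation) shows the two extremes — independent variables cost nothing, a perfectly correlated pair costs log 2 nats:

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) * log[ p(x,y) / (p(x) p(y)) ], in nats.
    joint is a nested list of probabilities, rows = x, columns = y."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log(p / (px[i] * py[j]))
    return mi

print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # 0.0 (independent)
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))      # log 2 ≈ 0.693 (correlated)
```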
arXiv Detail & Related papers (2022-01-18T20:54:00Z) - Locality Matters: A Scalable Value Decomposition Approach for
Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution (CTDE) paradigm.
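The scalability payoff of value decomposition can be seen in a minimal additive (VDN-style) sketch — simpler than LOMAQ's decomposition, and the utilities below are made-up numbers: when the joint value is a sum of per-agent utilities, each agent's local argmax recovers the global joint argmax without searching the exponential joint action space.

```python
import itertools

def local_argmax(local_qs, n_actions):
    """With Q_tot = sum of per-agent utilities, each agent can maximize locally."""
    return tuple(max(range(n_actions), key=lambda a: q[a]) for q in local_qs)

def global_argmax(local_qs, n_actions):
    """Brute-force argmax over the exponential joint action space."""
    joint = itertools.product(range(n_actions), repeat=len(local_qs))
    return max(joint, key=lambda acts: sum(q[a] for q, a in zip(local_qs, acts)))

local_qs = [[0.1, 0.9], [0.7, 0.2], [0.3, 0.4]]
print(local_argmax(local_qs, 2))   # (1, 0, 1)
print(global_argmax(local_qs, 2))  # (1, 0, 1) — matches the cheap local version
```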
arXiv Detail & Related papers (2021-09-22T10:08:15Z) - Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in
Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
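As background on the ADMM machinery (a textbook consensus-averaging sketch under quadratic losses, not the asI-ADMM algorithm): each agent runs a local proximal step, the network averages, and a dual variable enforces agreement.

```python
import numpy as np

def consensus_admm(a, rho=1.0, iters=100):
    """Consensus ADMM (scaled dual form) for min sum_i 0.5*(x_i - a_i)^2 s.t. x_i = z."""
    n = len(a)
    x = np.zeros(n)
    u = np.zeros(n)   # scaled dual variables
    z = 0.0           # consensus variable
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)   # local proximal step
        z = np.mean(x + u)                       # coordination (averaging) step
        u = u + x - z                            # dual ascent
    return z

print(consensus_admm(np.array([1.0, 2.0, 6.0])))  # ≈ 3.0 = mean(a)
```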
arXiv Detail & Related papers (2021-06-30T16:49:07Z) - PsiPhi-Learning: Reinforcement Learning with Demonstrations using
Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi\Phi$-learning.
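The successor-feature idea underlying this line of work factors value as features times reward weights; with $r = \phi^\top w$ one gets $Q^\pi = \psi^\pi{}^\top w$, so the same $\psi$ evaluates a policy under any new reward instantly. A tiny illustration with made-up numbers (not the $\Psi\Phi$-learning algorithm):

```python
import numpy as np

# psi[a] holds the successor features of action a in some state (hypothetical values).
psi = np.array([[1.0, 0.5],
                [0.2, 2.0]])
w_old = np.array([1.0, 0.0])  # original task's reward weights
w_new = np.array([0.0, 1.0])  # a new task: same dynamics, different reward

q_old = psi @ w_old   # Q-values for the old task
q_new = psi @ w_new   # re-evaluated for free under the new reward
print(q_old.argmax(), q_new.argmax())  # 0 1 — the greedy action flips with the task
```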
arXiv Detail & Related papers (2021-02-24T21:12:09Z) - Accelerating Distributed Online Meta-Learning via Multi-Agent
Collaboration under Limited Communication [24.647993999787992]
We propose a multi-agent online meta-learning framework and cast it as an equivalent two-level nested online convex optimization (OCO) problem.
By characterizing the upper bound of the agent-task-averaged regret, we show that the performance of multi-agent online meta-learning depends heavily on how much an agent can benefit from the distributed network-level OCO for meta-model updates via limited communication.
We show a speedup by a factor of $\sqrt{1/N}$ over the optimal single-agent regret $O(\sqrt{T})$.
arXiv Detail & Related papers (2020-12-15T23:08:36Z) - Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Dif-MAML.
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.