Locality Matters: A Scalable Value Decomposition Approach for
Cooperative Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2109.10632v1
- Date: Wed, 22 Sep 2021 10:08:15 GMT
- Title: Locality Matters: A Scalable Value Decomposition Approach for
Cooperative Multi-Agent Reinforcement Learning
- Authors: Roy Zohar, Shie Mannor, Guy Tennenholtz
- Abstract summary: Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
- Score: 52.7873574425376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cooperative multi-agent reinforcement learning (MARL) faces significant
scalability issues due to state and action spaces that are exponentially large
in the number of agents. As environments grow in size, effective credit
assignment becomes increasingly harder and often results in infeasible learning
times. Still, in many real-world settings, there exist simplified underlying
dynamics that can be leveraged for more scalable solutions. In this work, we
exploit such locality structures effectively whilst maintaining global
cooperation. We propose a novel, value-based multi-agent algorithm called
LOMAQ, which incorporates local rewards in the Centralized Training
Decentralized Execution paradigm. Additionally, we provide a direct reward
decomposition method for finding these local rewards when only a global signal
is provided. We test our method empirically, showing it scales well compared to
other methods, significantly improving performance and convergence speed.
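To make the idea concrete, here is a minimal sketch of value decomposition with local rewards: each partition of agents keeps its own utility network trained against its own local reward, and the joint value is assumed to be the sum of the partition utilities. All names (LocalQ, local_td_losses, the batch layout) and sizes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LocalQ(nn.Module):
    """Utility network for one partition of agents (hypothetical sketch)."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs):
        return self.net(obs)  # per-action utilities for this partition

def local_td_losses(local_qs, target_qs, batch, gamma=0.99):
    """One TD step per partition, each regressed against its *local* reward.

    `batch[k]` holds (obs, action, local_reward, next_obs) tensors for
    partition k. Global cooperation is preserved under the assumption that
    the joint value is the sum of the partition utilities.
    """
    losses = []
    for k, (q, q_tgt) in enumerate(zip(local_qs, target_qs)):
        obs, act, r_k, next_obs = batch[k]
        q_taken = q(obs).gather(1, act.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            y = r_k + gamma * q_tgt(next_obs).max(dim=1).values
        losses.append(((q_taken - y) ** 2).mean())
    return sum(losses)
```

Under this additive assumption, each partition can still act greedily on its own utility at execution time, while the summed objective is only needed during centralized training.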
Related papers
- United We Stand: Decentralized Multi-Agent Planning With Attrition [4.196094610996091]
Decentralized planning is a key element of cooperative multi-agent systems for information gathering tasks.
We propose Attritable MCTS (A-MCTS), a decentralized algorithm capable of timely and efficient adaptation to changes in the set of active agents.
We show both theoretically and experimentally that A-MCTS enables efficient adaptation even under high failure rates.
arXiv Detail & Related papers (2024-07-11T07:55:50Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL).
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
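Since the entry above hinges on reward machines, here is a tiny sketch of one: a finite-state machine over high-level events whose transitions emit rewards, which is what lets non-Markovian (history-dependent) reward be tracked with a single extra machine state. The class and the button/door example are illustrative, not taken from the paper.

```python
# Minimal sketch of a reward machine: a finite-state machine over high-level
# events that emits rewards on transitions (illustrative, not the paper's code).
class RewardMachine:
    def __init__(self, transitions, initial_state):
        # transitions: {(state, event): (next_state, reward)}
        self.transitions = transitions
        self.state = initial_state

    def step(self, event):
        """Advance on an observed event; unknown events leave the state unchanged."""
        next_state, reward = self.transitions.get((self.state, event),
                                                  (self.state, 0.0))
        self.state = next_state
        return reward

# Example: a two-step cooperative sub-task "press button, then open door".
rm = RewardMachine({("u0", "button"): ("u1", 0.0),
                    ("u1", "door"): ("u_done", 1.0)}, initial_state="u0")
```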
- Scalable Multi-Agent Model-Based Reinforcement Learning [1.95804735329484]
We propose a new method called MAMBA which utilizes Model-Based Reinforcement Learning (MBRL) to further leverage centralized training in cooperative environments.
We argue that communication between agents is enough to sustain a world model for each agent during the execution phase, while imagined rollouts can be used for training, removing the need to interact with the environment.
arXiv Detail & Related papers (2022-05-25T08:35:00Z)
- Convergence Rates of Average-Reward Multi-agent Reinforcement Learning via Randomized Linear Programming [41.30044824711509]
We focus on the case where the global reward is a sum of local rewards, the joint policy factorizes into the agents' marginals, and the state is fully observable.
We develop multi-agent extensions, whereby agents solve their local saddle point problems and then perform local weighted averaging.
We establish that the sample complexity to obtain near-globally optimal solutions matches tight dependencies on the cardinality of the state and action spaces.
arXiv Detail & Related papers (2021-10-22T03:48:41Z)
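Written out, the structural assumptions in this entry read roughly as follows (notation ours; the exact arguments of each local reward may differ in the paper):

```latex
% Additive local rewards and a factored joint policy (illustrative notation)
r(s, a) = \sum_{i=1}^{N} r_i(s, a_i),
\qquad
\pi_\theta(a \mid s) = \prod_{i=1}^{N} \pi_{\theta_i}(a_i \mid s)
```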
- Softmax with Regularization: Better Value Estimation in Multi-Agent Reinforcement Learning [72.28520951105207]
Overestimation in $Q$-learning is an important problem that has been extensively studied in single-agent reinforcement learning.
We propose a novel regularization-based update scheme that penalizes large joint action-values deviating from a baseline.
We show that our method provides a consistent performance improvement on a set of challenging StarCraft II micromanagement tasks.
arXiv Detail & Related papers (2021-03-22T14:18:39Z)
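A rough sketch of the kind of regularized update described above: an ordinary TD loss plus a penalty that discourages the taken joint action-value from drifting far from a baseline. The mean-over-actions baseline used here is purely an illustrative assumption; the paper defines its own baseline.

```python
import torch

def regularized_td_loss(q_joint, q_target, batch, lam=0.1, gamma=0.99):
    """TD loss plus a penalty on joint action-values that stray from a baseline.

    The baseline below (mean over joint actions) is a stand-in for the
    paper's baseline and is only meant to show the shape of the update.
    """
    obs, act, rew, next_obs = batch
    q_all = q_joint(obs)                                     # (B, n_joint_actions)
    q_taken = q_all.gather(1, act.unsqueeze(1)).squeeze(1)   # (B,)
    with torch.no_grad():
        y = rew + gamma * q_target(next_obs).max(dim=1).values
    baseline = q_all.mean(dim=1).detach()                    # hypothetical baseline
    td = ((q_taken - y) ** 2).mean()
    reg = ((q_taken - baseline) ** 2).mean()
    return td + lam * reg
```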
- Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations [55.41644538483948]
In this paper, we propose a combination of approaches that allow the agent to use low-quality demonstrations in complex vision-based environments.
Our proposed goal-oriented structuring of replay buffer allows the agent to automatically highlight sub-goals for solving complex hierarchical tasks in demonstrations.
The solution based on our algorithm beats all the known solutions from the MineRL competition and allows the agent to mine a diamond in the Minecraft environment.
arXiv Detail & Related papers (2020-06-17T15:38:40Z)
- Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward [17.925681736096482]
It has long been recognized that multi-agent reinforcement learning (MARL) faces significant scalability issues.
In this paper, we identify a rich class of networked MARL problems where the model exhibits a local dependence structure that allows it to be solved in a scalable manner.
arXiv Detail & Related papers (2020-06-11T17:23:17Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework can achieve scalability and stability in large-scale environments and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
- Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning [55.20040781688844]
QMIX is a novel value-based method that can train decentralised policies in a centralised end-to-end fashion.
We propose the StarCraft Multi-Agent Challenge (SMAC) as a new benchmark for deep multi-agent reinforcement learning.
arXiv Detail & Related papers (2020-03-19T16:51:51Z)
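For reference, the core of QMIX is a state-conditioned mixing network whose weights are kept non-negative, so the joint value is monotone in every agent's utility and decentralized greedy action selection stays consistent with the centralized training target. The sketch below is simplified relative to the published architecture; layer sizes and the final bias are illustrative.

```python
import torch
import torch.nn as nn

class MonotonicMixer(nn.Module):
    """Sketch of a QMIX-style mixer: Q_tot is monotone in each agent utility
    because the mixing weights are forced non-negative (simplified sizes)."""
    def __init__(self, n_agents, state_dim, embed_dim=32):
        super().__init__()
        self.n_agents, self.embed_dim = n_agents, embed_dim
        # Hypernetworks produce the mixing weights/biases from the global state.
        self.hyper_w1 = nn.Linear(state_dim, n_agents * embed_dim)
        self.hyper_b1 = nn.Linear(state_dim, embed_dim)
        self.hyper_w2 = nn.Linear(state_dim, embed_dim)
        self.hyper_b2 = nn.Linear(state_dim, 1)

    def forward(self, agent_qs, state):
        # agent_qs: (B, n_agents), state: (B, state_dim)
        w1 = torch.abs(self.hyper_w1(state)).view(-1, self.n_agents, self.embed_dim)
        b1 = self.hyper_b1(state).view(-1, 1, self.embed_dim)
        hidden = torch.relu(torch.bmm(agent_qs.unsqueeze(1), w1) + b1)
        w2 = torch.abs(self.hyper_w2(state)).view(-1, self.embed_dim, 1)
        b2 = self.hyper_b2(state).view(-1, 1, 1)
        return (torch.bmm(hidden, w2) + b2).view(-1)  # Q_tot per batch element
```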