Learning to Share in Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2112.08702v1
- Date: Thu, 16 Dec 2021 08:43:20 GMT
- Title: Learning to Share in Multi-Agent Reinforcement Learning
- Authors: Yuxuan Yi, Ge Li, Yaowei Wang, Zongqing Lu
- Abstract summary: We study the problem of networked multi-agent reinforcement learning (MARL)
Inspired by the fact that sharing plays a key role in humans' learning of cooperation, we propose LToS.
For each agent, the high-level policy learns how to share reward with neighbors to decompose the global objective, while the low-level policy learns to optimize the local objective induced by the high-level policies in the neighborhood.
- Score: 33.462384381980065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the problem of networked multi-agent reinforcement
learning (MARL), where a number of agents are deployed as a partially connected
network and each interacts only with nearby agents. Networked MARL requires all
agents to make decisions in a decentralized manner to optimize a global objective,
with communication restricted to neighbors over the network. Inspired by
the fact that \textit{sharing} plays a key role in humans' learning of
cooperation, we propose LToS, a hierarchically decentralized MARL framework
that enables agents to learn to dynamically share reward with neighbors so as
to encourage agents to cooperate on the global objective. For each agent, the
high-level policy learns how to share reward with neighbors to decompose the
global objective, while the low-level policy learns to optimize the local objective
induced by the high-level policies in the neighborhood. The two policies form a
bi-level optimization and learn alternately. We empirically demonstrate that
LToS outperforms existing methods in both social dilemma and networked MARL
scenarios.
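As a rough illustration of the reward-sharing idea described in the abstract, the following minimal Python sketch shows how each agent's high-level policy could output sharing weights over its neighbors and how the induced low-level rewards might be computed. This is an assumption-laden illustration, not the authors' implementation: the `Agent` class, the `sharing_weights` and `induced_rewards` names, and the softmax parameterization are all hypothetical.

```python
import numpy as np

# Hypothetical sketch of LToS-style reward sharing (not the paper's code).
# Each agent's high-level policy emits sharing weights over itself and its
# neighbors; the low-level policy is then trained on the induced reward.

class Agent:
    def __init__(self, agent_id, neighbors):
        self.id = agent_id
        self.neighbors = neighbors  # ids of nearby agents on the network

    def sharing_weights(self, obs):
        """High-level policy (placeholder): a distribution over {self} and the
        neighbors, giving the fraction of this agent's reward sent to each."""
        recipients = [self.id] + self.neighbors
        logits = np.random.randn(len(recipients))        # stand-in for a learned network
        weights = np.exp(logits) / np.exp(logits).sum()  # softmax, so shares sum to 1
        return dict(zip(recipients, weights))

def induced_rewards(agents, env_rewards, observations):
    """Low-level reward for each agent: the share of its own environment reward
    it keeps plus the shares its neighbors decide to give it."""
    shared = {a.id: 0.0 for a in agents}
    for a in agents:
        w = a.sharing_weights(observations[a.id])
        for recipient, frac in w.items():
            shared[recipient] += frac * env_rewards[a.id]
    return shared
```

Under this sketch, the high-level policies would be trained toward the global objective while the low-level policies are trained on the induced rewards, with the two levels updated alternately as in the bi-level optimization described above.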
Related papers
- ComaDICE: Offline Cooperative Multi-Agent Reinforcement Learning with Stationary Distribution Shift Regularization [11.620274237352026]
Offline reinforcement learning (RL) has garnered significant attention for its ability to learn effective policies from pre-collected datasets.
MARL presents additional challenges due to the large joint state-action space and the complexity of multi-agent behaviors.
We introduce a regularizer in the space of stationary distributions to better handle distributional shift.
arXiv Detail & Related papers (2024-10-02T18:56:10Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL)
It combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
- Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning [25.747559058350557]
We study a multi-agent reinforcement learning (MARL) problem where the agents interact over a given network.
The goal of the agents is to cooperatively maximize the average of their entropy-regularized long-term rewards.
To overcome the curse of dimensionality and to reduce communication, we propose a Localized Policy Iteration (LPI) that provably learns a near-globally-optimal policy using only local information.
arXiv Detail & Related papers (2022-11-30T15:58:00Z)
- Learning From Good Trajectories in Offline Multi-Agent Reinforcement Learning [98.07495732562654]
Offline multi-agent reinforcement learning (MARL) aims to learn effective multi-agent policies from pre-collected datasets.
When the pre-collected dataset contains poor or random behavior, an agent trained by offline MARL can inherit that behavior, jeopardizing the performance of the entire team.
We propose a novel framework called Shared Individual Trajectories (SIT) to address this problem.
arXiv Detail & Related papers (2022-11-28T18:11:26Z)
- Local Advantage Networks for Cooperative Multi-Agent Reinforcement Learning [1.1879716317856945]
This paper presents a new type of reinforcement learning algorithm for cooperative partially observable environments.
We use a dueling architecture to learn, for each agent, a decentralized best-response policy via individual advantage functions.
Evaluation on the StarCraft II multi-agent challenge benchmark shows that LAN reaches state-of-the-art performance.
arXiv Detail & Related papers (2021-12-23T10:55:33Z)
- Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z)
- AoI-Aware Resource Allocation for Platoon-Based C-V2X Networks via Multi-Agent Multi-Task Reinforcement Learning [22.890835786710316]
This paper investigates the problem of age of information (AoI) aware radio resource management for a platooning system.
Multiple autonomous platoons exploit the cellular wireless vehicle-to-everything (C-V2X) communication technology to disseminate the cooperative awareness messages (CAMs) to their followers.
We exploit a distributed resource allocation framework based on multi-agent reinforcement learning (MARL), where each platoon leader (PL) acts as an agent and interacts with the environment to learn its optimal policy.
arXiv Detail & Related papers (2021-05-10T08:39:56Z)
- Competing Adaptive Networks [56.56653763124104]
We develop an algorithm for decentralized competition among teams of adaptive agents.
We present an application in the decentralized training of generative adversarial neural networks.
arXiv Detail & Related papers (2021-03-29T14:42:15Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn)
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments while reducing information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.