Decentralized Reinforcement Learning for Multi-Agent Multi-Resource Allocation via Dynamic Cluster Agreements
- URL: http://arxiv.org/abs/2503.02437v1
- Date: Tue, 04 Mar 2025 09:23:48 GMT
- Title: Decentralized Reinforcement Learning for Multi-Agent Multi-Resource Allocation via Dynamic Cluster Agreements
- Authors: Antonio Marino, Esteban Restrepo, Claudio Pacchierotti, Paolo Robuffo Giordano,
- Abstract summary: This paper addresses the challenge of allocating heterogeneous resources among multiple agents in a decentralized manner.<n>Our proposed method, LGTC-IPPO, builds upon Independent Proximal Policy Optimization (IPPO) by integrating dynamic cluster consensus.<n> Experimental results demonstrate that LGTC-IPPO achieves more stable rewards, better coordination, and robust performance even as the number of agents or resource types increases.
- Score: 13.586104024751586
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper addresses the challenge of allocating heterogeneous resources among multiple agents in a decentralized manner. Our proposed method, LGTC-IPPO, builds upon Independent Proximal Policy Optimization (IPPO) by integrating dynamic cluster consensus, a mechanism that allows agents to form and adapt local sub-teams based on resource demands. This decentralized coordination strategy reduces reliance on global information and enhances scalability. We evaluate LGTC-IPPO against standard multi-agent reinforcement learning baselines and a centralized expert solution across a range of team sizes and resource distributions. Experimental results demonstrate that LGTC-IPPO achieves more stable rewards, better coordination, and robust performance even as the number of agents or resource types increases. Additionally, we illustrate how dynamic clustering enables agents to reallocate resources efficiently also for scenarios with discharging resources.
Related papers
- Adaptive AI-based Decentralized Resource Management in the Cloud-Edge Continuum [4.989052212674281]
Cloud-Edge Continuum presents significant challenges for efficient resource management.<n>Traditional centralized approaches struggle to adapt to these changes due to their static nature.<n>This paper proposes a hybrid decentralized framework for dynamic application placement and resource management.
arXiv Detail & Related papers (2025-01-27T06:07:09Z) - Hierarchical Reinforcement Learning for Optimal Agent Grouping in Cooperative Systems [0.4759142872591625]
This paper presents a hierarchical reinforcement learning (RL) approach to address the agent grouping or pairing problem in cooperative multi-agent systems.<n>By employing a hierarchical RL framework, we distinguish between high-level decisions of grouping and low-level agents' actions.<n>We incorporate permutation-in neural networks to handle the homogeneity and cooperation among agents, enabling effective coordination.
arXiv Detail & Related papers (2025-01-11T14:22:10Z) - Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
Low-altitude economy holds significant potential for development in areas such as communication and sensing.<n>We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z) - Hierarchical Multi-Agent DRL Based Dynamic Cluster Reconfiguration for UAV Mobility Management [46.80160709931929]
Multi-connectivity involves dynamic cluster formation among distributed access points (APs) and coordinated resource allocation from these APs.<n>We propose a novel mobility management scheme for unmanned aerial vehicles (UAVs) that uses dynamic cluster reconfiguration with energy-efficient power allocation.
arXiv Detail & Related papers (2024-12-05T19:20:42Z) - Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards [1.179778723980276]
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for sequential decision-making and control tasks.
The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals.
We propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies.
arXiv Detail & Related papers (2024-08-12T21:38:40Z) - Quantifying Agent Interaction in Multi-agent Reinforcement Learning for
Cost-efficient Generalization [63.554226552130054]
Generalization poses a significant challenge in Multi-agent Reinforcement Learning (MARL)
The extent to which an agent is influenced by unseen co-players depends on the agent's policy and the specific scenario.
We present the Level of Influence (LoI), a metric quantifying the interaction intensity among agents within a given scenario and environment.
arXiv Detail & Related papers (2023-10-11T06:09:26Z) - Decentralized Policy Optimization [21.59254848913971]
We propose textitdecentralized policy optimization (DPO), a decentralized actor-critic algorithm with monotonic improvement and convergence guarantee.
Empirically, we compare DPO with IPPO in a variety of cooperative multi-agent tasks, covering discrete and continuous action spaces, and fully and partially observable environments.
arXiv Detail & Related papers (2022-11-06T05:38:23Z) - Locality Matters: A Scalable Value Decomposition Approach for
Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z) - Is Independent Learning All You Need in the StarCraft Multi-Agent
Challenge? [100.48692829396778]
Independent PPO (IPPO) is a form of independent learning in which each agent simply estimates its local value function.
IPPO's strong performance may be due to its robustness to some forms of environment non-stationarity.
arXiv Detail & Related papers (2020-11-18T20:29:59Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes unpractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent setting.
Our framework can achieve scalability and stability for large-scale environment and reduce information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.