Multi-Agent Reinforcement Learning for Network Load Balancing in Data
Center
- URL: http://arxiv.org/abs/2201.11727v2
- Date: Fri, 28 Jan 2022 19:50:54 GMT
- Title: Multi-Agent Reinforcement Learning for Network Load Balancing in Data
Center
- Authors: Zhiyuan Yao, Zihan Ding, Thomas Clausen
- Abstract summary: This paper presents the network load balancing problem, a challenging real-world task for reinforcement learning methods.
The cooperative network load balancing task is formulated as a Dec-POMDP problem, which naturally induces the MARL methods.
To bridge the reality gap for applying learning-based methods, all methods are directly trained and evaluated on an emulation system.
- Score: 4.141301293112916
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: This paper presents the network load balancing problem, a challenging
real-world task for multi-agent reinforcement learning (MARL) methods.
Traditional heuristic solutions like Weighted-Cost Multi-Path (WCMP) and Local
Shortest Queue (LSQ) adapt poorly to changing workload distributions and
arrival rates, and balance load poorly across multiple load balancers. The
cooperative network load balancing task is formulated as a Dec-POMDP problem,
which naturally leads to MARL methods. To bridge the reality gap for
applying learning-based methods, all methods are directly trained and evaluated
on an emulation system at moderate to large scale. Experiments on realistic
testbeds show that the independent and "selfish" load balancing strategies are
not necessarily the globally optimal ones, while the proposed MARL solution has
a superior performance over different realistic settings. Additionally, the
potential difficulties of MARL methods for network load balancing are analysed,
which helps to draw the attention of the learning and network communities to
such challenges.
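For readers unfamiliar with the two heuristic baselines named in the abstract, the following is a minimal Python sketch of the general idea behind each. It is not the paper's implementation: the function names `wcmp_pick` and `lsq_pick`, the static weights, and the queue-length inputs are hypothetical, chosen only for illustration. WCMP is sketched as a weighted random choice over servers, and LSQ as picking the server with the shortest queue in a single load balancer's local view.

```python
import random
from typing import Sequence


def wcmp_pick(weights: Sequence[float], rng: random.Random) -> int:
    """Weighted random choice (illustrative WCMP-style rule).

    Server i is chosen with probability weights[i] / sum(weights).
    The weights are static, so the rule cannot react to load changes.
    """
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


def lsq_pick(local_queue_lengths: Sequence[int]) -> int:
    """Shortest-queue choice (illustrative LSQ-style rule).

    Uses only the queue lengths observed by this load balancer,
    which may differ from the true server load when several
    load balancers share the same servers. Ties go to the lowest index.
    """
    return min(range(len(local_queue_lengths)),
               key=lambda i: local_queue_lengths[i])


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [1.0, 2.0, 1.0]  # hypothetical server weights
    counts = [0, 0, 0]
    for _ in range(4000):
        counts[wcmp_pick(weights, rng)] += 1
    print("WCMP-style assignment counts:", counts)
    print("LSQ-style pick for queues [3, 0, 2]:", lsq_pick([3, 0, 2]))
```

Both rules decide from fixed weights or a single load balancer's partial view, which is the kind of independent decision-making the abstract contrasts with cooperative MARL policies.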
Related papers
- Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training [38.03693752287459]
Multi-agent Reinforcement Learning (MARL) relies on neural networks with numerous parameters in multi-agent scenarios.
This paper proposes the utilization of dynamic sparse training (DST), a technique proven effective in deep supervised learning tasks.
We introduce an innovative Multi-Agent Sparse Training (MAST) framework aimed at simultaneously enhancing the reliability of learning targets and the rationality of sample distribution.
arXiv Detail & Related papers (2024-09-28T15:57:24Z) - Load Balancing in Federated Learning [3.2999744336237384]
Federated Learning (FL) is a decentralized machine learning framework that enables learning from data distributed across multiple remote devices.
This paper proposes a load metric for scheduling policies based on the Age of Information.
We establish the optimal parameters of the Markov chain model and validate our approach through simulations.
arXiv Detail & Related papers (2024-08-01T00:56:36Z) - Sparse Mean Field Load Balancing in Large Localized Queueing Systems [30.672653758080568]
We learn a near-optimal load balancing policy in sparsely connected queueing networks in a tractable manner.
By formulating a novel mean field control problem in the context of graphs with bounded degree, we reduce the otherwise difficult multi-agent problem to a single-agent problem.
Empirically, the proposed methodology performs well on several realistic and scalable wireless network topologies.
arXiv Detail & Related papers (2023-12-20T12:31:28Z) - Decentralized Online Learning in Task Assignment Games for Mobile
Crowdsensing [55.07662765269297]
A mobile crowdsensing platform (MCSP) sequentially publishes sensing tasks to the available mobile units (MUs) that signal their willingness to participate in a task by sending sensing offers back to the MCSP.
A stable task assignment must address two challenges: the MCSP's and MUs' conflicting goals, and the uncertainty about the MUs' required efforts and preferences.
To overcome these challenges, a novel decentralized approach combining matching theory and online learning, called collision-avoidance multi-armed bandit with strategic free sensing (CA-MAB-SFS), is proposed.
arXiv Detail & Related papers (2023-09-19T13:07:15Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Learning Distributed and Fair Policies for Network Load Balancing as
Markov Potential Game [4.892398873024191]
This paper investigates the network load balancing problem in data centers (DCs) where multiple load balancers (LBs) are deployed.
The challenges of this problem consist of the heterogeneous processing architecture and dynamic environments.
We formulate the multi-agent load balancing problem as a Markov potential game, with a carefully designed workload distribution fairness measure as the potential function.
A fully distributed MARL algorithm is proposed to approximate the Nash equilibrium of the game.
arXiv Detail & Related papers (2022-06-03T08:29:02Z) - Efficient Model-based Multi-agent Reinforcement Learning via Optimistic
Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z) - Towards Intelligent Load Balancing in Data Centers [0.5505634045241288]
This paper proposes Aquarius to bridge the gap between machine learning and networking systems.
It demonstrates its ability to conduct both offline data analysis and online model deployment in realistic systems.
arXiv Detail & Related papers (2021-10-27T12:47:30Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Low-Latency Federated Learning over Wireless Channels with Differential
Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server.
In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
arXiv Detail & Related papers (2021-06-20T13:51:18Z) - Optimization-driven Machine Learning for Intelligent Reflecting Surfaces
Assisted Wireless Networks [82.33619654835348]
Intelligent reflecting surfaces (IRSs) have been employed to reshape wireless channels by controlling individual scattering elements' phase shifts.
Due to the large size of scattering elements, the passive beamforming is typically challenged by the high computational complexity.
In this article, we focus on machine learning (ML) approaches for performance maximization in IRS-assisted wireless networks.
arXiv Detail & Related papers (2020-08-29T08:39:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.