Learning Distributed and Fair Policies for Network Load Balancing as
Markov Potential Game
- URL: http://arxiv.org/abs/2206.01451v1
- Date: Fri, 3 Jun 2022 08:29:02 GMT
- Title: Learning Distributed and Fair Policies for Network Load Balancing as
Markov Potential Game
- Authors: Zhiyuan Yao, Zihan Ding
- Abstract summary: This paper investigates the network load balancing problem in data centers (DCs) where multiple load balancers (LBs) are deployed.
The challenges of this problem consist of the heterogeneous processing architecture and dynamic environments.
We formulate the multi-agent load balancing problem as a Markov potential game, with a carefully and properly designed workload distribution fairness as the potential function.
A fully distributed MARL algorithm is proposed to approximate the Nash equilibrium of the game.
- Score: 4.892398873024191
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the network load balancing problem in data centers
(DCs) where multiple load balancers (LBs) are deployed, using the multi-agent
reinforcement learning (MARL) framework. The challenges of this problem consist
of the heterogeneous processing architecture and dynamic environments, as well
as limited and partial observability of each LB agent in distributed networking
systems, which can largely degrade the performance of in-production load
balancing algorithms in real-world setups.
The centralised-training-decentralised-execution (CTDE) RL scheme has been proposed
to improve MARL performance, yet it incurs -- especially in distributed
networking systems, which prefer distributed and plug-and-play design schemes --
additional communication and management overhead among agents. We formulate the
multi-agent load balancing problem as a Markov potential game, with a carefully
and properly designed workload distribution fairness as the potential function.
A fully distributed MARL algorithm is proposed to approximate the Nash
equilibrium of the game. Experimental evaluations involve both an event-driven
simulator and real-world system, where the proposed MARL load balancing
algorithm shows close-to-optimal performance in simulations, and superior
results over in-production LBs in the real-world system.
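The potential-game idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm or its fairness definition: it assumes a one-shot game rather than a Markov game, uses negative load variance as a stand-in potential, and replaces MARL with plain best-response dynamics. The names `fairness`, `loads_for`, and `best_response_dynamics` are hypothetical.

```python
import statistics


def fairness(loads):
    """Stand-in potential (an assumption): negative population variance
    of server loads, so 0 means a perfectly even distribution."""
    return -statistics.pvariance(loads)


def loads_for(choices, n_servers):
    """Count how many jobs each server receives given each LB's choice."""
    loads = [0] * n_servers
    for server in choices:
        loads[server] += 1
    return loads


def best_response_dynamics(choices, n_servers, max_rounds=100):
    """Each LB in turn switches server only if that strictly raises the
    shared potential. The loop stops when no LB can improve, which is a
    pure Nash equilibrium of this toy identical-interest game."""
    choices = list(choices)
    for _ in range(max_rounds):
        improved = False
        for lb in range(len(choices)):
            best_server = choices[lb]
            best_value = fairness(loads_for(choices, n_servers))
            for server in range(n_servers):
                trial = choices[:lb] + [server] + choices[lb + 1:]
                value = fairness(loads_for(trial, n_servers))
                if value > best_value + 1e-12:
                    best_server, best_value = server, value
            if best_server != choices[lb]:
                choices[lb] = best_server
                improved = True
        if not improved:
            break
    return choices


# Six hypothetical LBs initially all send their job to server 0.
start = [0, 0, 0, 0, 0, 0]
final = best_response_dynamics(start, n_servers=3)
print(loads_for(final, 3), fairness(loads_for(final, 3)))
```

Because every unilateral switch that helps one LB also raises the shared potential, the dynamics cannot cycle and must stop at an equilibrium. That property is what makes a potential-game formulation attractive for fully distributed learning.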
Related papers
- Scalable spectral representations for network multiagent control [53.631272539560435]
A popular model for multi-agent control, Network Markov Decision Processes (MDPs) pose a significant challenge to efficient learning.
We first derive scalable spectral local representations for network MDPs, which induce a network linear subspace for the local $Q$-function of each agent.
We design a scalable algorithmic framework for continuous state-action network MDPs, and provide end-to-end guarantees for the convergence of our algorithm.
arXiv Detail & Related papers (2024-10-22T17:45:45Z)
- Load Balancing in Federated Learning [3.2999744336237384]
Federated Learning (FL) is a decentralized machine learning framework that enables learning from data distributed across multiple remote devices.
This paper proposes a load metric for scheduling policies based on the Age of Information.
We establish the optimal parameters of the Markov chain model and validate our approach through simulations.
arXiv Detail & Related papers (2024-08-01T00:56:36Z)
- Multi-agent Attention Actor-Critic Algorithm for Load Balancing in Cellular Networks [33.72503214603868]
In cellular networks, User Equipment (UE) hands off from one Base Station (BS) to another, giving rise to the load balancing problem among the BSs.
This paper formulates the load balancing problem as a Markov game and proposes a Robust Multi-agent Attention Actor-Critic (Robust-MA3C) algorithm.
arXiv Detail & Related papers (2023-03-14T15:51:33Z)
- Multi-Resource Allocation for On-Device Distributed Federated Learning Systems [79.02994855744848]
This work poses a distributed multi-resource allocation scheme for minimizing the weighted sum of latency and energy consumption in the on-device distributed federated learning (FL) system.
Each mobile device in the system engages the model training process within the specified area and allocates its computation and communication resources for deriving and uploading parameters, respectively.
arXiv Detail & Related papers (2022-11-01T14:16:05Z)
- Collaborative Intelligent Reflecting Surface Networks with Multi-Agent Reinforcement Learning [63.83425382922157]
Intelligent reflecting surface (IRS) is envisioned to be widely applied in future wireless networks.
In this paper, we investigate a multi-user communication system assisted by cooperative IRS devices with the capability of energy harvesting.
arXiv Detail & Related papers (2022-03-26T20:37:14Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- Multi-Agent Reinforcement Learning for Network Load Balancing in Data Center [4.141301293112916]
This paper presents the network load balancing problem, a challenging real-world task for reinforcement learning methods.
The cooperative network load balancing task is formulated as a Dec-POMDP problem, which naturally induces the MARL methods.
To bridge the reality gap for applying learning-based methods, all methods are directly trained and evaluated on an emulation system.
arXiv Detail & Related papers (2022-01-27T18:47:59Z)
- Reinforced Workload Distribution Fairness [3.7384509727711923]
This paper proposes a distributed reinforcement learning mechanism to improve, with no active load balancer state monitoring and limited network observations, the fairness of the workload distribution achieved by a load balancer.
Preliminary results show promise in RL-based load balancing algorithms, and identify additional challenges and future research directions.
arXiv Detail & Related papers (2021-10-29T07:51:26Z)
- Towards Intelligent Load Balancing in Data Centers [0.5505634045241288]
This paper proposes Aquarius to bridge the gap between machine learning and networking systems.
It demonstrates its ability to conduct both offline data analysis and online model deployment in realistic systems.
arXiv Detail & Related papers (2021-10-27T12:47:30Z)
- Adaptive Stochastic ADMM for Decentralized Reinforcement Learning in Edge Industrial IoT [106.83952081124195]
Reinforcement learning (RL) has been widely investigated and shown to be a promising solution for decision-making and optimal control processes.
We propose an adaptive ADMM (asI-ADMM) algorithm and apply it to decentralized RL with edge-computing-empowered IIoT networks.
Experiment results show that our proposed algorithms outperform the state of the art in terms of communication costs and scalability, and can well adapt to complex IoT environments.
arXiv Detail & Related papers (2021-06-30T16:49:07Z)
- Information Freshness-Aware Task Offloading in Air-Ground Integrated Edge Computing Systems [49.80033982995667]
This paper studies the problem of information freshness-aware task offloading in an air-ground integrated multi-access edge computing system.
A third-party real-time application service provider provides computing services to the subscribed mobile users (MUs) with the limited communication and computation resources from the infrastructure provider (InP).
We derive a novel deep reinforcement learning (RL) scheme that adopts two separate double deep Q-networks for each MU to approximate the Q-factor and the post-decision Q-factor.
arXiv Detail & Related papers (2020-07-15T21:32:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.