Reinforced Workload Distribution Fairness
- URL: http://arxiv.org/abs/2111.00008v1
- Date: Fri, 29 Oct 2021 07:51:26 GMT
- Title: Reinforced Workload Distribution Fairness
- Authors: Zhiyuan Yao, Zihan Ding, Thomas Heide Clausen
- Abstract summary: This paper proposes a distributed reinforcement learning mechanism that, with no active load balancer state monitoring and limited network observations, improves the fairness of the workload distribution achieved by a load balancer.
Preliminary results show promise in RL-based load balancing algorithms, and identify additional challenges and future research directions.
- Score: 3.7384509727711923
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Network load balancers are central components in data centers that
distribute workloads across multiple servers and thereby contribute to
offering scalable services. However, when load balancers operate in dynamic
environments with limited monitoring of application server loads, they rely on
heuristic algorithms that require manual configurations for fairness and
performance. To alleviate that, this paper proposes a distributed asynchronous
reinforcement learning mechanism that, with no active load balancer state
monitoring and limited network observations, improves the fairness of the
workload distribution achieved by a load balancer. The performance of the proposed
mechanism is evaluated and compared with state-of-the-art load balancing
algorithms in a simulator, under configurations with progressively increasing
complexities. Preliminary results show promise in RL-based load balancing
algorithms, and identify additional challenges and future research directions,
including reward function design and model scalability.
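The abstract does not state which fairness measure the paper uses. As a minimal sketch of how workload distribution fairness across servers could be quantified, the snippet below computes Jain's fairness index, a commonly used measure that equals 1 when all servers carry equal load and 1/n when a single server carries everything. The choice of this index and the function name `jain_fairness_index` are assumptions made for illustration, not the authors' method.

```python
from typing import Sequence


def jain_fairness_index(loads: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Illustrative assumption: the paper's abstract does not name its
    fairness metric; this is one common choice.
    """
    n = len(loads)
    if n == 0:
        raise ValueError("loads must be non-empty")
    total = sum(loads)
    sum_sq = sum(x * x for x in loads)
    if sum_sq == 0:
        # All servers idle: treat the distribution as perfectly fair.
        return 1.0
    return (total * total) / (n * sum_sq)


if __name__ == "__main__":
    # Four servers with identical load.
    print(jain_fairness_index([10, 10, 10, 10]))
    # One server carries all of the load.
    print(jain_fairness_index([40, 0, 0, 0]))
```

An index of this kind could in principle serve as a reward signal for a learning-based load balancer, although the abstract lists reward function design as an open challenge.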
Related papers
- A Replicate-and-Quantize Strategy for Plug-and-Play Load Balancing of Sparse Mixture-of-Experts LLMs [64.8510381475827]
Sparse Mixture-of-Experts (SMoE) architectures are increasingly used to scale large language models efficiently. SMoE models often suffer from severe load imbalance across experts, where a small subset of experts receives most tokens while others are underutilized. We present a systematic analysis of expert routing during inference and identify three findings: (i) load imbalance persists and worsens with larger batch sizes, (ii) selection frequency does not reliably reflect expert importance, and (iii) overall expert workload and importance can be estimated using a small calibration set.
arXiv Detail & Related papers (2026-02-23T15:11:16Z) - Deep Time-series Forecasting Needs Kernelized Moment Balancing [56.619037429652984]
Deep time-series forecasting can be formulated as a distribution balancing problem aimed at aligning the distribution of the forecasts and ground truths. We propose direct forecasting with kernelized moment balancing (KMB-DF). Experiments across multiple models and datasets show that KMB-DF consistently improves forecasting accuracy and achieves state-of-the-art performance.
arXiv Detail & Related papers (2026-01-31T13:20:18Z) - Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts [74.40169987564724]
Expert parallelism (EP) is designed to scale MoE models by distributing experts across multiple devices. Under extreme imbalance, EP can funnel a disproportionate number of tokens to a small number of experts, leading to compute- and memory-bound failures. We propose Least-Loaded Expert Parallelism (LLEP), a novel EP algorithm that dynamically reroutes excess tokens and associated expert parameters from overloaded devices to underutilized ones.
arXiv Detail & Related papers (2026-01-23T18:19:15Z) - Coherent Load Profile Synthesis with Conditional Diffusion for LV Distribution Network Scenario Generation [1.9248772611306222]
Load profiling approaches often rely on summarising demand through typical profiles. Co-behaviour between substations, which ultimately impacts higher voltage level network operation, is often overlooked. A Conditional Diffusion model for synthesising daily active and reactive power profiles at the low voltage distribution substation level is proposed.
arXiv Detail & Related papers (2025-10-13T08:40:39Z) - PowerGrow: Feasible Co-Growth of Structures and Dynamics for Power Grid Synthesis [75.14189839277928]
We present PowerGrow, a co-generative framework that significantly reduces computational overhead while maintaining operational validity. Experiments across benchmark settings show that PowerGrow outperforms prior diffusion models in fidelity and diversity. This demonstrates its ability to generate operationally valid and realistic power grid scenarios.
arXiv Detail & Related papers (2025-08-29T01:47:27Z) - CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems [62.24576366776727]
We propose a latency-aware scheduling framework to minimize total inference latency. We show that the proposed method significantly reduces cold-start latency compared to baseline strategies.
arXiv Detail & Related papers (2025-08-15T07:49:22Z) - Load Balancing for AI Training Workloads [4.6874900353446325]
We investigate the performance of various load balancing algorithms for large-scale AI training workloads that are running on dedicated infrastructure. The performance of load balancing depends on both the congestion control and loss recovery algorithms, so our evaluation also sheds light on the appropriate choices for those designs.
arXiv Detail & Related papers (2025-07-28T22:34:18Z) - Controlled Data Rebalancing in Multi-Task Learning for Real-World Image Super-Resolution [51.79973519845773]
Real-world image super-resolution (Real-SR) is a challenging problem due to the complex degradation patterns in low-resolution images. We propose an improved paradigm that frames Real-SR as a data-heterogeneous multi-task learning problem.
arXiv Detail & Related papers (2025-06-05T21:40:21Z) - Scalability Optimization in Cloud-Based AI Inference Services: Strategies for Real-Time Load Balancing and Automated Scaling [1.3689475854650441]
This study proposes a comprehensive scalability optimization framework for cloud AI inference services.
The proposed model is a hybrid approach that combines reinforcement learning for adaptive load distribution and deep neural networks for accurate demand forecasting.
Experimental results demonstrate that the proposed model enhances load balancing efficiency by 35% and reduces response delay by 28%.
arXiv Detail & Related papers (2025-04-16T04:00:04Z) - Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Reinforcement Learning-Based Adaptive Load Balancing for Dynamic Cloud Environments [0.0]
We propose a novel adaptive load balancing framework using Reinforcement Learning (RL) to address these challenges.
Our framework is designed to dynamically reallocate tasks to minimize latency and ensure balanced resource usage across servers.
Experimental results show that the proposed RL-based load balancer outperforms traditional algorithms in terms of response time, resource utilization, and adaptability to changing workloads.
arXiv Detail & Related papers (2024-09-07T19:40:48Z) - On the Role of Server Momentum in Federated Learning [85.54616432098706]
We propose a general framework for server momentum, that (a) covers a large class of momentum schemes that are unexplored in federated learning (FL)
We provide rigorous convergence analysis for the proposed framework.
arXiv Detail & Related papers (2023-12-19T23:56:49Z) - Leveraging Low-Rank and Sparse Recurrent Connectivity for Robust Closed-Loop Control [63.310780486820796]
We show how a parameterization of recurrent connectivity influences robustness in closed-loop settings.
We find that closed-form continuous-time neural networks (CfCs) with fewer parameters can outperform their full-rank, fully-connected counterparts.
arXiv Detail & Related papers (2023-10-05T21:44:18Z) - Vertical Layering of Quantized Neural Networks for Heterogeneous Inference [57.42762335081385]
We study a new vertical-layered representation of neural network weights for encapsulating all quantized models into a single one.
We can theoretically achieve any precision network for on-demand service while only needing to train and maintain one model.
arXiv Detail & Related papers (2022-12-10T15:57:38Z) - Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems [26.405495663998828]
In this work, we consider a multi-agent load balancing system, with delayed information, consisting of many clients (load balancers) and many parallel queues.
We apply policy gradient reinforcement learning algorithms to find an optimal load balancing solution.
Our approach is scalable but also shows good performance when compared to the state-of-the-art power-of-d variant of the Join-the-Shortest-Queue (JSQ)
arXiv Detail & Related papers (2022-08-09T13:47:19Z) - Learning Distributed and Fair Policies for Network Load Balancing as Markov Potential Game [4.892398873024191]
This paper investigates the network load balancing problem in data centers (DCs) where multiple load balancers (LBs) are deployed.
The challenges of this problem consist of the heterogeneous processing architecture and dynamic environments.
We formulate the multi-agent load balancing problem as a Markov potential game, with a carefully and properly designed workload distribution fairness as the potential function.
A fully distributed MARL algorithm is proposed to approximate the Nash equilibrium of the game.
arXiv Detail & Related papers (2022-06-03T08:29:02Z) - A Dynamic Residual Self-Attention Network for Lightweight Single Image Super-Resolution [17.094665593472214]
We propose a dynamic residual self-attention network (DRSAN) for lightweight single-image super-resolution (SISR).
DRSAN has dynamic residual connections based on dynamic residual attention (DRA), which adaptively changes its structure according to input statistics.
We also propose a residual self-attention (RSA) module to further boost the performance, which produces 3-dimensional attention maps without additional parameters.
arXiv Detail & Related papers (2021-12-08T06:41:21Z) - Attention-Based Model and Deep Reinforcement Learning for Distribution of Event Processing Tasks [0.0]
Event processing is a cornerstone of the dynamic and responsive Internet of Things (IoT).
This article investigates the use of deep learning to fairly distribute the tasks.
An attention-based neural network model is proposed to generate efficient load balancing solutions.
arXiv Detail & Related papers (2021-12-07T17:16:35Z) - Federated Learning with Unreliable Clients: Performance Analysis and Mechanism Design [76.29738151117583]
Federated Learning (FL) has become a promising tool for training effective machine learning models among distributed clients.
However, low quality models could be uploaded to the aggregator server by unreliable clients, leading to a degradation or even a collapse of training.
We model these unreliable behaviors of clients and propose a defensive mechanism to mitigate such a security risk.
arXiv Detail & Related papers (2021-05-10T08:02:27Z) - Probabilistic electric load forecasting through Bayesian Mixture Density Networks [70.50488907591463]
Probabilistic load forecasting (PLF) is a key component in the extended tool-chain required for efficient management of smart energy grids.
We propose a novel PLF approach, framed on Bayesian Mixture Density Networks.
To achieve reliable and computationally scalable estimators of the posterior distributions, both Mean Field variational inference and deep ensembles are integrated.
arXiv Detail & Related papers (2020-12-23T16:21:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.