Related papers: Fully Distributed Fog Load Balancing with Multi-Agent Reinforcement Learning

URL: http://arxiv.org/abs/2405.12236v1
Date: Wed, 15 May 2024 23:44:06 GMT
Title: Fully Distributed Fog Load Balancing with Multi-Agent Reinforcement Learning
Authors: Maad Ebrahim, Abdelhakim Hafid
Abstract summary: This paper proposes a fully distributed load-balancing solution with Multi-Agent Reinforcement Learning (MARL). MARL agents use transfer learning for life-long self-adaptation to dynamic changes in the environment. We analyze the impact of observing the state of the environment at a realistic frequency, unlike the common but unrealistic assumption in the literature that observations are readily available in real time for every required action.
Score: 1.9643748953805935
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Real-time Internet of Things (IoT) applications require real-time support to handle the ever-growing demand for computing resources to process IoT workloads. Fog Computing provides high availability of such resources in a distributed manner. However, these resources must be efficiently managed to distribute unpredictable traffic demands among heterogeneous Fog resources. This paper proposes a fully distributed load-balancing solution with Multi-Agent Reinforcement Learning (MARL) that intelligently distributes IoT workloads to optimize the waiting time while providing fair resource utilization in the Fog network. These agents use transfer learning for life-long self-adaptation to dynamic changes in the environment. By leveraging distributed decision-making, MARL agents effectively minimize the waiting time compared to a single centralized agent solution and other baselines, enhancing end-to-end execution delay. Besides performance gain, a fully distributed solution allows for a global-scale implementation where agents can work independently in small collaboration regions, leveraging nearby local resources. Furthermore, we analyze the impact of a realistic frequency to observe the state of the environment, unlike the unrealistic common assumption in the literature of having observations readily available in real-time for every required action. The findings highlight the trade-off between realism and performance using an interval-based Gossip-based multi-casting protocol against assuming real-time observation availability for every generated workload.
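The trade-off between observation frequency and waiting time described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's MARL agent or its Gossip protocol: it assumes a single greedy dispatcher, identical Fog nodes, one arriving workload per step, and a hypothetical per-step service probability. The function name `simulate` and all of its parameters are illustrative assumptions.

```python
import random


def simulate(observe_every, steps=2000, nodes=4, service_prob=0.3, seed=0):
    """Return the mean true queue length met by arriving workloads.

    observe_every: number of steps between refreshes of the dispatcher's
    view of the queues (1 means a fresh observation for every workload).
    All parameters are hypothetical; this is a toy model, not the paper's setup.
    """
    rng = random.Random(seed)
    queues = [0] * nodes        # true queue length at each node
    observed = list(queues)     # dispatcher's possibly stale view
    total_wait = 0

    for t in range(steps):
        # Interval-based observation: refresh the view only every few steps.
        if t % observe_every == 0:
            observed = list(queues)

        # Greedy stand-in for a learned policy: pick the node whose
        # observed queue is shortest.
        target = min(range(nodes), key=lambda i: observed[i])
        total_wait += queues[target]   # jobs already ahead of this workload
        queues[target] += 1

        # Each busy node completes one job with probability service_prob.
        for i in range(nodes):
            if queues[i] > 0 and rng.random() < service_prob:
                queues[i] -= 1

    return total_wait / steps


if __name__ == "__main__":
    for interval in (1, 10, 50):
        print(interval, simulate(interval))
```

With a long interval the dispatcher keeps sending work to a node that only looked idle at the last refresh, so the queue it actually joins grows. That is the kind of cost a stale, interval-based view introduces in this toy model.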

Related papers

Load Balancing in Federated Learning [3.2999744336237384]
Federated Learning (FL) is a decentralized machine learning framework that enables learning from data distributed across multiple remote devices. This paper proposes a load metric for scheduling policies based on the Age of Information. We establish the optimal parameters of the Markov chain model and validate our approach through simulations.
arXiv Detail & Related papers (2024-08-01T00:56:36Z)
Communication-Efficient Training Workload Balancing for Decentralized Multi-Agent Learning [20.683081355473664]
Decentralized Multi-agent Learning (DML) enables collaborative model training while preserving data privacy. ComDML balances workload among agents through a decentralized approach. ComDML can significantly reduce the overall training time while maintaining model accuracy, compared to state-of-the-art methods.
arXiv Detail & Related papers (2024-05-01T20:03:37Z)
Lifelong Learning for Fog Load Balancing: A Transfer Learning Approach [0.7366405857677226]
We improve the performance of privacy-aware Reinforcement Learning (RL) agents that optimize the execution delay of IoT applications by minimizing the waiting delay. We propose a lifelong learning framework for these agents, where lightweight inference models are used during deployment to minimize action delay and only retrained in case of significant environmental changes.
arXiv Detail & Related papers (2023-10-08T14:49:33Z)
Adaptive Resource Allocation for Virtualized Base Stations in O-RAN with Online Learning [60.17407932691429]
Open Radio Access Network systems, with their virtualized base stations (vBSs), offer operators the benefits of increased flexibility, reduced costs, vendor diversity, and interoperability. We propose an online learning algorithm that balances the effective throughput and vBS energy consumption, even under unforeseeable and "challenging" environments. We prove the proposed solutions achieve sub-linear regret, providing zero average optimality gap even in challenging environments.
arXiv Detail & Related papers (2023-09-04T17:30:21Z)
Optimization of Image Transmission in a Cooperative Semantic Communication Networks [68.2233384648671]
A semantic communication framework for image transmission is developed. Servers cooperatively transmit images to a set of users utilizing semantic communication techniques. A multimodal metric is proposed to measure the correlation between the extracted semantic information and the original image.
arXiv Detail & Related papers (2023-01-01T15:59:13Z)
Locality Matters: A Scalable Value Decomposition Approach for Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents. We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z)
Fast-Convergent Dynamics for Distributed Resource Allocation Over Sparse Time-Varying Networks [8.830479021890577]
In this paper, distributed dynamics are deployed to solve resource allocation over time-varying multi-agent networks. The state of each agent represents the amount of resources used/produced at that agent while the total amount of resources is fixed. This is motivated by distributed applications such as in mobile edge-computing, economic dispatch over smart grids, and multi-agent coverage control.
arXiv Detail & Related papers (2020-12-15T09:57:54Z)
Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as diffusion-based MAML or Dif-MAML. We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML. Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
Pollux: Co-adaptive Cluster Scheduling for Goodput-Optimized Deep Learning [61.29990368322931]
Pollux improves scheduling performance in deep learning (DL) clusters by adaptively co-optimizing inter-dependent factors. Pollux reduces average job completion times by 37-50% relative to state-of-the-art DL schedulers.
arXiv Detail & Related papers (2020-08-27T16:56:48Z)
Distributed Resource Scheduling for Large-Scale MEC Systems: A Multi-Agent Ensemble Deep Reinforcement Learning with Imitation Acceleration [44.40722828581203]
We propose a distributed intelligent resource scheduling (DIRS) framework, which includes centralized training relying on the global information and distributed decision making by each agent deployed in each MEC server. We first introduce a novel multi-agent ensemble-assisted distributed deep reinforcement learning (DRL) architecture, which can simplify the overall neural network structure of each agent. Secondly, we apply action refinement to enhance the exploration ability of the proposed DIRS framework, where the near-optimal state-action pairs are obtained by a novel Lévy flight search.
arXiv Detail & Related papers (2020-05-21T20:04:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.