Multi-Agent Q-Learning for Real-Time Load Balancing User Association and Handover in Mobile Networks
- URL: http://arxiv.org/abs/2412.19835v1
- Date: Sun, 22 Dec 2024 11:22:01 GMT
- Title: Multi-Agent Q-Learning for Real-Time Load Balancing User Association and Handover in Mobile Networks
- Authors: Alireza Alizadeh, Byungju Lim, Mai Vu
- Abstract summary: We propose multi-agent online Q-learning (QL) algorithms for performing real-time load balancing user association and handover in dense cellular networks.
We propose two multi-agent action selection policies, one centralized and one distributed, to satisfy load balancing at every learning step.
We show that both policies adapt well to network dynamics at various UE speed profiles.
- Score: 16.107256745452933
- Abstract: As next-generation cellular networks become denser, associating users with the optimal base stations at each time while ensuring no base station is overloaded becomes critical for achieving stable and high network performance. We propose multi-agent online Q-learning (QL) algorithms for performing real-time load balancing user association and handover in dense cellular networks. The load balancing constraints at all base stations couple the actions of user agents, and we propose two multi-agent action selection policies, one centralized and one distributed, to satisfy load balancing at every learning step. In the centralized policy, the actions of UEs are determined by a central load balancer (CLB) running an algorithm based on swapping the worst connection to maximize the total learning reward. In the distributed policy, each UE takes an action based on its local information by participating in a distributed matching game with the BSs to maximize the local reward. We then integrate these action selection policies into an online QL algorithm that adapts in real time to network dynamics including channel variations and user mobility, using a reward function that considers a handover cost to reduce handover frequency. The proposed multi-agent QL algorithm features low complexity and fast convergence, outperforming 3GPP max-SINR association. Both policies adapt well to network dynamics at various UE speed profiles, from walking and running to biking and suburban driving, illustrating their robustness and real-time adaptability.
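The abstract outlines the learning loop but this summary gives no pseudocode. Below is a minimal, illustrative Python sketch of the two ingredients it describes: a per-UE tabular Q-update whose reward subtracts a handover cost, and a greedy stand-in for the load-balanced association step. The paper's CLB swap algorithm and distributed matching game are more elaborate; the reward form, class names, and parameters here are assumptions, not the authors' implementation.

```python
import numpy as np

class UEAgent:
    """One tabular Q-learning agent per UE; the state is kept implicit, so Q
    reduces to one value per candidate base station (BS)."""

    def __init__(self, num_bs, alpha=0.1, gamma=0.9, epsilon=0.1, ho_cost=0.5):
        self.q = np.zeros(num_bs)          # Q-value per BS
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.ho_cost = epsilon, ho_cost
        self.prev_bs = None                # BS serving this UE in the previous step

    def preference(self):
        """BS indices from most to least preferred (epsilon-greedy exploration)."""
        if np.random.rand() < self.epsilon:
            return np.random.permutation(len(self.q))
        return np.argsort(-self.q)

    def update(self, bs, rate):
        """Online Q-update; reward = achieved rate minus a handover penalty
        whenever the serving BS changed (assumed reward form)."""
        ho = self.ho_cost if (self.prev_bs is not None and bs != self.prev_bs) else 0.0
        reward = rate - ho
        self.q[bs] += self.alpha * (reward + self.gamma * self.q.max() - self.q[bs])
        self.prev_bs = bs


def load_balanced_association(agents, num_bs, capacity):
    """Greedy stand-in for the centralized load balancer (CLB): each UE gets its
    most-preferred BS that still has headroom, so no BS exceeds `capacity` UEs.
    The paper's CLB additionally swaps the worst connection to raise total reward."""
    load = np.zeros(num_bs, dtype=int)
    assignment = {}
    for u, agent in enumerate(agents):
        for b in agent.preference():
            if load[b] < capacity:
                assignment[u] = int(b)
                load[b] += 1
                break
    return assignment
```

A driver loop would, at each step, call load_balanced_association, observe every UE's achieved rate on its assigned BS, and call update, which is the online adaptation to channel variations and mobility that the abstract describes.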
Related papers
- Cluster-Based Multi-Agent Task Scheduling for Space-Air-Ground Integrated Networks [60.085771314013044]
The low-altitude economy holds significant potential for development in areas such as communication and sensing.
We propose a Clustering-based Multi-agent Deep Deterministic Policy Gradient (CMADDPG) algorithm to address the multi-UAV cooperative task scheduling challenges in SAGIN.
arXiv Detail & Related papers (2024-12-14T06:17:33Z)
- Mobility-Aware Joint User Scheduling and Resource Allocation for Low Latency Federated Learning [14.343345846105255]
We propose a practical model for user mobility in federated learning systems.
We develop a user scheduling and resource allocation method to minimize the training delay with constrained communication resources.
Specifically, we first formulate an optimization problem with user mobility that jointly considers user selection, BS assignment to users, and bandwidth allocation.
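As a rough illustration of the objective such a formulation targets (the delay model, symbols, and function names below are assumptions, not taken from that paper), the per-round training delay is governed by the slowest scheduled user:

```python
import numpy as np

def round_delay(selected_users, bs_of_user, bandwidth_hz, compute_s, model_bits, snr):
    """Per-round FL delay under a simple model: each selected user computes
    locally, then uploads the model over its allocated bandwidth; the round
    finishes when the slowest user does. Illustrative only."""
    delays = []
    for u in selected_users:
        rate = bandwidth_hz[u] * np.log2(1.0 + snr[u, bs_of_user[u]])   # bit/s
        delays.append(compute_s[u] + model_bits / rate)
    return max(delays)
```

A scheduler in the spirit of the cited paper would then search over the user selection, BS assignment, and bandwidth split, subject to per-BS bandwidth budgets and the mobility model, to minimize this quantity.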
arXiv Detail & Related papers (2023-07-18T13:48:05Z)
- Distributed-Training-and-Execution Multi-Agent Reinforcement Learning for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
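The exact PQL update rule is defined in the cited paper; a toy sketch of the general idea, shaping the reward with a penalty for actions that hurt neighboring agents before a standard Q-learning update, might look like this (the penalty form and all names are assumptions):

```python
import numpy as np

def penalized_q_update(q, state, action, reward, next_state,
                       interference_penalty, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning target, with the reward reduced by a penalty
    reflecting the interference this agent's power choice causes to others."""
    shaped_reward = reward - interference_penalty
    td_target = shaped_reward + gamma * np.max(q[next_state])
    q[state, action] += alpha * (td_target - q[state, action])
    return q
```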
arXiv Detail & Related papers (2022-12-15T17:01:56Z)
- Decentralized Federated Reinforcement Learning for User-Centric Dynamic TFDD Control [37.54493447920386]
We propose a learning-based dynamic time-frequency division duplexing (D-TFDD) scheme to meet asymmetric and heterogeneous traffic demands.
We formulate the problem as a decentralized partially observable Markov decision process (Dec-POMDP).
In order to jointly optimize the global resources in a decentralized manner, we propose a federated reinforcement learning (RL) algorithm, federated Wolpertinger deep deterministic policy gradient (FWDDPG).
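FWDDPG builds on the Wolpertinger architecture for large discrete action spaces. A minimal sketch of that action-selection step follows; the federated aggregation and the actor/critic networks themselves are omitted, and all names here are illustrative.

```python
import numpy as np

def wolpertinger_select(proto_action, discrete_actions, q_fn, k=5):
    """Wolpertinger-style action selection: map the actor's continuous
    proto-action to its k nearest discrete actions, then let the critic
    (q_fn) pick the highest-valued candidate."""
    dists = np.linalg.norm(discrete_actions - proto_action, axis=1)
    candidates = np.argsort(dists)[:k]
    q_values = [q_fn(discrete_actions[i]) for i in candidates]
    return discrete_actions[candidates[int(np.argmax(q_values))]]
```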
arXiv Detail & Related papers (2022-11-04T07:39:21Z)
- Dynamic Multichannel Access via Multi-agent Reinforcement Learning: Throughput and Fairness Guarantees [9.615742794292943]
We propose a distributed multichannel access protocol based on multi-agent reinforcement learning (RL).
Unlike the previous approaches adjusting channel access probabilities at each time slot, the proposed RL algorithm deterministically selects a set of channel access policies for several consecutive time slots.
We perform extensive simulations on realistic traffic environments and demonstrate that the proposed online learning improves both throughput and fairness.
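A sketch of the blockwise selection the summary describes, committing to one channel-access policy for several consecutive slots instead of re-drawing access probabilities every slot; the policy set, scoring, and block length here are assumptions.

```python
import numpy as np

def select_policy_block(policy_values, block_len=10, epsilon=0.05):
    """Pick one channel-access policy (e.g., 'stay on channel k' or a fixed
    hopping pattern) and apply it for the next `block_len` slots."""
    if np.random.rand() < epsilon:
        chosen = np.random.randint(len(policy_values))   # occasional exploration
    else:
        chosen = int(np.argmax(policy_values))           # greedy w.r.t. learned values
    return chosen, block_len
```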
arXiv Detail & Related papers (2021-05-10T02:32:57Z)
- A Machine Learning Approach for Task and Resource Allocation in Mobile Edge Computing Based Networks [108.57859531628264]
A joint task, spectrum, and transmit power allocation problem is investigated for a wireless network.
The proposed algorithm can reduce the number of iterations needed for convergence and the maximal delay among all users by up to 18% and 11.1%, respectively, compared to the standard Q-learning algorithm.
arXiv Detail & Related papers (2020-07-20T13:46:42Z)
- Multi-Agent Routing Value Iteration Network [88.38796921838203]
We propose a graph neural network based model that is able to perform multi-agent routing based on learned value in a sparsely connected graph.
We show that our model trained with only two agents on graphs with a maximum of 25 nodes can easily generalize to situations with more agents and/or nodes.
arXiv Detail & Related papers (2020-07-09T22:16:45Z)
- Federated Learning for Task and Resource Allocation in Wireless High Altitude Balloon Networks [160.96150373385768]
The problem of minimizing energy and time consumption for task computation and transmission is studied in a mobile edge computing (MEC)-enabled balloon network.
A support vector machine (SVM)-based federated learning (FL) algorithm is proposed to determine the user association proactively.
The proposed SVM-based FL method enables each HAB to cooperatively build an SVM model that can determine all user associations.
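A rough sketch of that idea: each HAB fits a linear SVM on its own users and the coefficients are averaged across HABs so all share one association predictor. This is a generic federated-averaging stand-in, not the paper's exact algorithm, and it assumes every HAB observes the same label set.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def fit_local_svm(user_features, association_labels):
    """Each HAB trains a linear SVM (hinge loss) on its local (feature, label) data."""
    model = SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3)
    model.fit(user_features, association_labels)
    return model

def federated_average(local_models):
    """Average the linear SVM parameters across HABs to form the shared model."""
    coef = np.mean([m.coef_ for m in local_models], axis=0)
    intercept = np.mean([m.intercept_ for m in local_models], axis=0)
    return coef, intercept
```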
arXiv Detail & Related papers (2020-03-19T14:18:25Z)
- Multiple Access in Dynamic Cell-Free Networks: Outage Performance and Deep Reinforcement Learning-Based Design [24.632250413917816]
In future cell-free (or cell-less) wireless networks, a large number of devices in a geographical area will be served simultaneously by a large number of distributed access points (APs).
We propose a novel dynamic cell-free network architecture to reduce the complexity of joint processing of users' signals in the presence of a large number of devices and APs.
In our system setting, the proposed DDPG-DDQN scheme is found to achieve around 78% of the rate achievable through an exhaustive search-based design.
arXiv Detail & Related papers (2020-01-29T03:00:22Z)
- Reinforcement Learning Based Vehicle-cell Association Algorithm for Highly Mobile Millimeter Wave Communication [53.47785498477648]
This paper investigates the problem of vehicle-cell association in millimeter wave (mmWave) communication networks.
We first formulate the vehicle user (VU) association problem as a discrete non-convex optimization problem.
The proposed solution achieves up to 15% gains in terms of sum rate and a 20% reduction in VUE outages compared to several baseline designs.
arXiv Detail & Related papers (2020-01-22T08:51:05Z)