Asynchronous MultiAgent Reinforcement Learning for 5G Routing under Side Constraints
- URL: http://arxiv.org/abs/2602.00035v1
- Date: Sun, 18 Jan 2026 18:38:37 GMT
- Title: Asynchronous MultiAgent Reinforcement Learning for 5G Routing under Side Constraints
- Authors: Sebastian Racedo, Brigitte Jaumard, Oscar Delgado, Meysam Masoudi,
- Abstract summary: We propose an asynchronous multi-agent reinforcement learning framework in which independent PPO agents plan routes in parallel and commit resource deltas to a shared global resource environment. We evaluate the method on an O-RAN-like network simulation using near-real-time traffic data from the city of Montreal. AMARL achieves a similar Grade of Service (GoS) and end-to-end latency, with reduced training wall-clock time and improved robustness to demand shifts.
- Score: 1.0732935873226022
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Networks in current 5G and beyond systems increasingly carry heterogeneous traffic with diverse quality-of-service constraints, making real-time routing decisions both complex and time-critical. Common approaches, such as heuristics with human intervention, training a single centralized RL policy, or synchronizing updates across multiple learners, struggle with scalability and straggler effects. We address this by proposing an asynchronous multi-agent reinforcement learning (AMARL) framework in which independent PPO agents, one per service, plan routes in parallel and commit resource deltas to a shared global resource environment. This coordination by state preserves feasibility across services and enables specialization for service-specific objectives. We evaluate the method on an O-RAN-like network simulation using near-real-time traffic data from the city of Montreal, comparing against a single-agent PPO baseline. AMARL achieves a similar Grade of Service (GoS, i.e., acceptance rate) and end-to-end latency, with reduced training wall-clock time and improved robustness to demand shifts. These results suggest that asynchronous, service-specialized agents provide a scalable and practical approach to distributed routing, with applicability extending beyond the O-RAN domain.
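The "coordination by state" idea in the abstract, where independent per-service agents plan in parallel and atomically commit resource deltas to a shared global state, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the class names, the admission rule, and the placeholder random-order policy (standing in for a trained PPO policy) are all assumptions.

```python
import threading
import random

class SharedResourceEnv:
    """Global link-capacity state shared by all service agents (illustrative)."""
    def __init__(self, capacity):
        self._capacity = dict(capacity)   # link -> remaining capacity
        self._lock = threading.Lock()

    def commit(self, deltas):
        """Atomically apply {link: demand} only if every touched link stays feasible."""
        with self._lock:
            if any(self._capacity[link] < d for link, d in deltas.items()):
                return False              # reject: the route would overbook a link
            for link, d in deltas.items():
                self._capacity[link] -= d
            return True

class ServiceAgent(threading.Thread):
    """One agent per service; a real system would query a PPO policy here."""
    def __init__(self, env, name, candidate_routes, demand, results):
        super().__init__()
        self.env, self.name = env, name
        self.routes, self.demand, self.results = candidate_routes, demand, results

    def run(self):
        # Placeholder policy: try candidate routes in a random order and
        # keep the first one the shared environment accepts.
        for route in random.sample(self.routes, len(self.routes)):
            deltas = {link: self.demand for link in route}
            if self.env.commit(deltas):
                self.results[self.name] = route
                return
        self.results[self.name] = None    # request blocked (counts against GoS)

env = SharedResourceEnv({"a-b": 10, "b-c": 10, "a-c": 5})
results = {}
agents = [
    ServiceAgent(env, "eMBB", [["a-b", "b-c"], ["a-c"]], demand=6, results=results),
    ServiceAgent(env, "URLLC", [["a-c"], ["a-b", "b-c"]], demand=4, results=results),
]
for a in agents:
    a.start()
for a in agents:
    a.join()
print(results)
```

Because each agent only reads and writes the shared capacity table through an atomic `commit`, no agent ever needs to wait for another agent's policy update, which is the asynchrony property the abstract contrasts with synchronized multi-learner training.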
Related papers
- Astraea: A State-Aware Scheduling Engine for LLM-Powered Agents [12.884297990127985]
Astraea is a service engine designed to shift the optimization from local segments to the global request lifecycle. It employs a state-aware, hierarchical scheduling algorithm that integrates a request's historical state with future predictions. Astraea reduces average JCT by up to 25.5% compared to baseline methods.
arXiv Detail & Related papers (2025-12-16T06:55:10Z) - A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services [18.675072317045466]
Most existing network control solutions target only average delay performance, falling short of providing strict End-to-End (E2E) peak latency guarantees. This paper addresses the challenge of reliably delivering packets within application-imposed deadlines by leveraging recent advancements in Multi-Agent Deep Reinforcement Learning (MA-DRL). We present a novel MA-DRL network control framework that leverages a centralized routing and distributed scheduling architecture.
arXiv Detail & Related papers (2025-10-13T15:38:10Z) - AgentRouter: A Knowledge-Graph-Guided LLM Router for Collaborative Multi-Agent Question Answering [51.07491603393163]
AgentRouter is a framework that formulates multi-agent QA as a knowledge-graph-guided routing problem supervised by empirical performance signals. By leveraging soft supervision and weighted aggregation of agent outputs, AgentRouter learns principled collaboration schemes that capture the complementary strengths of diverse agents.
arXiv Detail & Related papers (2025-10-06T23:20:49Z) - Accelerating Vehicle Routing via AI-Initialized Genetic Algorithms [53.75036695728983]
Vehicle Routing Problems (VRP) are a fundamental NP-hard challenge in combinatorial optimization. We introduce an optimization framework in which a reinforcement learning agent is trained on prior instances and quickly generates initial solutions. This framework consistently outperforms current state-of-the-art solvers across various time budgets.
arXiv Detail & Related papers (2025-04-08T15:21:01Z) - Toward Dependency Dynamics in Multi-Agent Reinforcement Learning for Traffic Signal Control [8.312659530314937]
Reinforcement learning (RL) emerges as a promising data-driven approach for adaptive traffic signal control. In this paper, we propose a novel Dynamic Reinforcement Update Strategy for Deep Q-Network (DQN-DPUS). We show that the proposed strategy can speed up the convergence rate without sacrificing optimal exploration.
arXiv Detail & Related papers (2025-02-23T15:29:12Z) - Optimization of Image Transmission in a Cooperative Semantic Communication Networks [68.2233384648671]
A semantic communication framework for image transmission is developed.
Servers cooperatively transmit images to a set of users utilizing semantic communication techniques.
A multimodal metric is proposed to measure the correlation between the extracted semantic information and the original image.
arXiv Detail & Related papers (2023-01-01T15:59:13Z) - BSAC-CoEx: Coexistence of URLLC and Distributed Learning Services via Device Selection [46.59702442756128]
High-priority ultra-reliable low latency communication (URLLC) and low-priority distributed learning services run concurrently over a network. We formulate this problem as a Markov decision process and address it via BSAC-CoEx, a framework based on the branching soft actor-critic (BSAC) algorithm. Our solution can significantly decrease the training delays of the distributed learning service while keeping the URLLC availability above its required threshold.
arXiv Detail & Related papers (2022-12-22T15:36:15Z) - Differentiated Federated Reinforcement Learning Based Traffic Offloading on Space-Air-Ground Integrated Networks [12.080548048901374]
This paper proposes the use of differentiated federated reinforcement learning (DFRL) to solve the traffic offloading problem in SAGIN.
Considering the differentiated characteristics of each region of SAGIN, DFRL models the traffic offloading policy optimization process.
The paper proposes a novel Differentiated Federated Soft Actor-Critic (DFSAC) algorithm to solve the problem.
arXiv Detail & Related papers (2022-12-05T07:40:29Z) - Artificial Intelligence Empowered Multiple Access for Ultra Reliable and Low Latency THz Wireless Networks [76.89730672544216]
Terahertz (THz) wireless networks are expected to catalyze the beyond fifth generation (B5G) era.
To satisfy the ultra-reliability and low-latency demands of several B5G applications, novel mobility management approaches are required.
This article presents a holistic MAC layer approach that enables intelligent user association and resource allocation, as well as flexible and adaptive mobility management.
arXiv Detail & Related papers (2022-08-17T03:00:24Z) - Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless Cellular Networks [82.02891936174221]
Collaborative deep reinforcement learning (CDRL) algorithms in which multiple agents can coordinate over a wireless network is a promising approach.
In this paper, a novel semantic-aware CDRL method is proposed to enable a group of untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network.
arXiv Detail & Related papers (2021-11-23T18:24:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.