A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services
- URL: http://arxiv.org/abs/2510.11535v1
- Date: Mon, 13 Oct 2025 15:38:10 GMT
- Title: A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services
- Authors: Vincenzo Norman Vitale, Antonia Maria Tulino, Andreas F. Molisch, Jaime Llorca,
- Abstract summary: Most existing network control solutions target only average delay performance, falling short of providing strict End-to-End (E2E) peak latency guarantees.<n>This paper addresses the challenge of reliably delivering packets within application-imposed deadlines by leveraging recent advancements in Multi-Agent Deep Reinforcement Learning (MA-DRL)<n>We present a novel MA-DRL network control framework that leverages a centralized routing and distributed scheduling architecture.
- Score: 18.675072317045466
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Timely delivery of delay-sensitive information over dynamic, heterogeneous networks is increasingly essential for a range of interactive applications, such as industrial automation, self-driving vehicles, and augmented reality. However, most existing network control solutions target only average delay performance, falling short of providing strict End-to-End (E2E) peak latency guarantees. This paper addresses the challenge of reliably delivering packets within application-imposed deadlines by leveraging recent advancements in Multi-Agent Deep Reinforcement Learning (MA-DRL). After introducing the Delay-Constrained Maximum-Throughput (DCMT) dynamic network control problem, and highlighting the limitations of current solutions, we present a novel MA-DRL network control framework that leverages a centralized routing and distributed scheduling architecture. The proposed framework leverages critical networking domain knowledge for the design of effective MA-DRL strategies based on the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) technique, where centralized routing and distributed scheduling agents dynamically assign paths and schedule packet transmissions according to packet lifetimes, thereby maximizing on-time packet delivery. The generality of the proposed framework allows integrating both data-driven \blue{Deep Reinforcement Learning (DRL)} agents and traditional rule-based policies in order to strike the right balance between performance and learning complexity. Our results confirm the superiority of the proposed framework with respect to traditional stochastic optimization-based approaches and provide key insights into the role and interplay between data-driven DRL agents and new rule-based policies for both efficient and high-performance control of latency-critical services.
Related papers
- Safe Multi-Agent Deep Reinforcement Learning for Privacy-Aware Edge-Device Collaborative DNN Inference [8.14391361533752]
This paper proposes a privacy-aware collaborative inference framework, in which adaptive model partitioning is performed across edge devices and servers.<n>We formulate the joint problem as a Constrained Markov Decision Process (CMDP) that integrates model deployment, user-server association, model partitioning, and resource allocation.<n>We show that HC-MAPPO-L consistently satisfies stringent delay constraints while achieving a superior balance among energy consumption and privacy cost.
arXiv Detail & Related papers (2026-02-23T11:33:52Z) - Beamforming and Resource Allocation for Delay Minimization in RIS-Assisted OFDM Systems [38.71413228444903]
This paper investigates a joint beamforming and resource allocation problem in downlink reconfigurable intelligent surface (RIS)-assisted OFDM systems.<n>To effectively handle the mixed action space and reduce the state space dimensionality, a hybrid deep reinforcement learning (DRL) approach is proposed.<n>The proposed algorithm significantly reduces the average delay, enhances resource allocation efficiency, and achieves superior system robustness and fairness.
arXiv Detail & Related papers (2025-06-04T05:33:33Z) - Adaptive routing protocols for determining optimal paths in AI multi-agent systems: a priority- and learning-enhanced approach [0.0]
This paper introduces an enhanced, adaptive routing tailored for AI multi-agent networks.<n>We incorporate multi-faceted parameters such as task complexity, user request priority, agent capabilities, bandwidth, latency, load, model sophistication, and reliability.
arXiv Detail & Related papers (2025-03-10T13:16:54Z) - Design Optimization of NOMA Aided Multi-STAR-RIS for Indoor Environments: A Convex Approximation Imitated Reinforcement Learning Approach [51.63921041249406]
Non-orthogonal multiple access (NOMA) enables multiple users to share the same frequency band, and simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS)
deploying STAR-RIS indoors presents challenges in interference mitigation, power consumption, and real-time configuration.
A novel network architecture utilizing multiple access points (APs), STAR-RISs, and NOMA is proposed for indoor communication.
arXiv Detail & Related papers (2024-06-19T07:17:04Z) - Intent-Aware DRL-Based NOMA Uplink Dynamic Scheduler for IIoT [30.146175299047325]
We investigate the problem of supporting Industrial Internet of Things user equipment (IIoT UEs) with intent (i.e., requested quality of service (QoS)) and random traffic arrival.<n>A deep reinforcement learning (DRL) based centralized dynamic scheduler for time-frequency resources is proposed to learn how to schedule the available communication resources.
arXiv Detail & Related papers (2024-03-27T08:57:15Z) - Constrained Reinforcement Learning for Adaptive Controller Synchronization in Distributed SDN [7.277944770202078]
This work focuses on examining deep reinforcement learning (DRL) techniques, encompassing both value-based and policy-based methods, to guarantee an upper latency threshold for AR/VR task offloading.
Our evaluation results indicate that while value-based methods excel in optimizing individual network metrics such as latency or load balancing, policy-based approaches exhibit greater robustness in adapting to sudden network changes or reconfiguration.
arXiv Detail & Related papers (2024-01-21T21:57:22Z) - Guaranteed Dynamic Scheduling of Ultra-Reliable Low-Latency Traffic via
Conformal Prediction [72.59079526765487]
The dynamic scheduling of ultra-reliable and low-latency traffic (URLLC) in the uplink can significantly enhance the efficiency of coexisting services.
The main challenge is posed by the uncertainty in the process of URLLC packet generation.
We introduce a novel scheduler for URLLC packets that provides formal guarantees on reliability and latency irrespective of the quality of the URLLC traffic predictor.
arXiv Detail & Related papers (2023-02-15T14:09:55Z) - Distributed-Training-and-Execution Multi-Agent Reinforcement Learning
for Power Control in HetNet [48.96004919910818]
We propose a multi-agent deep reinforcement learning (MADRL) based power control scheme for the HetNet.
To promote cooperation among agents, we develop a penalty-based Q learning (PQL) algorithm for MADRL systems.
In this way, an agent's policy can be learned by other agents more easily, resulting in a more efficient collaboration process.
arXiv Detail & Related papers (2022-12-15T17:01:56Z) - Artificial Intelligence Empowered Multiple Access for Ultra Reliable and
Low Latency THz Wireless Networks [76.89730672544216]
Terahertz (THz) wireless networks are expected to catalyze the beyond fifth generation (B5G) era.
To satisfy the ultra-reliability and low-latency demands of several B5G applications, novel mobility management approaches are required.
This article presents a holistic MAC layer approach that enables intelligent user association and resource allocation, as well as flexible and adaptive mobility management.
arXiv Detail & Related papers (2022-08-17T03:00:24Z) - Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless
Cellular Networks [82.02891936174221]
Collaborative deep reinforcement learning (CDRL) algorithms in which multiple agents can coordinate over a wireless network is a promising approach.
In this paper, a novel semantic-aware CDRL method is proposed to enable a group of untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network.
arXiv Detail & Related papers (2021-11-23T18:24:47Z) - Better than the Best: Gradient-based Improper Reinforcement Learning for
Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.