Related papers: A Constrained RL Approach for Cost-Efficient Delivery of Latency-Sensitive Applications

URL: http://arxiv.org/abs/2603.04353v1
Date: Wed, 04 Mar 2026 18:19:35 GMT
Title: A Constrained RL Approach for Cost-Efficient Delivery of Latency-Sensitive Applications
Authors: Ozan Aygün, Vincenzo Norman Vitale, Antonia M. Tulino, Hao Feng, Elza Erkip, Jaime Llorca
Abstract summary: Next-generation networks aim to provide performance guarantees to real-time interactive services. The goal is to reliably deliver packets with strict deadlines imposed by the application.
Score: 16.03353922224779
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Next-generation networks aim to provide performance guarantees to real-time interactive services that require timely and cost-efficient packet delivery. In this context, the goal is to reliably deliver packets with strict deadlines imposed by the application while minimizing overall resource allocation cost. A large body of work has leveraged stochastic optimization techniques to design efficient dynamic routing and scheduling solutions under average delay constraints; however, these methods fall short when faced with strict per-packet delay requirements. We formulate the minimum-cost delay-constrained network control problem as a constrained Markov decision process and utilize constrained deep reinforcement learning (CDRL) techniques to effectively minimize total resource allocation cost while maintaining timely throughput above a target reliability level. Results indicate that the proposed CDRL-based solution can ensure timely packet delivery even when existing baselines fall short, and it achieves lower cost compared to other throughput-maximizing methods.
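
The constrained formulation described in the abstract can be sketched generically as follows. This is an illustrative reading only, not the paper's own notation: the symbols $\pi$ (control policy), $c(s_t, a_t)$ (per-step resource allocation cost), $R(\pi)$ (timely throughput, i.e. the fraction of packets delivered within their deadline), $\gamma$ (target reliability level), and $\lambda$ (Lagrange multiplier) are all assumptions introduced here. The Lagrangian relaxation shown is a common way to handle constrained Markov decision processes and may differ from the CDRL technique the authors actually use.

```latex
% Generic constrained MDP: minimize long-run average cost subject to a reliability constraint
\min_{\pi} \; C(\pi) = \lim_{T \to \infty} \frac{1}{T} \,
    \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{T-1} c(s_t, a_t) \right]
\quad \text{s.t.} \quad R(\pi) \ge \gamma

% Assumed Lagrangian relaxation with multiplier \lambda \ge 0
\mathcal{L}(\pi, \lambda) = C(\pi) + \lambda \bigl( \gamma - R(\pi) \bigr),
\qquad
\max_{\lambda \ge 0} \; \min_{\pi} \; \mathcal{L}(\pi, \lambda)
```

Under this reading, the multiplier grows while the timely throughput falls below the target, pushing the policy toward reliability, and shrinks once the constraint is met, letting cost minimization dominate.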

Related papers

iScheduler: Reinforcement Learning-Driven Continual Optimization for Large-Scale Resource Investment Problems [30.109981943437006]
Scheduling precedence-constrained tasks under shared renewable resources is central to modern computing platforms. We present iScheduler, a reinforcement-learning-driven iterative scheduling framework. Experiments show that iScheduler attains competitive resource costs while reducing time to feasibility by up to 43× against strong commercial baselines.
arXiv Detail & Related papers (2026-01-30T11:20:58Z)
A Flexible Multi-Agent Deep Reinforcement Learning Framework for Dynamic Routing and Scheduling of Latency-Critical Services [18.675072317045466]
Most existing network control solutions target only average delay performance, falling short of providing strict End-to-End (E2E) peak latency guarantees. This paper addresses the challenge of reliably delivering packets within application-imposed deadlines by leveraging recent advancements in Multi-Agent Deep Reinforcement Learning (MA-DRL). We present a novel MA-DRL network control framework that leverages a centralized routing and distributed scheduling architecture.
arXiv Detail & Related papers (2025-10-13T15:38:10Z)
Dynamic Speculative Agent Planning [57.630218933994534]
Large language-model-based agents face critical deployment challenges due to prohibitive latency and inference costs. We introduce Dynamic Speculative Planning (DSP), an online reinforcement learning framework that provides lossless acceleration with substantially reduced costs. Experiments on two standard agent benchmarks demonstrate that DSP achieves comparable efficiency to the fastest acceleration method while reducing total cost by 30% and unnecessary cost by up to 60%.
arXiv Detail & Related papers (2025-09-02T03:34:36Z)
CSGO: Generalized Optimization for Cold Start in Wireless Collaborative Edge LLM Systems [62.24576366776727]
We propose a latency-aware scheduling framework to minimize total inference latency. We show that the proposed method significantly reduces cold-start latency compared to baseline strategies.
arXiv Detail & Related papers (2025-08-15T07:49:22Z)
Beamforming and Resource Allocation for Delay Minimization in RIS-Assisted OFDM Systems [38.71413228444903]
This paper investigates a joint beamforming and resource allocation problem in downlink reconfigurable intelligent surface (RIS)-assisted OFDM systems. To effectively handle the mixed action space and reduce the state space dimensionality, a hybrid deep reinforcement learning (DRL) approach is proposed. The proposed algorithm significantly reduces the average delay, enhances resource allocation efficiency, and achieves superior system robustness and fairness.
arXiv Detail & Related papers (2025-06-04T05:33:33Z)
OmniRouter: Budget and Performance Controllable Multi-LLM Routing [31.60019342381251]
Large language models (LLMs) deliver superior performance but require substantial computational resources and operate with relatively low efficiency. We introduce Omni, a controllable routing framework for multi-LLM serving. Experiments show that Omni achieves up to 6.30% improvement in response accuracy while simultaneously reducing computational costs by at least 10.15%.
arXiv Detail & Related papers (2025-02-27T22:35:31Z)
Elastic Entangled Pair and Qubit Resource Management in Quantum Cloud Computing [73.7522199491117]
Quantum cloud computing (QCC) offers a promising approach to efficiently provide quantum computing resources. The fluctuations in user demand and quantum circuit requirements are challenging for efficient resource provisioning. We propose a resource allocation model to provision quantum computing and networking resources.
arXiv Detail & Related papers (2023-07-25T00:38:46Z)
Dynamic Scheduling for Federated Edge Learning with Streaming Data [56.91063444859008]
We consider a Federated Edge Learning (FEEL) system where training data are randomly generated over time at a set of distributed edge devices with long-term energy constraints. Due to limited communication resources and latency requirements, only a subset of devices is scheduled for participating in the local training process in every iteration.
arXiv Detail & Related papers (2023-05-02T07:41:16Z)
Guaranteed Dynamic Scheduling of Ultra-Reliable Low-Latency Traffic via Conformal Prediction [72.59079526765487]
The dynamic scheduling of ultra-reliable and low-latency traffic (URLLC) in the uplink can significantly enhance the efficiency of coexisting services. The main challenge is posed by the uncertainty in the process of URLLC packet generation. We introduce a novel scheduler for URLLC packets that provides formal guarantees on reliability and latency irrespective of the quality of the URLLC traffic predictor.
arXiv Detail & Related papers (2023-02-15T14:09:55Z)
Fidelity-Guarantee Entanglement Routing in Quantum Networks [64.49733801962198]
Entanglement routing establishes remote entanglement connection between two arbitrary nodes. We propose purification-enabled entanglement routing designs to provide fidelity guarantee for multiple Source-Destination (SD) pairs in quantum networks.
arXiv Detail & Related papers (2021-11-15T14:07:22Z)
