Arrival Control in Quasi-Reversible Queueing Systems: Optimization and Reinforcement Learning
- URL: http://arxiv.org/abs/2505.16353v2
- Date: Wed, 11 Jun 2025 05:41:33 GMT
- Title: Arrival Control in Quasi-Reversible Queueing Systems: Optimization and Reinforcement Learning
- Authors: Céline Comte, Pascal Moyal
- Abstract summary: We introduce a versatile scheme for optimizing the arrival rates of quasi-reversible queueing systems. We prove that supplementing a quasi-reversible queueing system with a balanced arrival-control policy preserves the quasi-reversibility. We focus on the problem of admission control and leverage our results in the frameworks of optimization and reinforcement learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce a versatile scheme for optimizing the arrival rates of quasi-reversible queueing systems. We first propose an alternative definition of quasi-reversibility that encompasses reversibility and highlights the importance of the definition of customer classes. Next, we introduce balanced arrival-control policies, which generalize the notion of balanced arrival rates, introduced in the context of Whittle networks, to the much broader class of quasi-reversible queueing systems. We prove that supplementing a quasi-reversible queueing system with a balanced arrival-control policy preserves the quasi-reversibility, and we specify the form of the stationary measures. We revisit two canonical examples of quasi-reversible queueing systems, Whittle networks and order-independent queues. Lastly, we focus on the problem of admission control and leverage our results in the frameworks of optimization and reinforcement learning.
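The product-form flavor of these results can be illustrated on the simplest quasi-reversible example: a single birth-death queue whose arrivals are thinned by a state-dependent admission policy. The sketch below is a generic illustration, not the paper's construction; the rates `lam`, `MU`, and the cap `CAP` are arbitrary choices. It checks the closed-form stationary measure of the controlled chain against a simulation.

```python
import random

# A single M/M/1-type queue with state-dependent (admission-controlled)
# arrival rates lam(n). For a birth-death chain, the stationary measure
# keeps a product form:
#   pi(n) proportional to prod_{k=0}^{n-1} lam(k) / MU,
# illustrating how state-dependent arrival control can preserve a
# closed-form stationary measure. All numbers here are illustrative.

MU = 1.0    # service rate
CAP = 10    # admit nobody beyond this backlog

def lam(n):
    # Example admission policy: thin arrivals as the backlog grows,
    # and reject everything above CAP.
    return 0.8 / (n + 1) if n < CAP else 0.0

def product_form_pi(cap=CAP):
    # Closed-form stationary distribution of the birth-death chain.
    w = [1.0]
    for n in range(cap):
        w.append(w[-1] * lam(n) / MU)
    z = sum(w)
    return [x / z for x in w]

def simulate(t_end=200000.0, seed=0):
    # Continuous-time simulation; returns empirical state occupancies.
    rng = random.Random(seed)
    t, n = 0.0, 0
    time_in_state = [0.0] * (CAP + 1)
    while t < t_end:
        rate = lam(n) + (MU if n > 0 else 0.0)
        dt = rng.expovariate(rate)
        time_in_state[n] += dt
        t += dt
        if rng.random() < lam(n) / rate:
            n += 1  # admitted arrival
        else:
            n -= 1  # service completion
    total = sum(time_in_state)
    return [x / total for x in time_in_state]

if __name__ == "__main__":
    pi = product_form_pi()
    emp = simulate()
    for n in range(4):
        print(f"n={n}: product-form {pi[n]:.4f}  simulated {emp[n]:.4f}")
```

The optimization and learning questions in the paper then amount to choosing the arrival-control function (here `lam`) well; the point of the sketch is only that the controlled chain's stationary measure stays explicit.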
Related papers
- Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric.
We propose a single framework built on their equivalence in learning scenarios.
Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z) - Rethinking Resource Management in Edge Learning: A Joint Pre-training and Fine-tuning Design Paradigm [87.47506806135746]
In some applications, edge learning is experiencing a shift in focus from conventional learning from scratch to a new two-stage learning paradigm.
This paper considers the problem of joint communication and computation resource management in a two-stage edge learning system.
It is shown that the proposed joint resource management over the pre-training and fine-tuning stages well balances the system performance trade-off.
arXiv Detail & Related papers (2024-04-01T00:21:11Z) - Approximate Control for Continuous-Time POMDPs [35.26411026381803]
This work proposes a decision-making framework for partially observable systems in continuous time with discrete state and action spaces.
We employ approximation methods for the filtering and the control problem that scale well with an increasing number of states.
We demonstrate the effectiveness of our approach on several partially observed systems, including queueing systems and chemical reaction networks.
arXiv Detail & Related papers (2024-02-02T14:20:04Z) - The Transient Cost of Learning in Queueing Systems [4.257386020315728]
Queueing systems are widely applicable models with use cases in communication networks, healthcare, service systems, etc. We propose the Transient Cost of Learning in Queueing (TCLQ) to quantify the maximum increase in time-averaged queue length caused by parameter uncertainty. In establishing our results, we propose a unified analysis framework for TCLQ that bridges Lyapunov and bandit analysis, provides guarantees for a wide range of algorithms, and could be of independent interest.
arXiv Detail & Related papers (2023-08-15T14:50:12Z) - Queue Scheduling with Adversarial Bandit Learning [20.606599586119835]
We consider a one-hop single-server queueing system consisting of $K$ queues, each with time-varying and non-stationary arrival and service rates.
Our scheduling approach builds on an innovative combination of adversarial bandit learning and Lyapunov drift minimization.
We present two novel algorithms capable of stabilizing systems that can be stabilized by some (possibly unknown) sequence of randomized policies.
arXiv Detail & Related papers (2023-03-03T07:17:09Z) - Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems [26.405495663998828]
In this work, we consider a multi-agent load balancing system, with delayed information, consisting of many clients (load balancers) and many parallel queues.
We apply policy gradient reinforcement learning algorithms to find an optimal load balancing solution.
Our approach is scalable and shows good performance when compared to the state-of-the-art power-of-d variant of the Join-the-Shortest-Queue (JSQ) policy.
arXiv Detail & Related papers (2022-08-09T13:47:19Z) - Sparsity in Partially Controllable Linear Systems [56.142264865866636]
We study partially controllable linear dynamical systems specified by an underlying sparsity pattern.
Our results characterize those state variables which are irrelevant for optimal control.
arXiv Detail & Related papers (2021-10-12T16:41:47Z) - Better than the Best: Gradient-based Improper Reinforcement Learning for Network Scheduling [60.48359567964899]
We consider the problem of scheduling in constrained queueing networks with a view to minimizing packet delay.
We use a policy gradient based reinforcement learning algorithm that produces a scheduler that performs better than the available atomic policies.
arXiv Detail & Related papers (2021-05-01T10:18:34Z) - Tailored Learning-Based Scheduling for Kubernetes-Oriented Edge-Cloud System [54.588242387136376]
We introduce KaiS, a learning-based scheduling framework for edge-cloud systems.
First, we design a coordinated multi-agent actor-critic algorithm to cater to decentralized request dispatch.
Second, for diverse system scales and structures, we use graph neural networks to embed system state information.
Third, we adopt a two-time-scale scheduling mechanism to harmonize request dispatch and service orchestration.
arXiv Detail & Related papers (2021-01-17T03:45:25Z) - Queue-Learning: A Reinforcement Learning Approach for Providing Quality of Service [1.8477401359673706]
Service-rate control is a common mechanism for providing guarantees in service systems.
In this paper, we introduce a reinforcement learning-based (RL-based) service-rate controller.
Our controller provides explicit probabilistic guarantees on the end-to-end delay of the system.
arXiv Detail & Related papers (2021-01-12T17:28:57Z) - Iterative Amortized Policy Optimization [147.63129234446197]
Policy networks are a central feature of deep reinforcement learning (RL) algorithms for continuous control.
From the variational inference perspective, policy networks are a form of amortized optimization, optimizing network parameters rather than the policy distributions directly.
We demonstrate that iterative amortized policy optimization yields performance improvements over direct amortization on benchmark continuous control tasks.
arXiv Detail & Related papers (2020-10-20T23:25:42Z)
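The "Queue Scheduling with Adversarial Bandit Learning" entry above combines bandit learning with Lyapunov drift minimization; that combination is not reproduced here. As a toy illustration of the adversarial-bandit ingredient alone, a plain EXP3 selector can decide which of K queues to serve each slot. The function below is a textbook EXP3 sketch, not the paper's algorithm; `rewards_fn`, `gamma`, and all parameters are illustrative.

```python
import math
import random

def exp3_scheduler(rewards_fn, K, T, gamma=0.1, seed=0):
    """Run EXP3 for T slots over K arms (queues); return total reward.

    rewards_fn(t, i) must return a reward in [0, 1] for serving
    queue i at slot t (e.g. 1 if the served queue was nonempty).
    """
    rng = random.Random(seed)
    w = [1.0] * K
    total = 0.0
    for t in range(T):
        s = sum(w)
        # Mix the exponential weights with uniform exploration.
        p = [(1 - gamma) * wi / s + gamma / K for wi in w]
        i = rng.choices(range(K), weights=p)[0]
        r = rewards_fn(t, i)
        # Importance-weighted update for the played arm only.
        w[i] *= math.exp(gamma * (r / p[i]) / K)
        # Renormalize to avoid overflow; probabilities are unchanged.
        m = max(w)
        w = [wi / m for wi in w]
        total += r
    return total
```

Against a fixed reward pattern (say, one queue that is always worth serving), the selector concentrates its probability mass on that queue while keeping the `gamma/K` exploration floor on the others.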
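The power-of-d baseline mentioned in the mean-field load-balancing entry above is easy to state: each arriving job samples d queues uniformly at random and joins the shortest sampled one. The discrete-time sketch below is a generic illustration with made-up parameters (`n`, `lam`, `mu`), not taken from any of the listed papers; it compares purely random routing (d=1) with power-of-2 routing under the same load.

```python
import random

def route(qlens, d, rng):
    # Sample d distinct queues uniformly; join the shortest sampled one.
    sampled = rng.sample(range(len(qlens)), d)
    return min(sampled, key=lambda i: qlens[i])

def simulate(n=20, d=2, lam=0.45, mu=0.5, slots=5000, seed=1):
    """Time-averaged queue length per queue under power-of-d routing.

    Per slot: up to n Bernoulli(lam) arrivals, each routed by
    power-of-d; then every nonempty queue serves one job with
    probability mu. Load per queue is lam/mu (here 0.9).
    """
    rng = random.Random(seed)
    q = [0] * n
    total = 0
    for _ in range(slots):
        for _ in range(n):
            if rng.random() < lam:
                q[route(q, d, rng)] += 1
        for i in range(n):
            if q[i] > 0 and rng.random() < mu:
                q[i] -= 1
        total += sum(q)
    return total / (slots * n)

if __name__ == "__main__":
    print("d=1 (random routing):", round(simulate(d=1), 2))
    print("d=2 (power-of-2):    ", round(simulate(d=2), 2))
```

Even at this small scale, sampling just two queues instead of one sharply reduces the average backlog, which is the classical "power of two choices" effect those papers benchmark against.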
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.