Finite-Time Information-Theoretic Bounds in Queueing Control
- URL: http://arxiv.org/abs/2506.18278v1
- Date: Mon, 23 Jun 2025 04:14:40 GMT
- Title: Finite-Time Information-Theoretic Bounds in Queueing Control
- Authors: Yujie Liu, Vincent Y. F. Tan, Yunbei Xu
- Abstract summary: We establish the first finite-time information-theoretic lower bounds, and derive new policies that achieve them, for the total queue length in scheduling problems over stochastic processing networks with both adversarial and stochastic arrivals. These findings reveal a fundamental limitation of "drift-only" methods and point the way toward principled, non-asymptotic optimality in queueing control.
- Score: 54.11376591632282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We establish the first finite-time information-theoretic lower bounds, and derive new policies that achieve them, for the total queue length in scheduling problems over stochastic processing networks with both adversarial and stochastic arrivals. Prior analyses of MaxWeight guarantee only stability and asymptotic optimality in heavy traffic; we prove that, at finite horizons, MaxWeight can incur strictly larger backlog by problem-dependent factors which we identify. Our main innovations are 1) a minimax framework that pinpoints the precise problem parameters governing any policy's finite-time performance; 2) an information-theoretic lower bound on total queue length; 3) a proof that MaxWeight is fundamentally suboptimal in finite time; and 4) a new scheduling rule that minimizes the full Lyapunov drift, including its second-order term, thereby matching the lower bound under certain conditions, up to universal constants. These findings reveal a fundamental limitation of "drift-only" methods and point the way toward principled, non-asymptotic optimality in queueing control.
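For intuition about the contrast drawn above, the sketch below compares MaxWeight, which greedily maximizes the first-order (linear) term of a quadratic Lyapunov drift, with an illustrative rule that minimizes a bound on the full one-step drift including its second-order term. This is a minimal sketch under assumed single-hop dynamics, a known finite set of feasible service vectors, and a hypothetical mean-arrival estimate; it is not the paper's policy, whose exact drift expression depends on the processing-network model.

```python
import numpy as np

def maxweight_schedule(q, schedules):
    """MaxWeight: choose the feasible service vector s maximizing the
    first-order drift term q^T s of the quadratic Lyapunov function."""
    return max(schedules, key=lambda s: float(q @ s))

def full_drift_schedule(q, schedules, arrival_mean):
    """Illustrative 'full-drift' rule: minimize an upper bound on the
    one-step drift of V(q) = ||q||^2, keeping both the linear and the
    second-order (quadratic) terms. A hypothetical stand-in, not the
    paper's derived policy; arrival_mean is an assumed estimate of E[a]."""
    def drift_bound(s):
        delta = arrival_mean - s  # expected net change of the queue vector
        # V(q + delta) - V(q) = 2 q^T delta + ||delta||^2
        return float(2.0 * q @ delta + delta @ delta)
    return min(schedules, key=drift_bound)

# Toy usage: two queues, two mutually exclusive unit-service schedules.
q = np.array([2.0, 1.9])
schedules = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
arrival_mean = np.array([0.2, 0.9])
print("MaxWeight serves queue:", int(np.argmax(maxweight_schedule(q, schedules))))
print("Full-drift rule serves queue:", int(np.argmax(full_drift_schedule(q, schedules, arrival_mean))))
```

In this toy instance the two rules disagree: MaxWeight serves the slightly longer queue 0, while the second-order term steers service toward queue 1, whose arrival rate is higher. This is intended only as intuition for why "drift-only" reasoning can leave finite-time backlog on the table, not as a reproduction of the paper's analysis.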
Related papers
- Nonasymptotic CLT and Error Bounds for Two-Time-Scale Stochastic Approximation [12.69327994479157]
We consider linear two-time-scale stochastic approximation algorithms driven by martingale noise. We derive the first non-asymptotic central limit theorem with respect to the Wasserstein-1 distance for two-time-scale stochastic approximation with Polyak-Ruppert averaging.
arXiv Detail & Related papers (2025-02-14T03:20:30Z) - Beyond Primal-Dual Methods in Bandits with Stochastic and Adversarial Constraints [29.514323697659613]
We address a generalization of the bandit with knapsacks problem, where a learner aims to maximize rewards while satisfying an arbitrary set of long-term constraints.
Our goal is to design best-of-both-worlds algorithms that perform well under both stochastic and adversarial constraints.
arXiv Detail & Related papers (2024-05-25T08:09:36Z) - Online Non-convex Optimization with Long-term Non-convex Constraints [2.033434950296318]
A novel Follow-the-Perturbed-Leader type algorithm is proposed and analyzed for solving general long-term constrained optimization problems in an online manner.
The proposed algorithm is applied to tackle a long-term (extreme value) constrained river pollutant source identification problem.
arXiv Detail & Related papers (2023-11-04T15:08:36Z) - Pessimistic Minimax Value Iteration: Provably Efficient Equilibrium
Learning from Offline Datasets [101.5329678997916]
We study episodic two-player zero-sum Markov games (MGs) in the offline setting.
The goal is to find an approximate Nash equilibrium (NE) policy pair based on a dataset collected a priori.
arXiv Detail & Related papers (2022-02-15T15:39:30Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of constrained Markov decision processes (CMDPs) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new dual convex approach is proposed with a novel integration of three ingredients: a regularized policy, a dual regularizer, and Nesterov's gradient descent on the dual.
This is the first demonstration that nonconcave CMDP problems can attain the complexity lower bound of $\mathcal{O}(1/\epsilon)$ for optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Finite Sample Analysis of Minimax Offline Reinforcement Learning:
Completeness, Fast Rates and First-Order Efficiency [83.02999769628593]
We offer a theoretical characterization of off-policy evaluation (OPE) in reinforcement learning.
We show that the minimax approach enables us to achieve a fast rate of convergence for weights and quality functions.
We present the first finite-sample result with first-order efficiency in non-tabular environments.
arXiv Detail & Related papers (2021-02-05T03:20:39Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z) - On Lower Bounds for Standard and Robust Gaussian Process Bandit
Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z) - Relaxing the I.I.D. Assumption: Adaptively Minimax Optimal Regret via
Root-Entropic Regularization [16.536558038560695]
We consider prediction with expert advice when data are generated from distributions varying arbitrarily within an unknown constraint set.
The Hedge algorithm was recently shown to be simultaneously minimax optimal for i.i.d. data.
We provide matching upper and lower bounds on the minimax regret at all levels, show that Hedge with deterministic learning rates is suboptimal outside the extremes, and prove that one can adaptively obtain minimax regret at all levels (a minimal sketch of the standard fixed-rate Hedge update appears after this list).
arXiv Detail & Related papers (2020-07-13T17:54:34Z) - Simulating extremal temporal correlations [0.0]
Temporal correlations arising from sequential measurements on a single quantum system form a polytope.
This polytope is defined by the arrow-of-time (AoT) constraints, which require that future choices of measurement settings cannot influence past outcomes.
We discuss the resources needed to simulate the extreme points of the AoT polytope, where resources are quantified in terms of the minimal dimension, or "internal memory" of the physical system.
arXiv Detail & Related papers (2020-04-30T15:07:17Z)
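As a companion to the "Relaxing the I.I.D. Assumption" entry above, the following is a minimal sketch of the standard Hedge (exponential-weights) algorithm with a fixed deterministic learning rate, the baseline that the entry contrasts with adaptive rates; the root-entropic regularization studied in that paper is not reproduced here.

```python
import numpy as np

def hedge(loss_matrix, eta):
    """Standard Hedge / exponential weights with a fixed learning rate eta.

    loss_matrix : shape (T, K), losses in [0, 1] for K experts over T rounds.
    Returns (algorithm's cumulative loss, best expert's cumulative loss)."""
    T, K = loss_matrix.shape
    log_w = np.zeros(K)                # log-weights, kept in log space for stability
    alg_loss = 0.0
    for t in range(T):
        p = np.exp(log_w - log_w.max())
        p /= p.sum()                   # current distribution over experts
        alg_loss += float(p @ loss_matrix[t])
        log_w -= eta * loss_matrix[t]  # multiplicative-weights update
    best_expert_loss = float(loss_matrix.sum(axis=0).min())
    return alg_loss, best_expert_loss

# Toy usage with random losses; eta = sqrt(2 log K / T) is the usual fixed rate.
rng = np.random.default_rng(0)
losses = rng.random((1000, 5))
print(hedge(losses, eta=np.sqrt(2 * np.log(5) / 1000)))
```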
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality or accuracy of the information above and is not responsible for any consequences of its use.