Wait-Less Offline Tuning and Re-solving for Online Decision Making
- URL: http://arxiv.org/abs/2412.09594v2
- Date: Fri, 10 Jan 2025 09:40:04 GMT
- Title: Wait-Less Offline Tuning and Re-solving for Online Decision Making
- Authors: Jingruo Sun, Wenzhi Gao, Ellen Vitercik, Yinyu Ye
- Abstract summary: We propose a new algorithm that combines the strengths of LP-based and first-order OLP methods.
Our algorithm achieves $\mathscr{O}(\log (T/f) + \sqrt{f})$ regret, delivering a "wait-less" online decision-making process.
- Score: 10.984414655748095
- Abstract: Online linear programming (OLP) has found broad applications in revenue management and resource allocation. State-of-the-art OLP algorithms achieve low regret by repeatedly solving linear programming (LP) subproblems that incorporate updated resource information. However, LP-based methods are computationally expensive and often inefficient for large-scale applications. In contrast, recent first-order OLP algorithms are more computationally efficient but typically suffer from worse regret guarantees. To address these shortcomings, we propose a new algorithm that combines the strengths of LP-based and first-order OLP methods. The algorithm re-solves the LP subproblems periodically at a predefined frequency $f$ and uses the latest dual prices to guide online decision-making. In addition, a first-order method runs in parallel during each interval between LP re-solves, smoothing resource consumption. Our algorithm achieves $\mathscr{O}(\log (T/f) + \sqrt{f})$ regret, delivering a "wait-less" online decision-making process that balances the computational efficiency of first-order methods and the superior regret guarantee of LP-based methods.
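The abstract describes the algorithm only at a high level. Below is a minimal Python sketch of how such a hybrid loop could look, assuming a standard dual-price acceptance rule, a periodic LP re-solve over the observed prefix via scipy.optimize.linprog, and a projected-subgradient dual update between re-solves; the function name waitless_olp, the step size, and the scaled-capacity LP are illustrative assumptions rather than the authors' exact method.
```python
import numpy as np
from scipy.optimize import linprog


def waitless_olp(rewards, consumptions, capacity, f, eta=None):
    """Sketch of a hybrid OLP loop in the spirit of the abstract:
    re-solve an LP every f steps to reset the dual prices, and run a
    first-order (projected subgradient) dual update in between.
    The prefix-LP formulation, step size, and acceptance rule are
    illustrative assumptions, not the paper's exact algorithm."""
    T = len(rewards)
    capacity = np.asarray(capacity, dtype=float)
    m = capacity.shape[0]
    rho = capacity / T                      # per-period resource budget
    eta = eta if eta is not None else 1.0 / np.sqrt(f)

    p = np.zeros(m)                         # dual prices
    remaining = capacity.copy()
    total_reward = 0.0

    for t in range(T):
        if t > 0 and t % f == 0:
            # Periodic LP re-solve: maximize reward over the observed
            # prefix under a proportionally scaled capacity, then take
            # the dual prices of the resource constraints.
            R = np.asarray(rewards[:t], dtype=float)
            A = np.asarray(consumptions[:t], dtype=float).T   # m x t
            res = linprog(-R, A_ub=A, b_ub=(t / T) * capacity,
                          bounds=[(0.0, 1.0)] * t, method="highs")
            if res.status == 0:
                p = -np.asarray(res.ineqlin.marginals)

        r_t = rewards[t]
        a_t = np.asarray(consumptions[t], dtype=float)

        # Dual-price decision rule: accept if the reward exceeds the
        # opportunity cost of the resources and capacity allows it.
        x_t = 1.0 if (r_t > a_t @ p and np.all(remaining >= a_t)) else 0.0
        remaining -= x_t * a_t
        total_reward += x_t * r_t

        # First-order update between re-solves: a projected subgradient
        # step that raises prices when consumption outpaces the budget,
        # smoothing resource consumption.
        p = np.maximum(p + eta * (x_t * a_t - rho), 0.0)

    return total_reward, remaining
```
A caller would supply per-arrival rewards, consumption vectors, and the initial capacity; setting f close to T degenerates the loop to an almost purely first-order method, while f = 1 re-solves the LP at every step.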
Related papers
- Beyond $\mathcal{O}(\sqrt{T})$ Regret: Decoupling Learning and Decision-making in Online Linear Programming [7.518108920887426]
This paper establishes a general framework that improves upon the $\mathcal{O}(\sqrt{T})$ result when the LP dual problem exhibits certain error bound conditions.
We show that first-order learning algorithms achieve $o(\sqrt{T})$ regret in the continuous support setting and $\mathcal{O}(\log T)$ regret in the finite support setting beyond the non-degeneracy assumption.
arXiv Detail & Related papers (2025-01-06T04:57:44Z) - Infrequent Resolving Algorithm for Online Linear Programming [3.247415064064425]
Existing online linear programming (OLP) algorithms fall into two categories: LP-based algorithms and LP-free algorithms.
We propose a well-performing algorithm that solves LPs at a few selected time points and conducts first-order computations at the other time points.
Our work highlights the value of resolving both at the beginning and the end of the selling horizon, and provides a novel framework to prove the performance guarantee.
arXiv Detail & Related papers (2024-08-01T11:09:01Z) - Truly No-Regret Learning in Constrained MDPs [61.78619476991494]
We propose a model-based primal-dual algorithm to learn in an unknown CMDP.
We prove that our algorithm achieves sublinear regret without error cancellations.
arXiv Detail & Related papers (2024-02-24T09:47:46Z) - Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods [7.518108920887426]
We introduce a new algorithmic framework that decouples learning from decision-making.
For the first time, we show that first-order methods can attain $o(\sqrt{T})$ regret with this new framework.
arXiv Detail & Related papers (2024-02-11T05:35:50Z) - Efficient Methods for Non-stationary Online Learning [61.63338724659592]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $\mathcal{O}(\log T)$ to $1$.
We also study an even stronger measure, namely the "interval dynamic regret", and reduce the number of projections per round from $\mathcal{O}(\log^2 T)$ to $1$.
arXiv Detail & Related papers (2023-09-16T07:30:12Z) - Learning to Reformulate for Linear Programming [11.628932152805724]
We propose a reinforcement learning-based reformulation method for linear programming (LP) to improve the performance of the solving process.
We implement the proposed method on two public research LP datasets and one large-scale LP dataset collected from a practical production planning scenario.
arXiv Detail & Related papers (2022-01-17T04:58:46Z) - Nearly Optimal Linear Convergence of Stochastic Primal-Dual Methods for Linear Programming [5.126924253766052]
We show that the proposed method exhibits a linear convergence rate for solving sharp instances with a high probability.
We also propose an efficient coordinate-based oracle for unconstrained bilinear problems.
arXiv Detail & Related papers (2021-11-10T04:56:38Z) - Fast and Complete: Enabling Complete Neural Network Verification with Rapid and Massively Parallel Incomplete Verifiers [112.23981192818721]
We propose to use backward mode linear relaxation based perturbation analysis (LiRPA) to replace linear programming (LP) during the BaB process.
Unlike LP, LiRPA applied naively can produce much weaker bounds and cannot even check certain conflicts of sub-domains during splitting.
We demonstrate an order of magnitude speedup compared to existing LP-based approaches.
arXiv Detail & Related papers (2020-11-27T16:42:12Z) - Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning.
Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms.
We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient methods.
arXiv Detail & Related papers (2020-07-01T18:43:32Z) - Sparsified Linear Programming for Zero-Sum Equilibrium Finding [89.30539368124025]
We present a totally different approach to the problem, which is competitive and often orders of magnitude better than the prior state of the art.
With experiments on poker endgames, we demonstrate, for the first time, that modern linear program solvers are competitive against even game-specific modern variants of CFR.
arXiv Detail & Related papers (2020-06-05T13:48:26Z) - Determinantal Point Processes in Randomized Numerical Linear Algebra [80.27102478796613]
Randomized Numerical Linear Algebra (RandNLA) uses randomness to develop improved algorithms for matrix problems that arise in scientific computing, data science, machine learning, etc.
Recent work has uncovered deep and fruitful connections between DPPs and RandNLA which lead to new guarantees and improved algorithms.
arXiv Detail & Related papers (2020-05-07T00:39:52Z)