Related papers: Online Learning and Optimization for Queues with Unknown Demand Curve and Service Distribution

Online Learning and Optimization for Queues with Unknown Demand Curve and Service Distribution

URL: http://arxiv.org/abs/2303.03399v1
Date: Mon, 6 Mar 2023 08:47:40 GMT
Title: Online Learning and Optimization for Queues with Unknown Demand Curve and Service Distribution
Authors: Xinyun Chen, Yunan Liu, Guiyu Hong
Abstract summary: We investigate an optimization problem in a queueing system where the service provider selects the optimal service fee p and service capacity mu. We develop an online learning framework that automatically incorporates the parameter estimation errors in the solution prescription process.
Score: 26.720986177499338
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We investigate an optimization problem in a queueing system where the service provider selects the optimal service fee p and service capacity \mu to maximize the cumulative expected profit (the service revenue minus the capacity cost and delay penalty). The conventional predict-then-optimize (PTO) approach takes two steps: first, it estimates the model parameters (e.g., arrival rate and service-time distribution) from data; second, it optimizes a model based on the estimated parameters. A major drawback of PTO is that its solution accuracy can often be highly sensitive to the parameter estimation errors because PTO is unable to properly link these errors (step 1) to the quality of the optimized solutions (step 2). To remedy this issue, we develop an online learning framework that automatically incorporates the aforementioned parameter estimation errors in the solution prescription process; it is an integrated method that can "learn" the optimal solution without needing to set up the parameter estimation as a separate step as in PTO. Effectiveness of our online learning approach is substantiated by (i) theoretical results including the algorithm convergence and analysis of the regret ("cost" to pay over time for the algorithm to learn the optimal policy), and (ii) engineering confirmation via simulation experiments of a variety of representative examples. We also provide careful comparisons for PTO and the online learning method.

Related papers

Solver-Free Decision-Focused Learning for Linear Optimization Problems [6.305123652677644]
In many real-world scenarios, the parameters of the optimization problem are not known a priori and must be predicted from contextual features.<n>This gives rise to predict-then-optimize problems, where a machine learning model predicts problem parameters that are then used to make decisions via optimization.<n>We propose a solver-free training method that exploits the geometric structure of linear optimization to enable efficient training with minimal degradation in solution quality.
arXiv Detail & Related papers (2025-05-28T10:55:16Z)
Self-Supervised Penalty-Based Learning for Robust Constrained Optimization [4.297070083645049]
We propose a new methodology for parameterized constrained robust optimization, based on learning with a self-supervised penalty-based loss function. Our approach is able to effectively learn neural network approximations whose inference time is significantly smaller than the time of traditional solvers.
arXiv Detail & Related papers (2025-03-07T06:42:17Z)
Learning Joint Models of Prediction and Optimization [56.04498536842065]
Predict-Then-Then framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving. This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by joint predictive models.
arXiv Detail & Related papers (2024-09-07T19:52:14Z)
Optimal Baseline Corrections for Off-Policy Contextual Bandits [61.740094604552475]
We aim to learn decision policies that optimize an unbiased offline estimate of an online reward metric. We propose a single framework built on their equivalence in learning scenarios. Our framework enables us to characterize the variance-optimal unbiased estimator and provide a closed-form solution for it.
arXiv Detail & Related papers (2024-05-09T12:52:22Z)
End-to-End Learning for Fair Multiobjective Optimization Under Uncertainty [55.04219793298687]
The Predict-Then-Forecast (PtO) paradigm in machine learning aims to maximize downstream decision quality. This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives. It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z)
CaVE: A Cone-Aligned Approach for Fast Predict-then-optimize with Binary Linear Programs [23.00554768496448]
This work focuses on binary linear programs (BLPs) and proposes a new end-to-end training method to predict-then-optimize. Our method, Cone-aligned Vector Estimation (CaVE), aligns the predicted cost vectors with the normal cone corresponding to the true optimal solution of a training instance.
arXiv Detail & Related papers (2023-12-12T20:24:19Z)
Predict-Then-Optimize by Proxy: Learning Joint Models of Prediction and Optimization [59.386153202037086]
Predict-Then- framework uses machine learning models to predict unknown parameters of an optimization problem from features before solving. This approach can be inefficient and requires handcrafted, problem-specific rules for backpropagation through the optimization step. This paper proposes an alternative method, in which optimal solutions are learned directly from the observable features by predictive models.
arXiv Detail & Related papers (2023-11-22T01:32:06Z)
Teaching Networks to Solve Optimization Problems [13.803078209630444]
We propose to replace the iterative solvers altogether with a trainable parametric set function. We show the feasibility of learning such parametric (set) functions to solve various classic optimization problems.
arXiv Detail & Related papers (2022-02-08T19:13:13Z)
Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning. Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z)
Cost-Efficient Online Hyperparameter Optimization [94.60924644778558]
We propose an online HPO algorithm that reaches human expert-level performance within a single run of the experiment. Our proposed online HPO algorithm reaches human expert-level performance within a single run of the experiment, while incurring only modest computational overhead compared to regular training.
arXiv Detail & Related papers (2021-01-17T04:55:30Z)
Learning Augmented Index Policy for Optimal Service Placement at the Network Edge [8.136957953239254]
We consider the problem of service placement at the network edge, in which a decision maker has to choose between $N$ services to host at the edge. Our goal is to design adaptive algorithms to minimize the average service delivery latency for customers.
arXiv Detail & Related papers (2021-01-10T23:54:59Z)

This list is automatically generated from the titles and abstracts of the papers in this site.