Online Resource Allocation with Convex-set Machine-Learned Advice
- URL: http://arxiv.org/abs/2306.12282v1
- Date: Wed, 21 Jun 2023 14:09:33 GMT
- Title: Online Resource Allocation with Convex-set Machine-Learned Advice
- Authors: Negin Golrezaei and Patrick Jaillet and Zijie Zhou
- Abstract summary: We introduce a parameterized class of optimal online resource allocation algorithms that strike a balance between consistent and robust ratios.
Specifically, in a C-reto optimal setting, we maximize the consistent ratio while ensuring that the robust ratio is at least C.
- Score: 27.662388663465006
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision-makers often have access to a machine-learned prediction about
demand, referred to as advice, which can potentially be utilized in online
decision-making processes for resource allocation. However, exploiting such
advice poses challenges due to its potential inaccuracy. To address this issue,
we propose a framework that enhances online resource allocation decisions with
potentially unreliable machine-learned (ML) advice. We assume here that this
advice is represented by a general convex uncertainty set for the demand
vector.
We introduce a parameterized class of Pareto optimal online resource
allocation algorithms that strike a balance between consistent and robust
ratios. The consistent ratio measures the algorithm's performance (compared to
the optimal hindsight solution) when the ML advice is accurate, while the
robust ratio captures performance under an adversarial demand process when the
advice is inaccurate. Specifically, in a C-Pareto optimal setting, we maximize
the robust ratio while ensuring that the consistent ratio is at least C. Our
proposed C-Pareto optimal algorithm is an adaptive protection level algorithm,
which extends the classical fixed protection level algorithm introduced in
Littlewood (2005) and Ball and Queyranne (2009). Solving a complex non-convex
continuous optimization problem characterizes the adaptive protection level
algorithm. To complement our algorithms, we present a simple method for
computing the maximum achievable consistent ratio, which serves as an estimate
for the maximum value of the ML advice. Additionally, we present numerical
studies to evaluate the performance of our algorithm in comparison to benchmark
algorithms. The results demonstrate that by adjusting the parameter C, our
algorithms effectively strike a balance between worst-case and average
performance, outperforming the benchmark algorithms.
Related papers
- Overcoming Brittleness in Pareto-Optimal Learning-Augmented Algorithms [6.131022957085439]
We propose a new framework in which the smoothness in the performance of the algorithm is enforced by means of a user-specified profile.
We apply this new approach to a well-studied online problem, namely the one-way trading problem.
arXiv Detail & Related papers (2024-08-07T23:15:21Z) - Deterministic Trajectory Optimization through Probabilistic Optimal Control [3.2771631221674333]
We propose two new algorithms for discrete-time deterministic finite-horizon nonlinear optimal control problems.
Both algorithms are inspired by a novel theoretical paradigm known as probabilistic optimal control.
We show that the application of this algorithm results in a fixed point of probabilistic policies that converge to the deterministic optimal policy.
arXiv Detail & Related papers (2024-07-18T09:17:47Z) - Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
arXiv Detail & Related papers (2023-11-08T00:10:21Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem for machine learning.
In this work, we present the first trial in the single-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets on speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - Selection of the Most Probable Best [2.1095005405219815]
We consider an expected-value ranking and selection (R&S) problem where all k solutions' simulation outputs depend on a common parameter whose uncertainty can be modeled by a distribution.
We define the most probable best (MPB) to be the solution that has the largest probability of being optimal with respect to the distribution.
We devise a series of algorithms that replace the unknown means in the optimality conditions with their estimates and prove the algorithms' sampling ratios achieve the conditions as the simulation budget increases.
arXiv Detail & Related papers (2022-07-15T15:27:27Z) - On the Optimality of Batch Policy Optimization Algorithms [106.89498352537682]
Batch policy optimization considers leveraging existing data for policy construction before interacting with an environment.
We show that any confidence-adjusted index algorithm is minimax optimal, whether it be optimistic, pessimistic or neutral.
We introduce a new weighted-minimax criterion that considers the inherent difficulty of optimal value prediction.
arXiv Detail & Related papers (2021-04-06T05:23:20Z) - Soft-Robust Algorithms for Batch Reinforcement Learning [36.78967245470449]
In reinforcement learning, robust decision-making problems with limited data are usually computed by the percentile criterion.
We show that the percentile criterion is non- theoretical as it is difficult to optimize and ignores the mean performance.
We propose and analyze two algorithms to approximately optimize the percentile criterion.
arXiv Detail & Related papers (2020-11-30T01:36:16Z) - Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems.
We propose a novel stoc-efficientgradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Stochastic Optimization Forests [60.523606291705214]
We show how to train forest decision policies by growing trees that choose splits to directly optimize the downstream decision quality, rather than splitting to improve prediction accuracy as in the standard random forest algorithm.
We show that our approximate splitting criteria can reduce running time hundredfold, while achieving performance close to forest algorithms that exactly re-optimize for every candidate split.
arXiv Detail & Related papers (2020-08-17T16:56:06Z) - Boosting Algorithms for Estimating Optimal Individualized Treatment
Rules [4.898659895355356]
We present nonparametric algorithms for estimating optimal individualized treatment rules.
The proposed algorithms are based on the XGBoost algorithm, which is known as one of the most powerful algorithms in the machine learning literature.
arXiv Detail & Related papers (2020-01-31T22:26:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.