The Perils of Learning Before Optimizing
- URL: http://arxiv.org/abs/2106.10349v1
- Date: Fri, 18 Jun 2021 20:43:47 GMT
- Title: The Perils of Learning Before Optimizing
- Authors: Chris Cameron, Jason Hartford, Taylor Lundy, Kevin Leyton-Brown
- Abstract summary: We show how prediction models can be learned end-to-end by differentiating through the optimization task.
We show that the performance gap between a two-stage and end-to-end approach is closely related to the emphprice of correlation concept in optimization.
- Score: 16.97597806975415
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Formulating real-world optimization problems often begins with making
predictions from historical data (e.g., an optimizer that aims to recommend
fast routes relies upon travel-time predictions). Typically, learning the
prediction model used to generate the optimization problem and solving that
problem are performed in two separate stages. Recent work has showed how such
prediction models can be learned end-to-end by differentiating through the
optimization task. Such methods often yield empirical improvements, which are
typically attributed to end-to-end making better error tradeoffs than the
standard loss function used in a two-stage solution. We refine this explanation
and more precisely characterize when end-to-end can improve performance. When
prediction targets are stochastic, a two-stage solution must make an a priori
choice about which statistics of the target distribution to model -- we
consider expectations over prediction targets -- while an end-to-end solution
can make this choice adaptively. We show that the performance gap between a
two-stage and end-to-end approach is closely related to the \emph{price of
correlation} concept in stochastic optimization and show the implications of
some existing POC results for our predict-then-optimize problem. We then
consider a novel and particularly practical setting, where coefficients in the
objective function depend on multiple prediction targets. We give explicit
constructions where (1) two-stage performs unboundedly worse than end-to-end;
and (2) two-stage is optimal. We identify a large set of real-world
applications whose objective functions rely on multiple prediction targets but
which nevertheless deploy two-stage solutions. We also use simulations to
experimentally quantify performance gaps.
Related papers
- Bayesian Optimization for Non-Convex Two-Stage Stochastic Optimization Problems [2.9016548477524156]
We formulate a knowledgeient-based acquisition function to jointly optimize the first and second-stage variables.
We show that differences in the dimension and length scales between the variable types can lead to inefficiencies of the twostep algorithm.
arXiv Detail & Related papers (2024-08-30T16:26:31Z) - Loss Shaping Constraints for Long-Term Time Series Forecasting [79.3533114027664]
We present a Constrained Learning approach for long-term time series forecasting that respects a user-defined upper bound on the loss at each time-step.
We propose a practical Primal-Dual algorithm to tackle it, and aims to demonstrate that it exhibits competitive average performance in time series benchmarks, while shaping the errors across the predicted window.
arXiv Detail & Related papers (2024-02-14T18:20:44Z) - Optimizing accuracy and diversity: a multi-task approach to forecast
combinations [0.0]
We present a multi-task optimization paradigm that focuses on solving both problems simultaneously.
It incorporates an additional learning and optimization task into the standard feature-based forecasting approach.
The proposed approach elicits the essential role of diversity in feature-based forecasting.
arXiv Detail & Related papers (2023-10-31T15:26:33Z) - DF2: Distribution-Free Decision-Focused Learning [53.2476224456902]
Decision-focused learning (DFL) has recently emerged as a powerful approach for predictthen-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model error, sample average approximation error, and distribution-based parameterization of the expected objective.
We present DF2 -- the first textit-free decision-focused learning method explicitly designed to address these three bottlenecks.
arXiv Detail & Related papers (2023-08-11T00:44:46Z) - Bayesian Optimization with Conformal Prediction Sets [44.565812181545645]
Conformal prediction is an uncertainty quantification method with coverage guarantees even for misspecified models.
We propose conformal Bayesian optimization, which directs queries towards regions of search space where the model predictions have guaranteed validity.
In many cases we find that query coverage can be significantly improved without harming sample-efficiency.
arXiv Detail & Related papers (2022-10-22T17:01:05Z) - Efficient and Differentiable Conformal Prediction with General Function
Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z) - Predict and Optimize: Through the Lens of Learning to Rank [9.434400627011108]
We show the noise contrastive estimation can be considered a case of learning to rank the solution cache.
We also develop pairwise and listwise ranking loss functions, which can be differentiated in closed form without the need of solving the optimization problem.
arXiv Detail & Related papers (2021-12-07T10:11:44Z) - RoMA: Robust Model Adaptation for Offline Model-based Optimization [115.02677045518692]
We consider the problem of searching an input maximizing a black-box objective function given a static dataset of input-output queries.
A popular approach to solving this problem is maintaining a proxy model that approximates the true objective function.
Here, the main challenge is how to avoid adversarially optimized inputs during the search.
arXiv Detail & Related papers (2021-10-27T05:37:12Z) - Approximate Bayesian Optimisation for Neural Networks [6.921210544516486]
A body of work has been done to automate machine learning algorithm to highlight the importance of model choice.
The necessity to solve the analytical tractability and the computational feasibility in a idealistic fashion enables to ensure the efficiency and the applicability.
arXiv Detail & Related papers (2021-08-27T19:03:32Z) - Fast Rates for Contextual Linear Optimization [52.39202699484225]
We show that a naive plug-in approach achieves regret convergence rates that are significantly faster than methods that directly optimize downstream decision performance.
Our results are overall positive for practice: predictive models are easy and fast to train using existing tools, simple to interpret, and, as we show, lead to decisions that perform very well.
arXiv Detail & Related papers (2020-11-05T18:43:59Z) - Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.