Risk Guarantees for End-to-End Prediction and Optimization Processes
- URL: http://arxiv.org/abs/2012.15046v1
- Date: Wed, 30 Dec 2020 05:20:26 GMT
- Title: Risk Guarantees for End-to-End Prediction and Optimization Processes
- Authors: Nam Ho-Nguyen and Fatma K{\i}l{\i}n\c{c}-Karzan
- Abstract summary: We study conditions that allow us to explicitly describe how the prediction performance governs the optimization performance.
We derive the exact theoretical relationship between prediction performance measured with the squared loss, as well as a class of symmetric loss functions, and the subsequent optimization performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prediction models are often employed in estimating parameters of optimization
models. Despite the fact that in an end-to-end view, the real goal is to
achieve good optimization performance, the prediction performance is measured
on its own. While it is usually believed that good prediction performance in
estimating the parameters will result in good subsequent optimization
performance, formal theoretical guarantees on this are notably lacking. In this
paper, we explore conditions that allow us to explicitly describe how the
prediction performance governs the optimization performance. Our weaker
condition allows for an asymptotic convergence result, while our stronger
condition allows for exact quantification of the optimization performance in
terms of the prediction performance. In general, verification of these
conditions is a non-trivial task. Nevertheless, we show that our weaker
condition is equivalent to the well-known Fisher consistency concept from the
learning theory literature. This then allows us to easily check our weaker
condition for several loss functions. We also establish that the squared error
loss function satisfies our stronger condition. Consequently, we derive the
exact theoretical relationship between prediction performance measured with the
squared loss, as well as a class of symmetric loss functions, and the
subsequent optimization performance. In a computational study on portfolio
optimization, fractional knapsack and multiclass classification problems, we
compare the optimization performance of using several prediction loss
functions (some that are Fisher consistent and some that are not) and
demonstrate that lack of consistency of the loss function can indeed have a
detrimental effect on performance.
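The two-stage pipeline studied here can be made concrete with a small example. The following Python sketch is purely illustrative (synthetic data, not the paper's experimental setup): it fits a linear model under the squared loss, plugs the predicted parameters into a fractional knapsack problem, and reports both the prediction loss and the resulting optimization regret.
```python
# A minimal sketch (not the paper's setup) of predict-then-optimize:
# Stage 1 fits a model with the squared loss, Stage 2 plugs the predicted
# parameters into the optimization model, and we measure the downstream regret.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic fractional knapsack: item values depend linearly on features.
n_items, n_features, capacity = 50, 5, 10.0
X = rng.normal(size=(n_items, n_features))        # item features
w_true = rng.normal(size=n_features)              # unknown true coefficients
values_true = X @ w_true + 0.1 * rng.normal(size=n_items)
weights = rng.uniform(0.5, 2.0, size=n_items)     # item weights

def fractional_knapsack(values, weights, capacity):
    """Greedy optimal solution: take items in decreasing value/weight ratio."""
    order = np.argsort(-values / weights)
    x, remaining = np.zeros_like(values), capacity
    for i in order:
        take = min(1.0, remaining / weights[i])
        if take <= 0 or values[i] <= 0:
            break
        x[i], remaining = take, remaining - take * weights[i]
    return x

# Stage 1: prediction with the squared loss (ordinary least squares).
w_hat, *_ = np.linalg.lstsq(X, values_true, rcond=None)
values_pred = X @ w_hat
prediction_loss = np.mean((values_pred - values_true) ** 2)

# Stage 2: optimize with predicted values, evaluate with the true values.
x_pred = fractional_knapsack(values_pred, weights, capacity)
x_opt = fractional_knapsack(values_true, weights, capacity)
regret = values_true @ x_opt - values_true @ x_pred   # optimization gap

print(f"squared prediction loss: {prediction_loss:.4f}")
print(f"optimization regret:     {regret:.4f}")
```
The paper's conditions characterize when, and how tightly, a small prediction loss of this kind forces a small downstream regret; swapping the squared loss in Stage 1 for a loss that is not Fisher consistent is the comparison the computational study carries out.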
Related papers
- A Novel Unified Parametric Assumption for Nonconvex Optimization [53.943470475510196]
Nonconvex optimization is central to machine learning, but the general framework of nonconvexity yields only weak convergence guarantees that are too pessimistic compared to what is observed in practice.
We introduce a novel unified parametric assumption for the analysis of nonconvex optimization algorithms.
arXiv Detail & Related papers (2025-02-17T21:25:31Z)
- Variational Inference with Coverage Guarantees in Simulation-Based Inference [18.818573945984873]
We propose Conformalized Amortized Neural Variational Inference (CANVI)
CANVI constructs conformalized predictors based on each candidate, compares the predictors using a metric known as predictive efficiency, and returns the most efficient predictor.
We prove lower bounds on the predictive efficiency of the regions produced by CANVI and explore how the quality of a posterior approximation relates to the predictive efficiency of prediction regions based on that approximation.
arXiv Detail & Related papers (2023-05-23T17:24:04Z)
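As a rough, hypothetical illustration of the selection principle described above (a simplified stand-in using regression intervals, not the CANVI construction over amortized posterior approximations), the sketch below assumes split conformal prediction for a few candidate predictors and uses average interval width as the predictive-efficiency proxy.
```python
# A minimal sketch of the general idea (not the CANVI algorithm itself):
# wrap each candidate predictor with split conformal prediction, measure
# predictive efficiency as interval width at a fixed coverage level,
# and keep the most efficient (narrowest) candidate.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(-3, 3, size=600)
y = np.sin(x) + 0.2 * rng.normal(size=x.size)

# Hypothetical candidate predictors: polynomial fits of different degrees.
train, calib, test = np.split(rng.permutation(x.size), [300, 450])
candidates = {d: np.polyfit(x[train], y[train], d) for d in (1, 3, 7)}

alpha = 0.1  # target 90% coverage
best_degree, best_width = None, np.inf
for degree, coef in candidates.items():
    # Split conformal: calibrate the residual quantile on held-out data.
    resid = np.abs(y[calib] - np.polyval(coef, x[calib]))
    k = int(np.ceil((1 - alpha) * (resid.size + 1)))
    q = np.sort(resid)[min(k, resid.size) - 1]
    width = 2 * q  # constant-width intervals, so width doubles as efficiency
    coverage = np.mean(np.abs(y[test] - np.polyval(coef, x[test])) <= q)
    print(f"degree {degree}: width {width:.3f}, test coverage {coverage:.2f}")
    if width < best_width:
        best_degree, best_width = degree, width

print(f"most efficient candidate: degree {best_degree}")
```
Interval width stands in here for the predictive-efficiency metric; the actual method scores amortized variational posterior candidates rather than point regressors.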
- Bayesian Optimization with Conformal Prediction Sets [44.565812181545645]
Conformal prediction is an uncertainty quantification method with coverage guarantees even for misspecified models.
We propose conformal Bayesian optimization, which directs queries towards regions of the search space where the model predictions have guaranteed validity.
In many cases we find that query coverage can be significantly improved without harming sample-efficiency.
arXiv Detail & Related papers (2022-10-22T17:01:05Z)
- Generalizing Bayesian Optimization with Decision-theoretic Entropies [102.82152945324381]
We consider a generalization of Shannon entropy from work in statistical decision theory.
We first show that special cases of this entropy lead to popular acquisition functions used in BO procedures.
We then show how alternative choices for the loss yield a flexible family of acquisition functions.
arXiv Detail & Related papers (2022-10-04T04:43:58Z)
- Efficient and Differentiable Conformal Prediction with General Function Classes [96.74055810115456]
We propose a generalization of conformal prediction to multiple learnable parameters.
We show that it achieves approximate valid population coverage and near-optimal efficiency within class.
Experiments show that our algorithm is able to learn valid prediction sets and improve the efficiency significantly.
arXiv Detail & Related papers (2022-02-22T18:37:23Z)
- Implicit Rate-Constrained Optimization of Non-decomposable Objectives [37.43791617018009]
We consider a family of constrained optimization problems arising in machine learning.
Our key idea is to formulate a rate-constrained optimization problem that expresses the threshold parameter as a function of the model parameters.
We show how the resulting optimization problem can be solved using standard gradient-based methods.
arXiv Detail & Related papers (2021-07-23T00:04:39Z)
- The Perils of Learning Before Optimizing [16.97597806975415]
We show how prediction models can be learned end-to-end by differentiating through the optimization task.
We show that the performance gap between a two-stage and an end-to-end approach is closely related to the "price of correlation" concept in optimization.
arXiv Detail & Related papers (2021-06-18T20:43:47Z)
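A minimal, hypothetical sketch of this end-to-end idea (not the paper's construction): replace the argmax in a pick-one-item decision with a softmax relaxation, so the decision loss becomes differentiable in the prediction model's parameters and the model can be trained by differentiating through the optimization step.
```python
# Illustrative end-to-end ("decision-focused") training, assuming a softmax
# relaxation of a pick-one-item decision; a stand-in, not the paper's method.
import torch

torch.manual_seed(0)
n_items, n_features = 20, 4
X = torch.randn(n_items, n_features)
w_true = torch.randn(n_features)
values_true = X @ w_true + 0.1 * torch.randn(n_items)

model = torch.nn.Linear(n_features, 1, bias=False)
opt = torch.optim.Adam(model.parameters(), lr=0.05)

for step in range(200):
    values_pred = model(X).squeeze(-1)
    # Differentiable surrogate for argmax: a softmax "decision" over items.
    decision = torch.softmax(values_pred / 0.1, dim=0)
    # Decision loss: negative realized value of the (relaxed) decision.
    loss = -(decision * values_true).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    chosen = model(X).squeeze(-1).argmax().item()
print(f"item chosen end-to-end: {chosen}, true best item: {values_true.argmax().item()}")
```
The temperature 0.1 controls how closely the relaxed decision tracks the argmax; for richer decision problems, differentiable optimization layers play the analogous role.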
- Towards More Fine-grained and Reliable NLP Performance Prediction [85.78131503006193]
We make two contributions to improving performance prediction for NLP tasks.
First, we examine performance predictors for holistic measures of accuracy like F1 or BLEU.
Second, we propose methods to understand the reliability of a performance prediction model from two angles: confidence intervals and calibration.
arXiv Detail & Related papers (2021-02-10T15:23:20Z)
- Fast Rates for Contextual Linear Optimization [52.39202699484225]
We show that a naive plug-in approach achieves regret convergence rates that are significantly faster than those of methods that directly optimize downstream decision performance.
Our results are overall positive for practice: predictive models are easy and fast to train using existing tools, simple to interpret, and, as we show, lead to decisions that perform very well.
arXiv Detail & Related papers (2020-11-05T18:43:59Z)
- Mixability of Integral Losses: a Key to Efficient Online Aggregation of Functional and Probabilistic Forecasts [72.32459441619388]
We adapt basic mixable (and exponentially concave) loss functions to compare functional predictions and prove that these adaptations are also mixable (exp-concave).
As an application of our main result, we prove that various loss functions used for probabilistic forecasting are mixable (exp-concave).
arXiv Detail & Related papers (2019-12-15T14:25:33Z)