DF2: Distribution-Free Decision-Focused Learning
- URL: http://arxiv.org/abs/2308.05889v1
- Date: Fri, 11 Aug 2023 00:44:46 GMT
- Title: DF2: Distribution-Free Decision-Focused Learning
- Authors: Lingkai Kong, Wenhao Mu, Jiaming Cui, Yuchen Zhuang, B. Aditya
Prakash, Bo Dai, Chao Zhang
- Abstract summary: Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model mismatch error, sample average approximation error, and gradient approximation error.
We present DF2 -- the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks.
- Score: 53.2476224456902
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decision-focused learning (DFL) has recently emerged as a powerful approach
for predict-then-optimize problems by customizing a predictive model to a
downstream optimization task. However, existing end-to-end DFL methods are
hindered by three significant bottlenecks: model mismatch error, sample average
approximation error, and gradient approximation error. Model mismatch error
stems from the misalignment between the model's parameterized predictive
distribution and the true probability distribution. Sample average
approximation error arises when using finite samples to approximate the
expected optimization objective. Gradient approximation error occurs as DFL
relies on the KKT condition for exact gradient computation, while most methods
approximate the gradient for backpropagation in non-convex objectives. In this
paper, we present DF2 -- the first \textit{distribution-free} decision-focused
learning method explicitly designed to address these three bottlenecks. Rather
than depending on a task-specific forecaster that requires precise model
assumptions, our method directly learns the expected optimization function
during training. To efficiently learn the function in a data-driven manner, we
devise an attention-based model architecture inspired by the distribution-based
parameterization of the expected objective. Our method is, to the best of our
knowledge, the first to address all three bottlenecks within a single model. We
evaluate DF2 on a synthetic problem, a wind power bidding problem, and a
non-convex vaccine distribution problem, demonstrating the effectiveness of
DF2.
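As a rough, hypothetical sketch of that idea (our own names and shapes, not the authors' DF2 architecture): a network can map a context-decision pair (x, w) straight to a predicted objective value, with learned attention slots playing the role of the distribution-based parameterization.

```python
import torch
import torch.nn as nn

class ExpectedObjectiveSurrogate(nn.Module):
    """Hypothetical sketch: learn g(x, w) ~ E_{c ~ p(c|x)}[f(w, c)] directly
    from data, with learned attention slots standing in for the unknown
    conditional distribution p(c|x)."""

    def __init__(self, x_dim, w_dim, hidden=64, n_slots=32):
        super().__init__()
        self.encode = nn.Linear(x_dim + w_dim, hidden)
        self.keys = nn.Parameter(torch.randn(n_slots, hidden))
        self.values = nn.Parameter(torch.randn(n_slots, hidden))
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, w):
        q = self.encode(torch.cat([x, w], dim=-1))         # (batch, hidden)
        scores = q @ self.keys.T / q.shape[-1] ** 0.5      # scaled dot-product
        mix = torch.softmax(scores, dim=-1) @ self.values  # attention readout
        return self.head(mix).squeeze(-1)                  # predicted objective

# Training target: the realized task objective f(w, c) for observed (x, c)
# pairs, so the regression itself absorbs the expectation over p(c|x) --
# no parametric forecaster and no sample averaging at decision time.
```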
Related papers
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm that allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
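Schematically (an illustrative Chow-style abstention rule under our own names and threshold, not the paper's exact construction), a learned density ratio can gate predictions:

```python
def predict_or_reject(model, density_ratio, x, tau=0.5):
    """Illustrative rejector (hypothetical names): abstain on inputs where
    the estimated ratio between the idealized data distribution and the
    actual one falls below a threshold tau."""
    r = density_ratio(x)                    # learned density-ratio estimate
    return model(x) if r >= tau else None   # None encodes abstention
```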
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Gradient Guidance for Diffusion Models: An Optimization Perspective [45.6080199096424]
We study the theoretical aspects of a guided score-based sampling process.
We show that adding gradient guidance to the sampling process of a pre-trained diffusion model is essentially equivalent to solving a regularized optimization problem.
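A toy way to see the mechanism (a Langevin-style sampler with hypothetical callables, not the paper's exact sampler): guidance simply adds the objective's gradient to the pre-trained score.

```python
import numpy as np

def guided_langevin(score, grad_f, x0, steps=200, eta=1e-2, scale=1.0, seed=0):
    """Toy sketch: Langevin sampling from a pre-trained score function with
    additive gradient guidance; the paper's analysis interprets the guided
    process as solving a regularized optimization problem."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        drift = score(x) + scale * grad_f(x)  # guidance enters here
        x = x + eta * drift + np.sqrt(2 * eta) * rng.standard_normal(x.shape)
    return x
```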
arXiv Detail & Related papers (2024-04-23T04:51:02Z)
- Exploiting Diffusion Prior for Generalizable Dense Prediction [85.4563592053464]
The content generated by recent advanced Text-to-Image (T2I) diffusion models is sometimes too imaginative for existing off-the-shelf dense predictors to estimate.
We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks.
Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-30T18:59:44Z)
- Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss between real data and artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
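For orientation, the standard logistic NCE objective referenced above can be sketched as follows (generic form with equal data and noise sample counts, not this paper's compositional reformulation):

```python
import numpy as np

def nce_loss(log_model, log_noise, x_data, x_noise):
    """Logistic NCE: classify real data (label 1) against artificial noise
    (label 0) using the logit log p_theta(x) - log q(x) of an unnormalized
    model p_theta against a known noise density q."""
    def log_sigmoid(z):
        return -np.logaddexp(0.0, -z)
    logit_data = log_model(x_data) - log_noise(x_data)
    logit_noise = log_model(x_noise) - log_noise(x_noise)
    return -(log_sigmoid(logit_data).mean() + log_sigmoid(-logit_noise).mean())
```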
arXiv Detail & Related papers (2023-06-13T01:18:16Z)
- Adversarial Adaptive Sampling: Unify PINN and Optimal Transport for the Approximation of PDEs [2.526490864645154]
We propose a new minmax formulation that simultaneously optimizes the approximate solution, given by a neural network model, and the random samples in the training set.
The key idea is to use a deep generative model to adjust random samples in the training set such that the residual induced by the approximate PDE solution can maintain a smooth profile.
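Schematically, in our notation (not necessarily the paper's), the minmax objective couples a solver network $u_\theta$ with a sample generator $p_\phi$:
$$\min_{\theta} \max_{\phi} \; \mathbb{E}_{x \sim p_{\phi}}\big[\, |\mathcal{R}[u_{\theta}](x)|^{2} \,\big],$$
where $\mathcal{R}[u_\theta]$ denotes the PDE residual: the generator moves training samples toward regions of large residual while the solver drives the residual down, which is what keeps its profile smooth.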
arXiv Detail & Related papers (2023-05-30T02:59:18Z)
- Error Bounds for Flow Matching Methods [38.9898500163582]
Flow matching methods approximate a flow between two arbitrary probability distributions.
We present error bounds for the flow matching procedure using fully deterministic sampling, assuming an $L^2$ bound on the approximation error and a certain regularity on the data distributions.
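A rough statement of the ingredients (our paraphrase and notation): the deterministic sampler integrates
$$\frac{d}{dt}\psi_t(x) = v_\theta(\psi_t(x), t),$$
and an approximation guarantee $\|v_\theta - v\|_{L^2} \le \epsilon$ on the learned versus true velocity field, together with regularity of the two data distributions, is converted into a bound on how far the distribution produced by the flow can drift from the target.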
arXiv Detail & Related papers (2023-05-26T12:13:53Z)
- Performative Prediction with Bandit Feedback: Learning through Reparameterization [25.169419772432796]
We develop a framework that reparameterizes the performative prediction objective as a function of the induced data distribution.
We provide a regret bound that is sublinear in the total number of performative samples taken and depends only on the dimension of the model parameter.
On the application side, we believe our method is useful for large online recommendation systems like YouTube or TikTok.
arXiv Detail & Related papers (2023-05-01T21:31:29Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples whose confidence exceeds the threshold.
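The recipe is short enough to sketch (the max-confidence variant, with our own array names):

```python
import numpy as np

def atc_estimate(val_conf, val_correct, target_conf):
    """Average Thresholded Confidence: choose the threshold t on labeled
    source-validation data so that the fraction of confidences above t
    matches validation accuracy, then report the fraction of unlabeled
    target confidences above t as the predicted target accuracy."""
    t = np.quantile(val_conf, 1.0 - val_correct.mean())
    return float((target_conf > t).mean())
```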
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common stochastic optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
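One ingredient consistent with that Markov-chain view is averaging the tail of the optimizer's iterates instead of trusting the final one (an illustrative sketch, not the paper's full set of diagnostics):

```python
import numpy as np

def tail_average(iterates, burn_frac=0.5):
    """Polyak-Ruppert-style tail averaging: treat post-burn-in SGD iterates
    as draws from a stationary chain and average them for a lower-variance
    estimate of the variational optimum."""
    iterates = np.asarray(iterates)
    start = int(len(iterates) * burn_frac)
    return iterates[start:].mean(axis=0)
```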
arXiv Detail & Related papers (2020-09-01T19:12:11Z)
- SODEN: A Scalable Continuous-Time Survival Model through Ordinary Differential Equation Networks [14.564168076456822]
We propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms.
We demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models.
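Roughly, in our notation (a simplified schematic, not the paper's exact parameterization), such a model ties the survival function to an ODE on the cumulative hazard:
$$S(t \mid x) = \exp\!\big(-\Lambda(t \mid x)\big), \qquad \frac{d\Lambda(t \mid x)}{dt} = h_\theta\big(t, \Lambda(t \mid x), x\big), \qquad \Lambda(0 \mid x) = 0,$$
with a nonnegative neural network $h_\theta$, so likelihoods for censored and uncensored observations can be evaluated with an ODE solver and optimized at scale.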
arXiv Detail & Related papers (2020-08-19T19:11:25Z)