DF2: Distribution-Free Decision-Focused Learning
- URL: http://arxiv.org/abs/2308.05889v1
- Date: Fri, 11 Aug 2023 00:44:46 GMT
- Title: DF2: Distribution-Free Decision-Focused Learning
- Authors: Lingkai Kong, Wenhao Mu, Jiaming Cui, Yuchen Zhuang, B. Aditya
Prakash, Bo Dai, Chao Zhang
- Abstract summary: Decision-focused learning (DFL) has recently emerged as a powerful approach for predict-then-optimize problems.
Existing end-to-end DFL methods are hindered by three significant bottlenecks: model mismatch error, sample average approximation error, and gradient approximation error.
We present DF2 -- the first distribution-free decision-focused learning method explicitly designed to address these three bottlenecks.
- Score: 53.2476224456902
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decision-focused learning (DFL) has recently emerged as a powerful approach
for predict-then-optimize problems by customizing a predictive model to a
downstream optimization task. However, existing end-to-end DFL methods are
hindered by three significant bottlenecks: model mismatch error, sample average
approximation error, and gradient approximation error. Model mismatch error
stems from the misalignment between the model's parameterized predictive
distribution and the true probability distribution. Sample average
approximation error arises when using finite samples to approximate the
expected optimization objective. Gradient approximation error occurs as DFL
relies on the KKT condition for exact gradient computation, while most methods
approximate the gradient for backpropagation in non-convex objectives. In this
paper, we present DF2 -- the first \textit{distribution-free} decision-focused
learning method explicitly designed to address these three bottlenecks. Rather
than depending on a task-specific forecaster that requires precise model
assumptions, our method directly learns the expected optimization function
during training. To efficiently learn the function in a data-driven manner, we
devise an attention-based model architecture inspired by the distribution-based
parameterization of the expected objective. Our method is, to the best of our
knowledge, the first to address all three bottlenecks within a single model. We
evaluate DF2 on a synthetic problem, a wind power bidding problem, and a
non-convex vaccine distribution problem, demonstrating the effectiveness of
DF2.
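As a toy illustration of the predict-then-optimize setting the abstract describes (this is not the DF2 method itself; all names and data below are made up), consider a one-dimensional decision problem whose optimal decision has a closed form, so the decision loss can be differentiated directly through the predictor:

```python
import numpy as np

# Toy predict-then-optimize sketch (illustrative only; not DF2).
# Decision problem: choose z minimizing f(z; c) = c*z + z**2 / 2.
# The optimum is z*(c) = -c in closed form, so the decision loss -- the
# true objective evaluated at the decision induced by the prediction --
# is differentiable and the predictor can be trained end-to-end.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
c_true = 2.0 * x + 0.1 * rng.normal(size=200)  # true cost coefficients

w = 0.0   # linear predictor: c_hat = w * x
lr = 0.05
for _ in range(200):
    c_hat = w * x
    z = -c_hat  # decision induced by the predicted cost
    # decision loss L(w) = mean(c_true * z + z**2 / 2);
    # dL/dw = mean((c_true + z) * dz/dw) with dz/dw = -x
    grad = np.mean((c_true - c_hat) * (-x))
    w -= lr * grad
```

The training signal here is the downstream objective, not a prediction loss; DF2 differs in that it learns the expected optimization objective directly rather than relying on a closed-form decision map, which rarely exists in practice.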
Related papers
- Debiasing Mini-Batch Quadratics for Applications in Deep Learning [22.90473935350847]
Quadratic approximations form a fundamental building block of machine learning methods.
When computations on the entire training set are intractable - typical for deep learning - the relevant quantities are computed on mini-batches.
We (i) show that this bias introduces a systematic error, (ii) provide a theoretical explanation for it, (iii) explain its relevance for second-order optimization and uncertainty quantification via the Laplace approximation in deep learning, and (iv) develop and evaluate debiasing strategies.
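A minimal numerical illustration of the mini-batch bias (independent of the paper's debiasing strategies, with synthetic per-example gradients): the mini-batch estimate of a simple quadratic quantity, the squared gradient norm, systematically overestimates the full-batch value by a variance term.

```python
import numpy as np

# Illustrative sketch: quadratic quantities computed on mini-batches are
# biased. Here E[||g_b||^2] = ||g||^2 + tr(Cov(g_b)), so averaging the
# squared norm of mini-batch gradients overestimates the full-batch one.
rng = np.random.default_rng(1)
# 10000 synthetic per-example gradients in 5 dimensions, N(1, 2^2)
per_example_grads = rng.normal(loc=1.0, scale=2.0, size=(10000, 5))
g_full = per_example_grads.mean(axis=0)

batch = 10
batch_grads = per_example_grads.reshape(-1, batch, 5).mean(axis=1)
est = np.mean(np.sum(batch_grads**2, axis=1))  # mini-batch estimate
true = np.sum(g_full**2)                       # full-batch value
# est exceeds true by roughly dim * var / batch = 5 * 4 / 10 = 2
```

The gap shrinks as the batch size grows, which is exactly why the error is easy to miss at large batch sizes and systematic at small ones.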
arXiv Detail & Related papers (2024-10-18T09:37:05Z) - Bayesian Estimation and Tuning-Free Rank Detection for Probability Mass Function Tensors [17.640500920466984]
This paper presents a novel framework for estimating the joint PMF and automatically inferring its rank from observed data.
We derive a deterministic solution based on variational inference (VI) to approximate the posterior distributions of various model parameters. Additionally, we develop a scalable version of the VI-based approach by leveraging stochastic variational inference (SVI).
Experiments involving both synthetic data and real movie recommendation data illustrate the advantages of our VI and SVI-based methods in terms of estimation accuracy, automatic rank detection, and computational efficiency.
arXiv Detail & Related papers (2024-10-08T20:07:49Z) - OPUS: Occupancy Prediction Using a Sparse Set [64.60854562502523]
We present a framework to simultaneously predict occupied locations and classes using a set of learnable queries.
OPUS incorporates a suite of non-trivial strategies to enhance model performance.
Our lightest model achieves superior RayIoU on the Occ3D-nuScenes dataset at near 2x FPS, while our heaviest model surpasses previous best results by 6.1 RayIoU.
arXiv Detail & Related papers (2024-09-14T07:44:22Z) - Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z) - Exploiting Diffusion Prior for Generalizable Dense Prediction [85.4563592053464]
Content generated by recent advanced Text-to-Image (T2I) diffusion models is sometimes too imaginative for existing off-the-shelf dense predictors to estimate.
We introduce DMP, a pipeline utilizing pre-trained T2I models as a prior for dense prediction tasks.
Despite limited-domain training data, the approach yields faithful estimations for arbitrary images, surpassing existing state-of-the-art algorithms.
arXiv Detail & Related papers (2023-11-30T18:59:44Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss of the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Adversarial Adaptive Sampling: Unify PINN and Optimal Transport for the Approximation of PDEs [2.526490864645154]
We propose a new minimax formulation to simultaneously optimize the approximate solution, given by a neural network model, and the random samples in the training set.
The key idea is to use a deep generative model to adjust random samples in the training set such that the residual induced by the approximate PDE solution can maintain a smooth profile.
arXiv Detail & Related papers (2023-05-30T02:59:18Z) - Performative Prediction with Bandit Feedback: Learning through Reparameterization [23.039885534575966]
Performative prediction is a framework for studying social prediction in which the data distribution itself changes in response to the deployment of a model.
We develop a reparameterization that expresses the performative prediction objective as a function of the induced data distribution.
arXiv Detail & Related papers (2023-05-01T21:31:29Z) - Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence, predicting accuracy as the fraction of unlabeled examples.
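The ATC recipe as summarized above is simple enough to sketch end to end (the confidences and distribution shift below are synthetic, for illustration only): pick a confidence threshold on labeled source data so that the fraction above it matches source accuracy, then estimate target accuracy as the fraction of unlabeled target examples above that threshold.

```python
import numpy as np

# Sketch of Average Thresholded Confidence (ATC) on synthetic confidences.
rng = np.random.default_rng(2)

src_conf = rng.beta(5, 2, size=5000)       # model confidences on source
src_correct = rng.random(5000) < src_conf  # calibrated toy correctness
src_acc = src_correct.mean()

# Learn threshold t such that mean(src_conf > t) matches source accuracy.
t = np.quantile(src_conf, 1.0 - src_acc)

# Shifted target distribution: lower confidences, no labels needed.
tgt_conf = rng.beta(4, 3, size=5000)
predicted_tgt_acc = np.mean(tgt_conf > t)
```

Since the target confidences are shifted downward here, the predicted target accuracy comes out below the source accuracy, matching the intuition that confidence drops under distribution shift.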
arXiv Detail & Related papers (2022-01-11T23:01:12Z) - SODEN: A Scalable Continuous-Time Survival Model through Ordinary
Differential Equation Networks [14.564168076456822]
We propose a flexible model for survival analysis using neural networks along with scalable optimization algorithms.
We demonstrate the effectiveness of the proposed method in comparison to existing state-of-the-art deep learning survival analysis models.
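The core construction behind an ODE-based continuous-time survival model can be sketched with a fixed toy hazard in place of a trained network: the survival curve solves dS/dt = -hazard(t) * S(t), integrated numerically.

```python
import numpy as np

# Illustrative only: survival curve from a hazard via an ODE. The hazard
# here is a fixed toy function, not a learned network as in the paper.
def hazard(t):
    return 0.5 + 0.1 * t  # toy increasing hazard

# Forward Euler on dS/dt = -hazard(t) * S(t), with S(0) = 1.
dt, T = 0.01, 5.0
ts = np.arange(0.0, T, dt)
S = np.empty_like(ts)
s = 1.0
for i, t in enumerate(ts):
    S[i] = s
    s -= dt * hazard(t) * s

# Closed form for this hazard: S(t) = exp(-(0.5*t + 0.05*t**2)),
# e.g. S(2) = exp(-1.2), which the Euler solution closely tracks.
```

In the neural-ODE setting the hazard is a network and the integration is done with an adaptive solver plus adjoint gradients, but the survival curve is obtained from the hazard in exactly this way.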
arXiv Detail & Related papers (2020-08-19T19:11:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.