What is the Right Notion of Distance between Predict-then-Optimize Tasks?
- URL: http://arxiv.org/abs/2409.06997v1
- Date: Wed, 11 Sep 2024 04:13:17 GMT
- Title: What is the Right Notion of Distance between Predict-then-Optimize Tasks?
- Authors: Paula Rodriguez-Diaz, Lingkai Kong, Kai Wang, David Alvarez-Melis, Milind Tambe
- Abstract summary: We show that traditional dataset distances, which rely solely on feature and label dimensions, lack informativeness in the Predict-then-Optimize (PtO) context.
We propose a new dataset distance that incorporates the impacts of downstream decisions.
Our results show that this decision-aware dataset distance effectively captures adaptation success in PtO contexts.
- Score: 35.842182348661076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Comparing datasets is a fundamental task in machine learning, essential for various learning paradigms, from evaluating train and test datasets for model generalization to using dataset similarity for detecting data drift. While traditional notions of dataset distances offer principled measures of similarity, their utility has largely been assessed through prediction error minimization. However, in Predict-then-Optimize (PtO) frameworks, where predictions serve as inputs for downstream optimization tasks, model performance is measured through decision regret minimization rather than prediction error minimization. In this work, we (i) show that traditional dataset distances, which rely solely on feature and label dimensions, lack informativeness in the PtO context, and (ii) propose a new dataset distance that incorporates the impacts of downstream decisions. Our results show that this decision-aware dataset distance effectively captures adaptation success in PtO contexts, providing a PtO adaptation bound in terms of dataset distance. Empirically, we show that our proposed distance measure accurately predicts transferability across three different PtO tasks from the literature.
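The core construction can be pictured as an optimal-transport dataset distance whose ground cost penalizes not only feature and label mismatch but also the decision regret incurred when the downstream optimizer acts on one sample's costs and is evaluated on another's. The sketch below is a minimal illustration of that idea, not the paper's exact definition: the toy top-k selection task, the uniform sample weights, the cost weights alpha/beta/gamma, and the use of the POT library are all assumptions made here for illustration.

```python
# Minimal sketch of a decision-aware dataset distance for PtO tasks.
# Assumes the POT library (pip install pot); the downstream task here is a
# toy "pick the k highest-value items" problem chosen only for illustration.
import numpy as np
import ot  # Python Optimal Transport


def top_k_decision(c, k=2):
    """Toy downstream optimization: select the k coordinates with largest value."""
    w = np.zeros_like(c)
    w[np.argsort(c)[-k:]] = 1.0
    return w


def decision_regret(c_src, c_tgt, k=2):
    """Regret of acting on c_src when the true objective is c_tgt (>= 0 here)."""
    w_src, w_tgt = top_k_decision(c_src, k), top_k_decision(c_tgt, k)
    return float(c_tgt @ w_tgt - c_tgt @ w_src)


def decision_aware_distance(X1, C1, X2, C2, alpha=1.0, beta=1.0, gamma=1.0):
    """OT cost with a ground cost over features, labels, and decision regret."""
    n, m = len(X1), len(X2)
    M = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            M[i, j] = (alpha * np.linalg.norm(X1[i] - X2[j]) ** 2
                       + beta * np.linalg.norm(C1[i] - C2[j]) ** 2
                       + gamma * decision_regret(C1[i], C2[j]))
    # Uniform weights on both datasets; ot.emd2 returns the optimal transport cost.
    return ot.emd2(ot.unif(n), ot.unif(m), M)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X1, C1 = rng.normal(size=(50, 5)), rng.normal(size=(50, 4))
    X2, C2 = rng.normal(size=(60, 5)), rng.normal(size=(60, 4))
    print(decision_aware_distance(X1, C1, X2, C2))
```

Note that the regret term makes the ground cost asymmetric, which is natural when measuring how well decisions learned on a source dataset carry over to a target dataset; a purely feature/label cost (gamma = 0) recovers a standard OT dataset distance.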
Related papers
- Towards Data-Efficient Pretraining for Atomic Property Prediction [51.660835328611626]
We show that pretraining on a task-relevant dataset can match or surpass large-scale pretraining.
We introduce the Chemical Similarity Index (CSI), a novel metric inspired by computer vision's Fréchet Inception Distance.
arXiv Detail & Related papers (2025-02-16T11:46:23Z) - Capturing the Temporal Dependence of Training Data Influence [100.91355498124527]
We formalize the concept of trajectory-specific leave-one-out influence, which quantifies the impact of removing a data point during training.
We propose data value embedding, a novel technique enabling efficient approximation of trajectory-specific LOO.
As data value embedding captures training data ordering, it offers valuable insights into model training dynamics.
arXiv Detail & Related papers (2024-12-12T18:28:55Z) - TAROT: Targeted Data Selection via Optimal Transport [64.56083922130269]
TAROT is a targeted data selection framework grounded in optimal transport theory.
Previous targeted data selection methods rely on influence-based greedy heuristics to enhance domain-specific performance.
We evaluate TAROT across multiple tasks, including semantic segmentation, motion prediction, and instruction tuning.
arXiv Detail & Related papers (2024-11-30T10:19:51Z) - RealTraj: Towards Real-World Pedestrian Trajectory Forecasting [10.332817296500533]
We propose a novel framework, RealTraj, that enhances the real-world applicability of trajectory forecasting.
We present Det2TrajFormer, a trajectory forecasting model that remains invariant to tracking noise by using past detections as inputs.
Unlike previous trajectory forecasting methods, our approach fine-tunes the model using only ground-truth detections, significantly reducing the need for costly person ID annotations.
arXiv Detail & Related papers (2024-11-26T12:35:26Z) - Improving Transferability for Cross-domain Trajectory Prediction via Neural Stochastic Differential Equation [41.09061877498741]
Discrepancies exist among datasets due to external factors and differing data acquisition strategies.
Models trained on large-scale datasets show limited transferability to other, smaller datasets despite their strong in-domain performance.
We propose a method based on continuous and stochastic representations of Neural Stochastic Differential Equations (NSDE) to alleviate these discrepancies.
The effectiveness of our method is validated against state-of-the-art trajectory prediction models on popular benchmark datasets: nuScenes, Argoverse, Lyft, INTERACTION, and the Open Motion Dataset.
arXiv Detail & Related papers (2023-12-26T06:50:29Z) - PPI++: Efficient Prediction-Powered Inference [31.403415618169433]
We present PPI++: a methodology for estimation and inference based on a small labeled dataset and a typically much larger dataset of machine-learning predictions.
The methods automatically adapt to the quality of available predictions, yielding easy-to-compute confidence sets.
PPI++ builds on prediction-powered inference (PPI), which targets the same problem setting, improving its computational and statistical efficiency.
arXiv Detail & Related papers (2023-11-02T17:59:04Z) - Prediction-Oriented Bayesian Active Learning [51.426960808684655]
Expected predictive information gain (EPIG) is an acquisition function that measures information gain in the space of predictions rather than parameters.
EPIG leads to stronger predictive performance compared with BALD across a range of datasets and models.
arXiv Detail & Related papers (2023-04-17T10:59:57Z) - Estimation of Local Average Treatment Effect by Data Combination [3.655021726150368]
It is important to estimate the local average treatment effect (LATE) when compliance with a treatment assignment is incomplete.
Previously proposed methods for LATE estimation required all relevant variables to be jointly observed in a single dataset.
We propose a weighted least squares estimator that enables simpler model selection by avoiding the minimax objective formulation.
arXiv Detail & Related papers (2021-09-11T03:51:48Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to perform inference directly from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.