Learning Where to Learn: Training Distribution Selection for Provable OOD Performance
- URL: http://arxiv.org/abs/2505.21626v1
- Date: Tue, 27 May 2025 18:00:58 GMT
- Title: Learning Where to Learn: Training Distribution Selection for Provable OOD Performance
- Authors: Nicolas Guerra, Nicholas H. Nelsen, Yunan Yang
- Abstract summary: Out-of-distribution (OOD) generalization remains a fundamental challenge in machine learning. This paper studies the design of training data distributions that maximize average-case OOD performance.
- Score: 2.7309692684728617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Out-of-distribution (OOD) generalization remains a fundamental challenge in machine learning. Models trained on one data distribution often experience substantial performance degradation when evaluated on shifted or unseen domains. To address this challenge, the present paper studies the design of training data distributions that maximize average-case OOD performance. First, a theoretical analysis establishes a family of generalization bounds that quantify how the choice of training distribution influences OOD error across a predefined family of target distributions. These insights motivate the introduction of two complementary algorithmic strategies: (i) directly formulating OOD risk minimization as a bilevel optimization problem over the space of probability measures and (ii) minimizing a theoretical upper bound on OOD error. Finally, the paper evaluates the two approaches on a range of function approximation and operator learning examples. The proposed methods significantly improve OOD accuracy over standard empirical risk minimization with a fixed distribution. These results highlight the potential of distribution-aware training as a principled and practical framework for robust OOD generalization.
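The two algorithmic strategies lend themselves to a compact illustration. The sketch below is a hypothetical toy version of strategy (i), not the paper's implementation: the training distribution is a Gaussian over inputs, the inner ERM problem is solved for each candidate distribution, and the outer problem over the distribution's parameters is solved by direct grid search, scoring each candidate by its average risk over a predefined family of target distributions. The target function, the Gaussian parameterization, and the grid are all illustrative assumptions.

```python
# Hypothetical sketch of training-distribution selection as a bilevel problem:
# outer search over distribution parameters, inner ERM fit, scored by average
# risk over a family of target distributions. Toy setup, not the paper's code.
import itertools
import torch

torch.manual_seed(0)

def f_true(x):                                   # toy ground-truth function
    return torch.sin(3.0 * x)

TARGET_MEANS = (-2.0, 0.0, 2.0)                  # predefined target-distribution family

def avg_ood_risk(model, n=512):
    """Average squared error over the target-distribution family."""
    risks = []
    with torch.no_grad():
        for m in TARGET_MEANS:
            x = m + torch.randn(n, 1)
            risks.append(((model(x) - f_true(x)) ** 2).mean())
    return torch.stack(risks).mean().item()

def fit_erm(mu, sigma, steps=300):
    """Inner problem: ERM on samples drawn from N(mu, sigma^2)."""
    model = torch.nn.Sequential(
        torch.nn.Linear(1, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(steps):
        x = mu + sigma * torch.randn(128, 1)
        loss = ((model(x) - f_true(x)) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
    return model

# Outer problem: direct search over training-distribution parameters.
best = None
for mu, sigma in itertools.product((-2.0, 0.0, 2.0), (0.5, 1.0, 2.0, 4.0)):
    risk = avg_ood_risk(fit_erm(mu, sigma))
    if best is None or risk < best[0]:
        best = (risk, mu, sigma)

print(f"best avg OOD risk {best[0]:.4f} at mu={best[1]}, sigma={best[2]}")
```

A broad training distribution that covers all three target means should win this search, which is the intuition the paper's bounds formalize: the choice of training distribution controls OOD error across the whole target family.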
Related papers
- Bias as a Virtue: Rethinking Generalization under Distribution Shifts [7.389812496011288]
Machine learning models often degrade when deployed on data distributions different from their training data. We show that higher in-distribution (ID) bias can lead to better out-of-distribution (OOD) generalization. Our work provides both a practical method for improving generalization and a theoretical framework for reconsidering the role of bias in robust machine learning.
arXiv Detail & Related papers (2025-05-31T05:54:49Z)
- Distributionally Robust Graph Out-of-Distribution Recommendation via Diffusion Model [7.92181856602497]
We design a Distributionally Robust Graph model for OOD recommendation (DRGO). Specifically, our method employs a simple and effective diffusion paradigm to alleviate the noisy effect in the latent space (a generic denoising sketch follows this entry). We provide a theoretical proof of the generalization error bound of DRGO as well as a theoretical analysis of how our approach mitigates noisy sample effects.
arXiv Detail & Related papers (2025-01-26T15:07:52Z)
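The DRGO summary above mentions a diffusion paradigm for denoising the latent space but gives no implementation details. The following is a generic denoising-objective sketch under assumed names and shapes (a single noise level, a small MLP denoiser), not DRGO's actual model:

```python
# Generic latent-space denoising (diffusion-style) objective: corrupt latents
# with Gaussian noise and train a denoiser to predict the injected noise.
# All names, shapes, and the single noise level are illustrative assumptions.
import torch

denoiser = torch.nn.Sequential(
    torch.nn.Linear(64, 128), torch.nn.SiLU(), torch.nn.Linear(128, 64))
opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

def diffusion_step(z: torch.Tensor, sigma: float = 0.5) -> float:
    """One training step of the noise-prediction (DDPM-style) objective."""
    eps = torch.randn_like(z)
    z_noisy = z + sigma * eps
    loss = ((denoiser(z_noisy) - eps) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Usage with a batch of hypothetical user/item latents:
# loss = diffusion_step(torch.randn(32, 64))
```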
- A Practical Theory of Generalization in Selectivity Learning [8.268822578361824]
Query-driven machine learning models have emerged as a promising estimation technique for query selectivities. We bridge gaps in state-of-the-art (SOTA) theory based on the Probably Approximately Correct (PAC) learning framework. We show that selectivity predictors induced by signed measures are learnable, which relaxes the reliance on probability measures in SOTA theory.
arXiv Detail & Related papers (2024-09-11T05:10:32Z)
- Out-of-Distribution Learning with Human Feedback [26.398598663165636]
This paper presents a novel framework for OOD learning with human feedback.
Our framework capitalizes on the freely available unlabeled data in the wild.
By exploiting human feedback, we enhance the robustness and reliability of machine learning models.
arXiv Detail & Related papers (2024-08-14T18:49:27Z)
- A Survey on Evaluation of Out-of-Distribution Generalization [41.39827887375374]
Out-of-Distribution (OOD) generalization is a complex and fundamental problem.
This paper serves as the first effort to conduct a comprehensive review of OOD evaluation.
We categorize existing research into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization.
arXiv Detail & Related papers (2024-03-04T09:30:35Z)
- Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that simultaneously improves both OOD accuracy and confidence calibration in vision-language models.
We show that OOD classification and OOD calibration errors share an upper bound consisting of two terms computed on ID data.
Based on this insight, we design a novel framework that fine-tunes with a constrained multimodal contrastive loss enforcing a larger smallest singular value (a sketch of such a regularizer follows this entry).
arXiv Detail & Related papers (2023-11-03T05:41:25Z)
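A minimal sketch of the kind of smallest-singular-value regularizer described in the entry above, assuming batch embeddings of shape (batch, dim); the helper name and the commented usage line are hypothetical, not the paper's API:

```python
# Encourage a larger smallest singular value of the batch feature matrix by
# penalizing its negation; minimizing the total loss then pushes it upward.
import torch

def smallest_singular_value_penalty(features: torch.Tensor) -> torch.Tensor:
    """features: (batch, dim) embeddings; returns -sigma_min."""
    sigma = torch.linalg.svdvals(features)   # singular values, descending order
    return -sigma[-1]

# Hypothetical use inside a fine-tuning step:
# loss = contrastive_loss(img_emb, txt_emb) \
#        + lam * smallest_singular_value_penalty(img_emb)
```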
- Graph Structure and Feature Extrapolation for Out-of-Distribution Generalization [54.64375566326931]
Out-of-distribution (OOD) generalization deals with the prevalent learning scenario where the test distribution shifts away from the training distribution.
We propose to achieve graph OOD generalization with the novel design of non-Euclidean-space linear extrapolation.
Our design tailors OOD samples for specific shifts without corrupting underlying causal mechanisms.
arXiv Detail & Related papers (2023-06-13T18:46:28Z)
- CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models show impressive zero-shot ability, but further adapting CLIP to downstream tasks undesirably degrades OOD performance.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z)
- Pseudo-OOD training for robust language models [78.15712542481859]
OOD detection is a key component of a reliable machine-learning model for any industry-scale application.
We propose POORE (POsthoc pseudo-Ood REgularization), which generates pseudo-OOD samples using in-distribution (IND) data (a generic sketch of pseudo-OOD regularization follows this entry).
We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.
arXiv Detail & Related papers (2022-10-17T14:32:02Z)
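The POORE summary above does not spell out how the pseudo-OOD samples are built, so the following is a generic stand-in rather than POORE's actual recipe: pseudo-OOD embeddings are synthesized by mixing in-distribution (IND) embeddings, and a margin loss separates the OOD scores of the two groups. All names (make_pseudo_ood, ood_margin_loss, score_fn) are assumptions.

```python
# Generic pseudo-OOD regularization sketch: synthesize pseudo-OOD points from
# IND embeddings and push their OOD scores above IND scores by a margin.
import torch

def make_pseudo_ood(ind_emb: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """Mix each IND embedding with a shuffled partner to leave the IND manifold."""
    perm = torch.randperm(ind_emb.size(0))
    return alpha * ind_emb + (1.0 - alpha) * ind_emb[perm]

def ood_margin_loss(score_fn, ind_emb: torch.Tensor, margin: float = 1.0):
    """Hinge loss: pseudo-OOD scores should exceed IND scores by `margin`.
    `score_fn` maps embeddings to per-sample OOD scores (higher = more OOD)."""
    pseudo = make_pseudo_ood(ind_emb)
    gap = score_fn(ind_emb) - score_fn(pseudo) + margin
    return torch.relu(gap).mean()
```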
- SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA).
Our analysis shows that LoRA fine-tuning consistently achieves significantly better OOD generalization than full fine-tuning across various scenarios (a minimal LoRA layer is sketched after this entry).
arXiv Detail & Related papers (2022-10-10T16:07:24Z)
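Low-Rank Adaptation itself is standard: the base weight is frozen and a low-rank update B A, scaled by alpha/r, is learned on top of it. A minimal self-contained layer follows; the wrapper class name is chosen here for illustration.

```python
# Minimal LoRA layer: y = base(x) + (alpha/r) * x A^T B^T, with the base
# weights frozen and only the low-rank factors A, B trained.
import torch

class LoRALinear(torch.nn.Module):
    def __init__(self, base: torch.nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():         # freeze the pre-trained weight
            p.requires_grad_(False)
        self.A = torch.nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = torch.nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r                   # B starts at zero, so the
                                                 # update is initially inactive

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.t() @ self.B.t())

# layer = LoRALinear(torch.nn.Linear(768, 768), r=8)
```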
- Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be relied on as predictive clues.
We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders.
We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
- Improved OOD Generalization via Adversarial Training and Pre-training [49.08683910076778]
In this paper, we theoretically show that a model robust to input perturbations generalizes well on OOD data.
Inspired by previous findings that adversarial training helps improve input-robustness, we show that adversarially trained models achieve convergent excess risk on OOD data (a minimal adversarial-training step is sketched after this entry).
arXiv Detail & Related papers (2021-05-24T08:06:35Z)
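As a concrete instance of the input-robust training this entry connects to OOD generalization, here is a minimal single-step adversarial-training (FGSM) sketch; the epsilon value and helper name are illustrative.

```python
# One adversarial-training step: craft an FGSM perturbation of the inputs,
# then take a gradient step on the loss at the perturbed inputs.
import torch

def fgsm_training_step(model, loss_fn, opt, x, y, eps: float = 0.03) -> float:
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).detach()  # worst-case point within eps
    opt.zero_grad()
    adv_loss = loss_fn(model(x_adv), y)
    adv_loss.backward()
    opt.step()
    return adv_loss.item()
```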