Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing
- URL: http://arxiv.org/abs/2410.05980v1
- Date: Tue, 8 Oct 2024 12:26:48 GMT
- Title: Generalizing to any diverse distribution: uniformity, gentle finetuning and rebalancing
- Authors: Andreas Loukas, Karolis Martinkus, Ed Wagstaff, Kyunghyun Cho
- Abstract summary: We aim to develop models that generalize well to any diverse test distribution, even if the latter deviates significantly from the training data.
Various approaches like domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge.
We adopt a more conservative perspective by accounting for the worst-case error across all sufficiently diverse test distributions within a known domain.
- Score: 55.791818510796645
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As training datasets grow larger, we aspire to develop models that generalize well to any diverse test distribution, even if the latter deviates significantly from the training data. Various approaches like domain adaptation, domain generalization, and robust optimization attempt to address the out-of-distribution challenge by posing assumptions about the relation between training and test distribution. Differently, we adopt a more conservative perspective by accounting for the worst-case error across all sufficiently diverse test distributions within a known domain. Our first finding is that training on a uniform distribution over this domain is optimal. We also interrogate practical remedies when uniform samples are unavailable by considering methods for mitigating non-uniformity through finetuning and rebalancing. Our theory provides a mathematical grounding for previous observations on the role of entropy and rebalancing for o.o.d. generalization and foundation model training. We also provide new empirical evidence across tasks involving o.o.d. shifts which illustrate the broad applicability of our perspective.
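The rebalancing remedy lends itself to a short illustration: upweight under-represented regions of the domain so the effective training distribution approaches uniform. Below is a minimal Python sketch; the histogram density estimate, bin count, and weight clipping are illustrative choices, not the authors' exact procedure.

```python
import numpy as np

def uniform_rebalancing_weights(x, n_bins=20, clip=10.0):
    """Importance weights that make a skewed 1-D sample mimic a uniform
    distribution over its domain: w(x) ~ p_uniform(x) / p_train(x),
    with p_train estimated by a histogram (illustrative assumption)."""
    counts, edges = np.histogram(x, bins=n_bins)
    p_train = counts / counts.sum()                      # estimated mass per bin
    bin_idx = np.clip(np.digitize(x, edges[1:-1]), 0, n_bins - 1)
    w = (1.0 / n_bins) / np.maximum(p_train[bin_idx], 1e-12)
    w = np.minimum(w, clip)                              # clip extreme weights
    return w / w.mean()                                  # renormalize to mean 1

rng = np.random.default_rng(0)
x = rng.beta(2, 5, size=10_000)                          # non-uniform training sample
weights = uniform_rebalancing_weights(x)                 # use as per-example loss weights
```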
Related papers
- Any-Shift Prompting for Generalization over Distributions [66.29237565901734]
We propose any-shift prompting: a general probabilistic inference framework that considers the relationship between training and test distributions during prompt learning.
Within this framework, the test prompt exploits the distribution relationships to guide the generalization of the CLIP image-language model from training to any test distribution.
The network generates the tailored test prompt with both training and test information in a feedforward pass, avoiding extra training costs at test time.
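A rough sketch of the feedforward idea follows, with hypothetical module and parameter names; the actual paper builds on CLIP and a transformer-based inference network, whereas this stand-in just maps training and test summaries to prompt vectors.

```python
import torch
import torch.nn as nn

class PromptGenerator(nn.Module):
    """Hypothetical feedforward prompt generator in the spirit of any-shift
    prompting: it maps a training-distribution summary and a test-batch
    summary to prompt vectors, with no gradient steps at test time."""

    def __init__(self, feat_dim=512, prompt_len=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, prompt_len * feat_dim),
        )
        self.prompt_len, self.feat_dim = prompt_len, feat_dim

    def forward(self, train_summary, test_feats):
        # Summarize the test batch by mean-pooling its (e.g. CLIP) features.
        test_summary = test_feats.mean(dim=0)
        z = torch.cat([train_summary, test_summary], dim=-1)
        return self.net(z).view(self.prompt_len, self.feat_dim)

# One feedforward pass yields the tailored test prompt -- no test-time training.
gen = PromptGenerator()
prompt = gen(torch.randn(512), torch.randn(32, 512))
```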
arXiv Detail & Related papers (2024-02-15T16:53:42Z)
- Invariant Anomaly Detection under Distribution Shifts: A Causal Perspective [6.845698872290768]
Anomaly detection (AD) is the machine learning task of identifying highly discrepant abnormal samples.
Under the constraints of a distribution shift, the assumption that training samples and test samples are drawn from the same distribution breaks down.
We attempt to increase the resilience of anomaly detection models to different kinds of distribution shifts.
arXiv Detail & Related papers (2023-12-21T23:20:47Z)
- Distribution Shift Inversion for Out-of-Distribution Prediction [57.22301285120695]
We propose a portable Distribution Shift Inversion algorithm for Out-of-Distribution (OoD) prediction.
We show that our method provides a general performance gain when plugged into a wide range of commonly used OoD algorithms.
arXiv Detail & Related papers (2023-06-14T08:00:49Z)
- Assaying Out-Of-Distribution Generalization in Transfer Learning [103.57862972967273]
We take a unified view of previous work, highlighting message discrepancies that we address empirically.
We fine-tune over 31k networks, from nine different architectures in the many- and few-shot setting.
arXiv Detail & Related papers (2022-07-19T12:52:33Z)
- Transferring Fairness under Distribution Shifts via Fair Consistency Regularization [15.40257564187799]
We study how to transfer model fairness under distribution shifts, a widespread issue in practice.
Inspired by the success of self-training in transferring accuracy under domain shifts, we derive a sufficient condition for transferring group fairness.
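As a sketch of how a self-training consistency regularizer can be balanced across protected groups, consider the loss below; the grouping-and-averaging scheme is an illustrative guess, not the paper's derived sufficient condition.

```python
import torch
import torch.nn.functional as F

def fair_consistency_loss(logits_weak, logits_strong, groups):
    """Hypothetical consistency regularizer balanced across protected groups:
    each group's consistency term gets equal weight, so self-training does
    not favor the majority group under distribution shift."""
    pseudo = logits_weak.argmax(dim=-1).detach()         # pseudo-labels from weak view
    per_example = F.cross_entropy(logits_strong, pseudo, reduction="none")
    losses = [per_example[groups == g].mean() for g in groups.unique()]
    return torch.stack(losses).mean()                    # equal weight per group

# Usage: add this term to the supervised loss on labeled source data.
loss = fair_consistency_loss(torch.randn(8, 3), torch.randn(8, 3),
                             torch.tensor([0, 0, 0, 1, 1, 0, 1, 1]))
```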
arXiv Detail & Related papers (2022-06-26T06:19:56Z)
- Robust Calibration with Multi-domain Temperature Scaling [86.07299013396059]
We develop a systematic calibration model to handle distribution shifts by leveraging data from multiple domains.
Our proposed method -- multi-domain temperature scaling -- uses the robustness in the domains to improve calibration under distribution shift.
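Per-domain temperature scaling itself is straightforward: fit one temperature per calibration domain by minimizing held-out negative log-likelihood, then aggregate. In the sketch below, the max-aggregation rule for an unseen domain is an illustrative assumption, not the paper's estimator.

```python
import torch
import torch.nn.functional as F

def fit_temperature(logits, labels, steps=200, lr=0.05):
    """Standard temperature scaling: minimize held-out NLL of logits / T."""
    log_t = torch.zeros(1, requires_grad=True)           # optimize log T so T > 0
    opt = torch.optim.Adam([log_t], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(logits / log_t.exp(), labels).backward()
        opt.step()
    return log_t.exp().item()

# Toy calibration sets, one per domain (random stand-ins for real held-out data).
calibration_sets = [(torch.randn(64, 10), torch.randint(0, 10, (64,)))
                    for _ in range(3)]
domain_temps = [fit_temperature(l, y) for l, y in calibration_sets]
robust_temp = max(domain_temps)                          # conservative aggregate (assumption)
```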
arXiv Detail & Related papers (2022-06-06T17:32:12Z)
- Domain Conditional Predictors for Domain Adaptation [3.951376400628575]
We consider a conditional modeling approach in which predictions, in addition to depending on the input data, use information about the underlying data-generating distribution.
We argue that such an approach is more generally applicable than current domain adaptation methods.
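A minimal sketch of a predictor conditioned on distribution-level information appears below; inferring the domain summary by pooling the batch is an assumption made for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class DomainConditionalPredictor(nn.Module):
    """Sketch of a domain-conditional model: predictions depend on the input
    and on a learned summary of the data-generating distribution (here, a
    domain embedding pooled from the batch; an illustrative assumption)."""

    def __init__(self, in_dim=16, hidden=32, n_classes=3):
        super().__init__()
        self.domain_enc = nn.Linear(in_dim, hidden)      # batch -> domain summary
        self.classifier = nn.Linear(in_dim + hidden, n_classes)

    def forward(self, x):
        domain = self.domain_enc(x).mean(dim=0, keepdim=True)  # pool over batch
        domain = domain.expand(x.shape[0], -1)
        return self.classifier(torch.cat([x, domain], dim=-1))

logits = DomainConditionalPredictor()(torch.randn(8, 16))
```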
arXiv Detail & Related papers (2021-06-25T22:15:54Z)
- Out-of-Distribution Generalization in Kernel Regression [21.958028127426196]
We study generalization in kernel regression when the training and test distributions are different.
We identify an overlap matrix that quantifies the mismatch between distributions for a given kernel.
We develop procedures for optimizing training and test distributions for a given data budget to find best and worst case generalizations under the shift.
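One way to realize such an overlap matrix from samples is via a Nystrom approximation of the kernel's eigenfunctions, measuring their second moments under the test distribution; the construction below is our reading of the abstract, not the paper's exact definition.

```python
import numpy as np

def overlap_matrix(K, train_x, test_x):
    """Monte Carlo sketch of a kernel overlap matrix: approximate the kernel's
    eigenfunctions from training samples (Nystrom method), then measure their
    second moments under the test distribution. A (near-)identity matrix would
    indicate matched distributions."""
    K_tr = K(train_x, train_x)
    vals, vecs = np.linalg.eigh(K_tr)
    vals, vecs = vals[::-1], vecs[:, ::-1]               # descending eigenvalues
    keep = vals > 1e-8                                   # drop numerically null modes
    # Nystrom extension of the retained eigenfunctions to the test points.
    phi_test = K(test_x, train_x) @ vecs[:, keep] / np.sqrt(vals[keep])
    return phi_test.T @ phi_test / len(test_x)           # E_test[phi_k phi_l]

rbf = lambda a, b: np.exp(-(a[:, None] - b[None, :]) ** 2)
rng = np.random.default_rng(0)
O = overlap_matrix(rbf, rng.normal(size=50), rng.normal(loc=1.0, size=50))
```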
arXiv Detail & Related papers (2021-06-04T04:54:25Z)
- Understanding Generalization in Adversarial Training via the Bias-Variance Decomposition [39.108491135488286]
We decompose the test risk into its bias and variance components.
We find that the bias increases monotonically with perturbation size and is the dominant term in the risk.
We show that popular explanations for the generalization gap instead predict the variance to be monotonic.
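The decomposition itself can be estimated generically by refitting the learner on bootstrap resamples of the training set; the sketch below does this for ridge regression under squared error, leaving out the adversarial-training specifics.

```python
import numpy as np

def bias_variance(fit, X, y, x_test, f_test, n_runs=50, seed=0):
    """Monte Carlo bias-variance decomposition of squared-error test risk:
    risk = bias^2 + variance (+ noise), estimated by refitting the learner
    on bootstrap resamples of the training set."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_runs):
        idx = rng.integers(0, len(X), len(X))            # bootstrap resample
        preds.append(fit(X[idx], y[idx])(x_test))
    preds = np.stack(preds)
    bias2 = np.mean((preds.mean(axis=0) - f_test) ** 2)
    var = preds.var(axis=0).mean()
    return bias2, var

def ridge(X, y, lam=0.1):
    """Example learner: ridge regression returning a prediction function."""
    w = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return lambda Z: Z @ w

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 5)); w_true = rng.normal(size=5)
y = X @ w_true + 0.1 * rng.normal(size=100)
Xt = rng.normal(size=(200, 5))
print(bias_variance(ridge, X, y, Xt, Xt @ w_true))
```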
arXiv Detail & Related papers (2021-03-17T23:30:00Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks in the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.