Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization
- URL: http://arxiv.org/abs/2202.06856v1
- Date: Mon, 14 Feb 2022 16:42:16 GMT
- Authors: Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski
- Abstract summary: We argue that devising simpler methods for learning predictors on existing features is a promising direction for future research.
We introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift.
Under a natural model, we prove that the DARE solution is the minimax-optimal predictor for a constrained set of test distributions.
- Score: 52.7137956951533
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A common explanation for the failure of deep networks to generalize
out-of-distribution is that they fail to recover the "correct" features.
Focusing on the domain generalization setting, we challenge this notion with a
simple experiment which suggests that ERM already learns sufficient features
and that the current bottleneck is not feature learning, but robust regression.
We therefore argue that devising simpler methods for learning predictors on
existing features is a promising direction for future research. Towards this
end, we introduce Domain-Adjusted Regression (DARE), a convex objective for
learning a linear predictor that is provably robust under a new model of
distribution shift. Rather than learning one function, DARE performs a
domain-specific adjustment to unify the domains in a canonical latent space and
learns to predict in this space. Under a natural model, we prove that the DARE
solution is the minimax-optimal predictor for a constrained set of test
distributions. Further, we provide the first finite-environment convergence
guarantee to the minimax risk, improving over existing results which show a
"threshold effect". Evaluated on finetuned features, we find that DARE compares
favorably to prior methods, consistently achieving equal or better performance.
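To make the unify-then-predict idea above concrete, here is a minimal sketch: whiten each training domain's features with that domain's own covariance so all domains share a canonical latent space, then fit a single ridge regression on the pooled adjusted features. All names are illustrative; the exact objective and how the test domain's adjustment is estimated without labels are details of the paper, not of this sketch.

```python
import numpy as np

def whiten(X, eps=1e-6):
    """Domain-specific adjustment: map features through cov^{-1/2} so this
    domain's centered features have (approximately) identity covariance."""
    cov = np.cov(X, rowvar=False) + eps * np.eye(X.shape[1])
    evals, evecs = np.linalg.eigh(cov)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T  # symmetric inverse square root
    return (X - X.mean(axis=0)) @ W

def dare_style_regression(domains, ridge=1e-3):
    """Pool the adjusted features from every training domain and solve one
    ridge regression in the shared canonical space."""
    Z = np.vstack([whiten(X) for X, _ in domains])
    y = np.concatenate([y_e for _, y_e in domains])
    d = Z.shape[1]
    return np.linalg.solve(Z.T @ Z + ridge * np.eye(d), Z.T @ y)

# Two synthetic "domains" of frozen features with different covariances:
rng = np.random.default_rng(0)
domains = [(s * rng.normal(size=(200, 5)), rng.normal(size=200)) for s in (1.0, 3.0)]
beta = dare_style_regression(domains)
```

At test time, the same kind of adjustment would be applied to a new domain's features before applying the learned predictor; estimating that adjustment from unlabeled test data is the part addressed by the paper's theory.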
Related papers
- Automatic debiasing of neural networks via moment-constrained learning [0.0]
Naively learning the regression function and taking a sample mean of the target functional results in biased estimators.
We propose moment-constrained learning as a new Riesz representer (RR) learning approach that addresses some shortcomings in automatic debiasing.
arXiv Detail & Related papers (2024-09-29T20:56:54Z)
- Mitigating Covariate Shift in Misspecified Regression with Applications to Reinforcement Learning [39.02112341007981]
We study the effect of distribution shift in the presence of model misspecification.
We show that empirical risk minimization, or standard least squares regression, can result in undesirable misspecification amplification.
We develop a new algorithm that avoids this undesirable behavior, resulting in no misspecification amplification while still obtaining optimal statistical rates (a minimal demonstration of the amplification effect follows this entry).
arXiv Detail & Related papers (2024-01-22T18:59:12Z)
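The amplification effect is easy to reproduce in a toy setting: when the model class is misspecified, the least-squares (ERM) fit on the training distribution can be far worse on a shifted covariate distribution than the best predictor in the same class evaluated there. A self-contained illustration with synthetic data (not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: x ** 2                                        # true function: quadratic
design = lambda x: np.stack([x, np.ones_like(x)], axis=1)   # misspecified linear class

x_train = rng.normal(0.0, 1.0, 5000)   # training covariates
x_test = rng.normal(2.0, 1.0, 5000)    # shifted test covariates

# ERM: least squares on the training distribution
coef, *_ = np.linalg.lstsq(design(x_train), f(x_train), rcond=None)
erm_risk = np.mean((design(x_test) @ coef - f(x_test)) ** 2)

# Best-in-class comparator: least squares fit directly on the test distribution
coef_star, *_ = np.linalg.lstsq(design(x_test), f(x_test), rcond=None)
best_risk = np.mean((design(x_test) @ coef_star - f(x_test)) ** 2)

print(erm_risk / best_risk)  # >> 1: the shift amplifies the misspecification error
```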
- Improving Adaptive Conformal Prediction Using Self-Supervised Learning [72.2614468437919]
We train an auxiliary model with a self-supervised pretext task on top of an existing predictive model and use the self-supervised error as an additional feature to estimate nonconformity scores.
We empirically demonstrate the benefit of the additional information using both synthetic and real data on the efficiency (width), deficit, and excess of conformal prediction intervals (a sketch of the normalized-score recipe follows this entry).
arXiv Detail & Related papers (2023-02-23T18:57:14Z)
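The general recipe can be sketched with split conformal prediction: the auxiliary model's self-supervised error acts as a per-example difficulty signal that rescales the nonconformity score, widening intervals where an example looks hard. The normalization below is one common choice and is illustrative, not necessarily the paper's exact scheme:

```python
import numpy as np

def normalized_split_conformal(resid_cal, difficulty_cal, difficulty_test, alpha=0.1):
    """Split conformal with difficulty-normalized nonconformity scores.

    resid_cal: |y - f(x)| on a held-out calibration set.
    difficulty_*: per-example difficulty estimates, e.g. built from an
    auxiliary model's self-supervised error (illustrative choice).
    Returns the half-width of the (1 - alpha) prediction interval per test point.
    """
    scores = resid_cal / difficulty_cal                  # normalized nonconformity
    n = len(scores)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(scores, level, method="higher")      # finite-sample-corrected quantile
    return q * difficulty_test                           # harder examples get wider intervals
```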
- Modeling Uncertain Feature Representation for Domain Generalization [49.129544670700525]
We show that our method consistently improves the network generalization ability on multiple vision tasks.
Our methods are simple yet effective and can be readily integrated into networks without additional trainable parameters or loss constraints.
arXiv Detail & Related papers (2023-01-16T14:25:02Z)
- Domain-Specific Risk Minimization for Out-of-Distribution Generalization [104.17683265084757]
We first establish a generalization bound that explicitly considers the adaptivity gap.
We propose two methods: effective gap estimation to guide the selection of a better hypothesis for the target, and direct gap minimization that adapts model parameters using online target samples.
arXiv Detail & Related papers (2022-08-18T06:42:49Z)
- Model Optimization in Imbalanced Regression [2.580765958706854]
Imbalanced domain learning aims to produce accurate models in predicting instances that, though underrepresented, are of utmost importance for the domain.
One of the main reasons for this is the lack of loss functions capable of focusing on minimizing the errors of extreme (rare) values.
Recently, an evaluation metric was introduced: the Squared Error Relevance Area (SERA).
This metric places greater emphasis on the errors committed at extreme values while also accounting for performance over the overall target variable domain (a sketch of the computation follows this entry).
arXiv Detail & Related papers (2022-06-20T20:23:56Z)
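Under the standard definition, SERA integrates the sum of squared errors over relevance thresholds, so high-relevance (extreme) points contribute at more thresholds and therefore weigh more. A small sketch, assuming a user-supplied relevance function phi(y) in [0, 1]:

```python
import numpy as np

def sera(y_true, y_pred, relevance, n_steps=101):
    """Squared Error Relevance Area (illustrative implementation).

    relevance: phi(y_true) in [0, 1]; higher for rarer/extreme targets.
    SER(t) sums squared errors over points whose relevance is >= t;
    SERA integrates SER(t) over t in [0, 1].
    """
    sq_err = (np.asarray(y_true) - np.asarray(y_pred)) ** 2
    t_grid = np.linspace(0.0, 1.0, n_steps)
    ser = np.array([sq_err[relevance >= t].sum() for t in t_grid])
    return np.trapz(ser, t_grid)
```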
- Deep learning: a statistical viewpoint [120.94133818355645]
Deep learning has revealed some major surprises from a theoretical perspective.
In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems.
We conjecture that specific principles underlie these phenomena.
arXiv Detail & Related papers (2021-03-16T16:26:36Z)
- A Novel Regression Loss for Non-Parametric Uncertainty Optimization [7.766663822644739]
Quantification of uncertainty is one of the most promising approaches to establish safe machine learning.
One of the most commonly used approaches so far is Monte Carlo dropout, which is computationally cheap and easy to apply in practice.
We propose a new objective, referred to as second-moment loss (SML), to address this issue.
arXiv Detail & Related papers (2021-01-07T19:12:06Z)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)