Understanding and Improving Feature Learning for Out-of-Distribution
Generalization
- URL: http://arxiv.org/abs/2304.11327v2
- Date: Sun, 29 Oct 2023 05:20:53 GMT
- Title: Understanding and Improving Feature Learning for Out-of-Distribution
Generalization
- Authors: Yongqiang Chen, Wei Huang, Kaiwen Zhou, Yatao Bian, Bo Han, James
Cheng
- Abstract summary: We propose Feature Augmented Training (FeAT) to force the model to learn richer features ready for OOD generalization.
FeAT iteratively augments the model to learn new features while retaining the already learned features.
Experiments show that FeAT effectively learns richer features, thus boosting the performance of various OOD objectives.
- Score: 41.06375309780553
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: A common explanation for the failure of out-of-distribution (OOD)
generalization is that the model trained with empirical risk minimization (ERM)
learns spurious features instead of invariant features. However, several recent
studies challenged this explanation and found that deep networks may have
already learned sufficiently good features for OOD generalization. Despite the
contradictions at first glance, we theoretically show that ERM essentially
learns both spurious and invariant features, while ERM tends to learn spurious
features faster if the spurious correlation is stronger. Moreover, when the
ERM-learned features are fed to the OOD objectives, the quality of invariant
feature learning significantly affects the final OOD performance, as OOD
objectives rarely learn new features. Therefore, ERM feature learning can be a
bottleneck to OOD generalization. To alleviate this reliance, we propose Feature
Augmented Training (FeAT), which forces the model to learn richer features ready
for OOD generalization. FeAT iteratively augments the model to learn new features while
retaining the already learned features. In each round, the retention and
augmentation operations are performed on different subsets of the training data
that capture distinct features. Extensive experiments show that FeAT
effectively learns richer features, thus boosting the performance of various OOD
objectives.
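The round-based retain-and-augment procedure can be illustrated with a toy sketch (hypothetical code, not the paper's implementation): features are reduced to single binary attributes, "training" to picking the most predictive attribute, and the augmentation subset to the examples the retained predictors still get wrong. With a spurious attribute that is more predictive on the training set (0.9 vs. 0.8), the first round picks it up, matching the observation that ERM learns stronger spurious correlations faster; the second round then recovers the invariant attribute.

```python
def accuracy(data, feat):
    # fraction of examples whose attribute value equals the label
    return sum(x[feat] == y for x, y in data) / len(data)

def feat_rounds(data, features, rounds):
    """Toy FeAT-style loop: each round 'trains' one new predictor on the
    subset that the retained predictors still fail on, then keeps it."""
    learned = []
    subset = data
    for _ in range(rounds):
        if not subset:
            break
        # augmentation step: fit the most predictive attribute on the subset
        best = max(features, key=lambda f: accuracy(subset, f))
        learned.append(best)
        # next round's subset: examples every retained predictor gets wrong
        subset = [(x, y) for x, y in subset if all(x[f] != y for f in learned)]
    return learned

# Synthetic data: the spurious attribute matches the label 90% of the time,
# the invariant one 80% of the time (and covers the spurious failures).
data = []
for i in range(100):
    y = i % 2
    spu = y if i < 90 else 1 - y
    inv = y if i < 70 or i >= 90 else 1 - y
    data.append(({"inv": inv, "spu": spu}, y))

order = feat_rounds(data, ["inv", "spu"], 2)
print(order)  # → ['spu', 'inv']: the stronger spurious feature is learned first
```

The retention step here is implicit (earlier predictors stay in `learned`); the paper performs retention and augmentation on distinct data subsets with actual network training.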
Related papers
- Out-of-Distribution Learning with Human Feedback [26.398598663165636]
This paper presents a novel framework for OOD learning with human feedback.
Our framework capitalizes on the freely available unlabeled data in the wild.
By exploiting human feedback, we enhance the robustness and reliability of machine learning models.
arXiv Detail & Related papers (2024-08-14T18:49:27Z)
- CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection [42.33618249731874]
We show that minimizing the magnitude of energy scores on training data leads to domain-consistent Hessians of classification loss.
We have developed a unified fine-tuning framework that allows for concurrent optimization of both tasks.
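The energy score referenced here is the standard free-energy OOD score, E(x) = -T · logsumexp(logits / T); a minimal sketch (the temperature T and the numerically stable logsumexp are standard choices, not details taken from the CRoFT paper):

```python
import math

def energy_score(logits, T=1.0):
    """Free-energy OOD score: E(x) = -T * log(sum_k exp(logit_k / T)).
    Lower energy -> more confident, in-distribution-looking input."""
    z = [l / T for l in logits]
    m = max(z)  # subtract the max for numerical stability
    return -T * (m + math.log(sum(math.exp(v - m) for v in z)))

# A confidently classified input has lower energy than an uncertain one.
print(energy_score([10.0, 0.0, 0.0]) < energy_score([1.0, 1.0, 1.0]))  # → True
```

Minimizing the magnitude of such scores on training data is the mechanism the summary says yields domain-consistent Hessians.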
arXiv Detail & Related papers (2024-05-26T03:28:59Z)
- Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization [11.140366256534474]
Existing vision-language models exhibit strong generalization on a variety of visual domains and tasks.
We propose a novel approach OGEN to improve the OOD GENeralization of finetuned models.
Specifically, a class-conditional feature generator is introduced to synthesize OOD features using just the class name of any unknown class.
arXiv Detail & Related papers (2024-01-29T06:57:48Z)
- MOODv2: Masked Image Modeling for Out-of-Distribution Detection [57.17163962383442]
This study explores distinct pretraining tasks and employs various OOD score functions.
Our framework, MOODv2, improves AUROC by 14.30% to 95.68% on ImageNet and achieves 99.98% on CIFAR-10.
arXiv Detail & Related papers (2024-01-05T02:57:58Z)
- Mitigating Simplicity Bias in Deep Learning for Improved OOD Generalization and Robustness [5.976013616522926]
We propose a framework that encourages the model to use a more diverse set of features to make predictions.
We first train a simple model, and then regularize the conditional mutual information with respect to it to obtain the final model.
We demonstrate the effectiveness of this framework in various problem settings and real-world applications.
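A regularizer of this kind needs an estimator of conditional mutual information; for discrete predictions it can be computed directly from empirical counts. A sketch of that estimator alone (the surrounding two-stage training is omitted, and the variable roles — final model A, label B, simple model C — are our reading of the summary, not the paper's notation):

```python
import math
from collections import Counter

def cond_mutual_info(a, b, c):
    """Empirical I(A; B | C) in nats, for discrete sequences of equal length:
    sum over (x, y, z) of p(x,y,z) * log[ p(x,y,z) p(z) / (p(x,z) p(y,z)) ]."""
    n = len(a)
    p_abc = Counter(zip(a, b, c))
    p_ac = Counter(zip(a, c))
    p_bc = Counter(zip(b, c))
    p_c = Counter(c)
    mi = 0.0
    for (x, y, z), cnt in p_abc.items():
        p_xyz = cnt / n
        mi += p_xyz * math.log(p_xyz * (p_c[z] / n) /
                               ((p_ac[(x, z)] / n) * (p_bc[(y, z)] / n)))
    return mi

# If C already determines A, knowing A adds nothing about B given C:
print(cond_mutual_info([0, 1, 0, 1], [0, 1, 1, 0], [0, 1, 0, 1]))  # → 0.0
```

Penalizing this quantity between the final model's predictions and the labels, conditioned on the simple model, pushes the final model to use information the simple model does not.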
arXiv Detail & Related papers (2023-10-09T21:19:39Z)
- Spurious Feature Diversification Improves Out-of-distribution Generalization [43.84284578270031]
Generalization to out-of-distribution (OOD) data is a critical challenge in machine learning.
We study WiSE-FT, a popular weight space ensemble method that interpolates between a pre-trained and a fine-tuned model.
We observe an unexpected "FalseFalseTrue" phenomenon, in which WiSE-FT successfully corrects many cases where each individual model makes incorrect predictions.
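The interpolation itself is one line per parameter; a sketch with parameters as plain floats keyed by name (real checkpoints would hold tensors, and the mixing coefficient alpha is a hyperparameter):

```python
def wise_ft(pretrained, finetuned, alpha=0.5):
    """WiSE-FT weight-space ensemble: interpolate every parameter between
    the pre-trained and fine-tuned checkpoints with coefficient alpha."""
    return {name: (1 - alpha) * pretrained[name] + alpha * finetuned[name]
            for name in pretrained}

print(wise_ft({"w": 0.0}, {"w": 2.0}, alpha=0.25))  # → {'w': 0.5}
```

Unlike an output-space ensemble, this produces a single model with no extra inference cost.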
arXiv Detail & Related papers (2023-09-29T13:29:22Z)
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data but disagreement on OOD data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
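For binary classifiers, the disagreement objective on an unlabeled OOD point can be written as -log[p1(1 - p2) + (1 - p1)p2], which is small only when the two models confidently disagree. A sketch of that single term (the agreement part is the usual supervised loss on labeled training data, omitted here):

```python
import math

def dbat_disagreement(p1, p2, eps=1e-12):
    """D-BAT-style disagreement loss for two binary classifiers on an
    unlabeled OOD input; p1 and p2 are each model's P(class 1 | x).
    eps guards against log(0) when both models agree with certainty."""
    return -math.log(p1 * (1 - p2) + (1 - p1) * p2 + eps)

# Confident disagreement is cheap; confident agreement is expensive.
print(dbat_disagreement(0.99, 0.01) < dbat_disagreement(0.99, 0.99))  # → True
```

Minimizing this term alongside the training loss drives the second model toward features the first one ignores.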
arXiv Detail & Related papers (2022-02-09T12:03:02Z) - Confounder Identification-free Causal Visual Feature Learning [84.28462256571822]
We propose a novel Confounder Identification-free Causal Visual Feature Learning (CICF) method, which obviates the need for identifying confounders.
CICF models the interventions among different samples based on the front-door criterion, and then approximates the global-scope intervening effect from the instance-level interventions.
We uncover the relation between CICF and the popular meta-learning strategy MAML, and provide an interpretation of why MAML works from the theoretical perspective.
arXiv Detail & Related papers (2021-11-26T10:57:47Z) - Evading the Simplicity Bias: Training a Diverse Set of Models Discovers
Solutions with Superior OOD Generalization [93.8373619657239]
Neural networks trained with SGD were recently shown to rely preferentially on linearly-predictive features.
This simplicity bias can explain their lack of robustness out of distribution (OOD).
We demonstrate that the simplicity bias can be mitigated and OOD generalization improved.
arXiv Detail & Related papers (2021-05-12T12:12:24Z)
- ATOM: Robustifying Out-of-distribution Detection Using Outlier Mining [51.19164318924997]
Adversarial Training with informative Outlier Mining (ATOM) improves the robustness of OOD detection.
ATOM achieves state-of-the-art performance under a broad family of classic and adversarial OOD evaluation tasks.
arXiv Detail & Related papers (2020-06-26T20:58:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.