Improved OOD Generalization via Adversarial Training and Pre-training
- URL: http://arxiv.org/abs/2105.11144v1
- Date: Mon, 24 May 2021 08:06:35 GMT
- Title: Improved OOD Generalization via Adversarial Training and Pre-training
- Authors: Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu,
Zhi-Ming Ma
- Abstract summary: In this paper, we theoretically show that a model robust to input perturbations generalizes well on OOD data.
Inspired by previous findings that adversarial training helps improve input-robustness, we show that adversarially trained models have converged excess risk on OOD data.
- Score: 49.08683910076778
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, learning a model that generalizes well on out-of-distribution (OOD)
data has attracted great attention in the machine learning community. In this
paper, after defining OOD generalization via Wasserstein distance, we
theoretically show that a model robust to input perturbation generalizes well
on OOD data. Inspired by previous findings that adversarial training helps
improve input-robustness, we theoretically show that adversarially trained
models have converged excess risk on OOD data, and empirically verify it on
both image classification and natural language understanding tasks. Besides, in
the paradigm of first pre-training and then fine-tuning, we theoretically show
that a pre-trained model that is more robust to input perturbation provides a
better initialization for generalization on downstream OOD data. Empirically,
after fine-tuning, this better-initialized model from adversarial pre-training
also has better OOD generalization.
Related papers
- The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection [75.65876949930258]
Out-of-distribution (OOD) detection is essential for model trustworthiness.
We show that the superior OOD detection performance of state-of-the-art methods is achieved by secretly sacrificing the OOD generalization ability.
arXiv Detail & Related papers (2024-10-12T07:02:04Z) - Out-of-Distribution Learning with Human Feedback [26.398598663165636]
This paper presents a novel framework for OOD learning with human feedback.
Our framework capitalizes on the freely available unlabeled data in the wild.
By exploiting human feedback, we enhance the robustness and reliability of machine learning models.
arXiv Detail & Related papers (2024-08-14T18:49:27Z) - Model Reprogramming Outperforms Fine-tuning on Out-of-distribution Data in Text-Image Encoders [56.47577824219207]
In this paper, we unveil the hidden costs associated with intrusive fine-tuning techniques.
We introduce a new model reprogramming approach for fine-tuning, which we name Reprogrammer.
Our empirical evidence reveals that Reprogrammer is less intrusive and yields superior downstream models.
arXiv Detail & Related papers (2024-03-16T04:19:48Z) - Towards Robust Out-of-Distribution Generalization Bounds via Sharpness [41.65692353665847]
We study the effect of sharpness on how a model tolerates data change in domain shift.
We propose a sharpness-based OOD generalization bound by taking robustness into consideration.
arXiv Detail & Related papers (2024-03-11T02:57:27Z) - A Survey on Evaluation of Out-of-Distribution Generalization [41.39827887375374]
Out-of-Distribution (OOD) generalization is a complex and fundamental problem.
This paper serves as the first effort to conduct a comprehensive review of OOD evaluation.
We categorize existing research into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization.
arXiv Detail & Related papers (2024-03-04T09:30:35Z) - Mitigating Simplicity Bias in Deep Learning for Improved OOD
Generalization and Robustness [5.976013616522926]
We propose a framework that encourages the model to use a more diverse set of features to make predictions.
We first train a simple model, and then regularize the conditional mutual information with respect to it to obtain the final model.
We demonstrate the effectiveness of this framework in various problem settings and real-world applications.
arXiv Detail & Related papers (2023-10-09T21:19:39Z) - TWINS: A Fine-Tuning Framework for Improved Transferability of
Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z) - Pseudo-OOD training for robust language models [78.15712542481859]
OOD detection is a key component of a reliable machine-learning model for any industry-scale application.
We propose POORE - POsthoc pseudo-Ood REgularization, that generates pseudo-OOD samples using in-distribution (IND) data.
We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.
arXiv Detail & Related papers (2022-10-17T14:32:02Z) - SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in
Fine-tuned Source Code Models [58.78043959556283]
We study the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning methods.
Our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
arXiv Detail & Related papers (2022-10-10T16:07:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.