Improving Out-of-Distribution Generalization by Adversarial Training
with Structured Priors
- URL: http://arxiv.org/abs/2210.06807v1
- Date: Thu, 13 Oct 2022 07:37:42 GMT
- Title: Improving Out-of-Distribution Generalization by Adversarial Training
with Structured Priors
- Authors: Qixun Wang, Yifei Wang, Hong Zhu, Yisen Wang
- Abstract summary: We show that sample-wise Adversarial Training (AT) yields only limited improvement in Out-of-Distribution (OOD) generalization.
We propose two AT variants with low-rank structures to train OOD-robust models.
Our proposed approaches outperform Empirical Risk Minimization (ERM) and sample-wise AT.
- Score: 17.936426699670864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep models often fail to generalize well in test domains when the data
distribution differs from that in the training domain. Among numerous
approaches to address this Out-of-Distribution (OOD) generalization problem,
there has been a surge of interest in exploiting Adversarial Training
(AT) to improve OOD performance. Recent works have revealed that the robust
model obtained by conducting sample-wise AT also retains transferability to
biased test domains. In this paper, we empirically show that sample-wise AT
offers only limited improvement in OOD performance. Specifically, we find that AT can only
maintain performance at smaller scales of perturbation while Universal AT (UAT)
is more robust to larger-scale perturbations. This suggests that adversarial
perturbations with universal (low-dimensional) structure can enhance robustness
against the large data distribution shifts that are common in OOD scenarios.
Inspired by this, we propose two AT variants with low-rank
structures to train OOD-robust models. Extensive experiments on the DomainBed
benchmark show that our proposed approaches outperform Empirical Risk
Minimization (ERM) and sample-wise AT. Our code is available at
https://github.com/NOVAglow646/NIPS22-MAT-and-LDAT-for-OOD.
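To make the idea concrete, below is a minimal PyTorch sketch of adversarial training with a rank-r perturbation shared across the whole batch, in the spirit of the low-rank variants described above. This is an illustration under assumptions, not the authors' implementation: the function names and hyperparameters (rank, epsilon, adv_lr, n_adv_steps) are invented for this sketch, and the authors' actual MAT/LDAT code is at the repository linked above.

```python
# Illustrative sketch: adversarial training with a low-rank, batch-shared
# ("universal") perturbation. Names and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

def low_rank_adv_step(model, x, y, B, A, epsilon=0.1, adv_lr=0.01):
    """One inner maximization step on the shared low-rank perturbation.

    x: batch of images, shape (N, C, H, W).
    B: (H, r) and A: (r, W) factors; delta = B @ A is a rank-r perturbation
    shared by the whole batch, applied identically to every channel.
    """
    B = B.detach().requires_grad_(True)
    A = A.detach().requires_grad_(True)
    delta = (B @ A).clamp(-epsilon, epsilon)  # (H, W), broadcast over N and C
    loss = F.cross_entropy(model(x + delta), y)
    gB, gA = torch.autograd.grad(loss, (B, A))
    # Gradient *ascent* on the factors: maximize the classification loss.
    return (B + adv_lr * gB.sign()).detach(), (A + adv_lr * gA.sign()).detach()

def train_epoch(model, loader, opt, rank=4, H=224, W=224,
                epsilon=0.1, n_adv_steps=3):
    # Small random init; zero init would give zero gradients for both factors.
    B = 0.01 * torch.randn(H, rank)
    A = 0.01 * torch.randn(rank, W)
    for x, y in loader:
        for _ in range(n_adv_steps):  # inner maximization over the factors
            B, A = low_rank_adv_step(model, x, y, B, A, epsilon)
        delta = (B @ A).clamp(-epsilon, epsilon)
        loss = F.cross_entropy(model(x + delta), y)  # outer min over weights
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The point of this structure is that the perturbation has only (H + W) * r free parameters and is shared across samples, in contrast to sample-wise AT, which matches the abstract's observation that universal, low-dimensional perturbation structure helps under the large distribution shifts typical of OOD scenarios.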
Related papers
- CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection [42.33618249731874]
We show that minimizing the magnitude of energy scores on training data leads to domain-consistent Hessians of the classification loss (a minimal sketch of such an energy-score regularizer appears after this list).
We have developed a unified fine-tuning framework that allows for concurrent optimization of both tasks.
arXiv Detail & Related papers (2024-05-26T03:28:59Z)
- A Survey on Evaluation of Out-of-Distribution Generalization [41.39827887375374]
Out-of-Distribution (OOD) generalization is a complex and fundamental problem.
This paper serves as the first effort to conduct a comprehensive review of OOD evaluation.
We categorize existing research into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization.
arXiv Detail & Related papers (2024-03-04T09:30:35Z)
- Robustness May be More Brittle than We Think under Different Degrees of
Distribution Shifts [72.90906474654594]
We show that robustness of models can be quite brittle and inconsistent under different degrees of distribution shifts.
We observe that large-scale pre-trained models, such as CLIP, are sensitive to even minute distribution shifts of novel downstream tasks.
arXiv Detail & Related papers (2023-10-10T13:39:18Z)
- Spurious Feature Diversification Improves Out-of-distribution Generalization [43.84284578270031]
Generalization to out-of-distribution (OOD) data is a critical challenge in machine learning.
We study WiSE-FT, a popular weight space ensemble method that interpolates between a pre-trained and a fine-tuned model.
We observe an unexpected "FalseFalseTrue" phenomenon, in which WiSE-FT successfully corrects many cases where both individual models make incorrect predictions (a minimal interpolation sketch appears after this list).
arXiv Detail & Related papers (2023-09-29T13:29:22Z)
- Unsupervised Out-of-Distribution Detection by Restoring Lossy Inputs
with Variational Autoencoder [3.498694457257263]
We propose a novel VAE-based score called Error Reduction (ER) for OOD detection.
ER is based on a VAE that takes a lossy version of the training set as inputs and the original set as targets.
arXiv Detail & Related papers (2023-09-05T09:42:15Z)
- On the Robustness of Open-World Test-Time Training: Self-Training with
Dynamic Prototype Expansion [46.30241353155658]
Generalizing deep learning models to unknown target domain distributions with low latency has motivated research into test-time training/adaptation (TTT/TTA).
Many state-of-the-art methods fail to maintain the performance when the target domain is contaminated with strong out-of-distribution (OOD) data.
We develop an adaptive strong OOD pruning step that improves the efficacy of the self-training TTT method.
We regularize self-training with distribution alignment; the combination yields state-of-the-art performance on 5 OWTTT benchmarks.
arXiv Detail & Related papers (2023-08-19T08:27:48Z)
- Pseudo-OOD training for robust language models [78.15712542481859]
OOD detection is a key component of a reliable machine-learning model for any industry-scale application.
We propose POORE (POsthoc pseudo-Ood REgularization), which generates pseudo-OOD samples using in-distribution (IND) data.
We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.
arXiv Detail & Related papers (2022-10-17T14:32:02Z)
- Enhancing the Generalization for Intent Classification and Out-of-Domain
Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
- Learn what you can't learn: Regularized Ensembles for Transductive
Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
- Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluated on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
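For the CRoFT entry above, a minimal sketch of an energy-magnitude regularizer, assuming the standard free-energy score E(x) = -T * logsumexp(f(x) / T); the penalty weight lam and temperature T are illustrative assumptions, not values from that paper.

```python
# Hypothetical sketch: penalize |E(x)| on training data, where
# E(x) = -T * logsumexp(f(x) / T) is the standard free-energy score.
import torch
import torch.nn.functional as F

def energy_score(logits: torch.Tensor, T: float = 1.0) -> torch.Tensor:
    # Free-energy score per sample; lower energy = more "in-distribution".
    return -T * torch.logsumexp(logits / T, dim=-1)

def croft_style_loss(logits: torch.Tensor, targets: torch.Tensor,
                     lam: float = 0.1) -> torch.Tensor:
    ce = F.cross_entropy(logits, targets)
    reg = energy_score(logits).abs().mean()  # magnitude of energy scores
    return ce + lam * reg
```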
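For the WiSE-FT entry above, a minimal sketch of weight-space ensembling by linear interpolation between a pre-trained and a fine-tuned checkpoint; the helper name wise_ft and the default coefficient alpha are illustrative. Setting alpha = 0 recovers the pre-trained model and alpha = 1 the fine-tuned one; WiSE-FT sweeps values in between.

```python
# Minimal sketch of WiSE-FT-style weight-space ensembling: interpolate the
# floating-point entries of two state dicts and load the result.
import copy
import torch

def wise_ft(pretrained: torch.nn.Module, finetuned: torch.nn.Module,
            alpha: float = 0.5) -> torch.nn.Module:
    sd_pre = pretrained.state_dict()
    sd_ft = finetuned.state_dict()
    merged = {
        k: ((1 - alpha) * sd_pre[k] + alpha * sd_ft[k]
            if sd_pre[k].is_floating_point() else sd_ft[k])
        for k in sd_pre
    }
    model = copy.deepcopy(pretrained)
    model.load_state_dict(merged)
    return model
```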
This list is automatically generated from the titles and abstracts of the papers on this site.