Improving Out-of-Distribution Generalization by Adversarial Training
with Structured Priors
- URL: http://arxiv.org/abs/2210.06807v1
- Date: Thu, 13 Oct 2022 07:37:42 GMT
- Title: Improving Out-of-Distribution Generalization by Adversarial Training
with Structured Priors
- Authors: Qixun Wang, Yifei Wang, Hong Zhu, Yisen Wang
- Abstract summary: We show that sample-wise Adversarial Training (AT) yields only limited improvement in Out-of-Distribution (OOD) generalization.
We propose two AT variants with low-rank structures to train OOD-robust models.
Our proposed approaches outperform Empirical Risk Minimization (ERM) and sample-wise AT.
- Score: 17.936426699670864
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep models often fail to generalize well in test domains when the data
distribution differs from that in the training domain. Among numerous
approaches to address this Out-of-Distribution (OOD) generalization problem,
there has been a surge of interest in exploiting Adversarial Training
(AT) to improve OOD performance. Recent works have revealed that the robust
model obtained by conducting sample-wise AT also retains transferability to
biased test domains. In this paper, we empirically show that sample-wise AT
offers only limited improvement in OOD performance. Specifically, we find that AT can only
maintain performance at smaller scales of perturbation while Universal AT (UAT)
is more robust to larger-scale perturbations. This suggests that adversarial
perturbations with universal (low-dimensional) structure can enhance robustness
against the large data distribution shifts that are common in OOD scenarios.
Inspired by this, we propose two AT variants with low-rank
structures to train OOD-robust models. Extensive experiments on the DomainBed
benchmark show that our proposed approaches outperform Empirical Risk
Minimization (ERM) and sample-wise AT. Our code is available at
https://github.com/NOVAglow646/NIPS22-MAT-and-LDAT-for-OOD.
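To make the idea concrete, below is a minimal PyTorch sketch of adversarial training with a rank-r perturbation shared across the whole batch, in the spirit of the low-rank variants described above. This is an illustration under assumptions, not the authors' implementation: the function names and hyperparameters (rank, epsilon, adv_lr, n_adv_steps) are invented for this sketch, and the authors' actual MAT/LDAT code is at the repository linked above.

```python
# Illustrative sketch: adversarial training with a low-rank, batch-shared
# ("universal") perturbation. Names and hyperparameters are assumptions.
import torch
import torch.nn.functional as F

def low_rank_adv_step(model, x, y, B, A, epsilon=0.1, adv_lr=0.01):
    """One inner maximization step on the shared low-rank perturbation.

    x: batch of images, shape (N, C, H, W).
    B: (H, r) and A: (r, W) factors; delta = B @ A is a rank-r perturbation
    shared by the whole batch, applied identically to every channel.
    """
    B = B.detach().requires_grad_(True)
    A = A.detach().requires_grad_(True)
    delta = (B @ A).clamp(-epsilon, epsilon)  # (H, W), broadcast over N and C
    loss = F.cross_entropy(model(x + delta), y)
    gB, gA = torch.autograd.grad(loss, (B, A))
    # Gradient *ascent* on the factors: maximize the classification loss.
    return (B + adv_lr * gB.sign()).detach(), (A + adv_lr * gA.sign()).detach()

def train_epoch(model, loader, opt, rank=4, H=224, W=224,
                epsilon=0.1, n_adv_steps=3):
    # Small random init; zero init would give zero gradients for both factors.
    B = 0.01 * torch.randn(H, rank)
    A = 0.01 * torch.randn(rank, W)
    for x, y in loader:
        for _ in range(n_adv_steps):  # inner maximization over the factors
            B, A = low_rank_adv_step(model, x, y, B, A, epsilon)
        delta = (B @ A).clamp(-epsilon, epsilon)
        loss = F.cross_entropy(model(x + delta), y)  # outer min over weights
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The point of this structure is that the perturbation has only (H + W) * r free parameters and is shared across samples, in contrast to sample-wise AT, which matches the abstract's observation that universal, low-dimensional perturbation structure helps under the large distribution shifts typical of OOD scenarios.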
Related papers
- CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection [42.33618249731874]
We show that minimizing the magnitude of energy scores on training data leads to domain-consistent Hessians of the classification loss (a minimal sketch of such an energy-score regularizer appears after this list).
We have developed a unified fine-tuning framework that allows for concurrent optimization of both tasks.
arXiv Detail & Related papers (2024-05-26T03:28:59Z)
- A Survey on Evaluation of Out-of-Distribution Generalization [41.39827887375374]
Out-of-Distribution (OOD) generalization is a complex and fundamental problem.
This paper serves as the first effort to conduct a comprehensive review of OOD evaluation.
We categorize existing research into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization.
arXiv Detail & Related papers (2024-03-04T09:30:35Z)
- Robustness May be More Brittle than We Think under Different Degrees of
Distribution Shifts [72.90906474654594]
We show that robustness of models can be quite brittle and inconsistent under different degrees of distribution shifts.
We observe that large-scale pre-trained models, such as CLIP, are sensitive to even minute distribution shifts of novel downstream tasks.
arXiv Detail & Related papers (2023-10-10T13:39:18Z)
- Spurious Feature Diversification Improves Out-of-distribution Generalization [43.84284578270031]
Generalization to out-of-distribution (OOD) data is a critical challenge in machine learning.
We study WiSE-FT, a popular weight space ensemble method that interpolates between a pre-trained and a fine-tuned model.
We observe an unexpected "FalseFalseTrue" phenomenon, in which WiSE-FT successfully corrects many cases where both individual models make incorrect predictions (a minimal interpolation sketch appears after this list).
arXiv Detail & Related papers (2023-09-29T13:29:22Z)
- Unsupervised Out-of-Distribution Detection by Restoring Lossy Inputs
with Variational Autoencoder [3.498694457257263]
We propose a novel VAE-based score called Error Reduction (ER) for OOD detection.
ER is based on a VAE that takes a lossy version of the training set as inputs and the original set as targets.
arXiv Detail & Related papers (2023-09-05T09:42:15Z)
- On the Robustness of Open-World Test-Time Training: Self-Training with
Dynamic Prototype Expansion [46.30241353155658]
Generalizing deep learning models to unknown target domain distributions with low latency has motivated research into test-time training/adaptation (TTT/TTA).
Many state-of-the-art methods fail to maintain the performance when the target domain is contaminated with strong out-of-distribution (OOD) data.
We develop an adaptive strong OOD pruning step that improves the efficacy of the self-training TTT method.
We regularize self-training with distribution alignment; the combination yields state-of-the-art performance on 5 OWTTT benchmarks.
arXiv Detail & Related papers (2023-08-19T08:27:48Z)
- Pseudo-OOD training for robust language models [78.15712542481859]
OOD detection is a key component of a reliable machine-learning model for any industry-scale application.
We propose POORE (POsthoc pseudo-Ood REgularization), which generates pseudo-OOD samples using in-distribution (IND) data.
We extensively evaluate our framework on three real-world dialogue systems, achieving new state-of-the-art in OOD detection.
arXiv Detail & Related papers (2022-10-17T14:32:02Z)
- Enhancing the Generalization for Intent Classification and Out-of-Domain
Detection in SLU [70.44344060176952]
Intent classification is a major task in spoken language understanding (SLU).
Recent works have shown that using extra data and labels can improve the OOD detection performance.
This paper proposes to train a model with only IND data while supporting both IND intent classification and OOD detection.
arXiv Detail & Related papers (2021-06-28T08:27:38Z)
- Learn what you can't learn: Regularized Ensembles for Transductive
Out-of-distribution Detection [76.39067237772286]
We show that current out-of-distribution (OOD) detection algorithms for neural networks produce unsatisfactory results in a variety of OOD detection scenarios.
This paper studies how such "hard" OOD scenarios can benefit from adjusting the detection method after observing a batch of the test data.
We propose a novel method that uses an artificial labeling scheme for the test data and regularization to obtain ensembles of models that produce contradictory predictions only on the OOD samples in a test batch.
arXiv Detail & Related papers (2020-12-10T16:55:13Z)
- Robust Out-of-distribution Detection for Neural Networks [51.19164318924997]
We show that existing detection mechanisms can be extremely brittle when evaluated on in-distribution and OOD inputs.
We propose an effective algorithm called ALOE, which performs robust training by exposing the model to both adversarially crafted inlier and outlier examples.
arXiv Detail & Related papers (2020-03-21T17:46:28Z)
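For the CRoFT entry above, a minimal sketch of an energy-magnitude regularizer, assuming the standard free-energy score E(x) = -T * logsumexp(f(x) / T); the penalty weight lam and temperature T are illustrative assumptions, not values from that paper.

```python
# Hypothetical sketch: penalize |E(x)| on training data, where
# E(x) = -T * logsumexp(f(x) / T) is the standard free-energy score.
import torch
import torch.nn.functional as F

def energy_score(logits: torch.Tensor, T: float = 1.0) -> torch.Tensor:
    # Free-energy score per sample; lower energy = more "in-distribution".
    return -T * torch.logsumexp(logits / T, dim=-1)

def croft_style_loss(logits: torch.Tensor, targets: torch.Tensor,
                     lam: float = 0.1) -> torch.Tensor:
    ce = F.cross_entropy(logits, targets)
    reg = energy_score(logits).abs().mean()  # magnitude of energy scores
    return ce + lam * reg
```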
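For the WiSE-FT entry above, a minimal sketch of weight-space ensembling by linear interpolation between a pre-trained and a fine-tuned checkpoint; the helper name wise_ft and the default coefficient alpha are illustrative. Setting alpha = 0 recovers the pre-trained model and alpha = 1 the fine-tuned one; WiSE-FT sweeps values in between.

```python
# Minimal sketch of WiSE-FT-style weight-space ensembling: interpolate the
# floating-point entries of two state dicts and load the result.
import copy
import torch

def wise_ft(pretrained: torch.nn.Module, finetuned: torch.nn.Module,
            alpha: float = 0.5) -> torch.nn.Module:
    sd_pre = pretrained.state_dict()
    sd_ft = finetuned.state_dict()
    merged = {
        k: ((1 - alpha) * sd_pre[k] + alpha * sd_ft[k]
            if sd_pre[k].is_floating_point() else sd_ft[k])
        for k in sd_pre
    }
    model = copy.deepcopy(pretrained)
    model.load_state_dict(merged)
    return model
```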
This list is automatically generated from the titles and abstracts of the papers on this site.