Towards Robust Out-of-Distribution Generalization Bounds via Sharpness
- URL: http://arxiv.org/abs/2403.06392v1
- Date: Mon, 11 Mar 2024 02:57:27 GMT
- Title: Towards Robust Out-of-Distribution Generalization Bounds via Sharpness
- Authors: Yingtian Zou, Kenji Kawaguchi, Yingnan Liu, Jiashuo Liu, Mong-Li Lee, Wynne Hsu
- Abstract summary: We study the effect of sharpness on how a model tolerates data change in domain shift.
We propose a sharpness-based OOD generalization bound by taking robustness into consideration.
- Score: 41.65692353665847
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalizing to out-of-distribution (OOD) data or unseen domain, termed OOD
generalization, still lacks appropriate theoretical guarantees. Canonical OOD
bounds focus on different distance measurements between source and target
domains but fail to consider the optimization property of the learned model. As
empirically shown in recent work, the sharpness of learned minima influences
OOD generalization. To bridge this gap between optimization and OOD
generalization, we study the effect of sharpness on how a model tolerates data
change in domain shift which is usually captured by "robustness" in
generalization. In this paper, we give a rigorous connection between sharpness
and robustness, which gives better OOD guarantees for robust algorithms. It
also provides a theoretical backing for "flat minima leads to better OOD
generalization". Overall, we propose a sharpness-based OOD generalization bound
by taking robustness into consideration, resulting in a tighter bound than
non-robust guarantees. Our findings are supported by the experiments on a ridge
regression model, as well as the experiments on deep learning classification
tasks.
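The abstract names sharpness of the learned minimum and a ridge regression experiment without showing how sharpness might be measured. The sketch below is a generic illustration only, not the paper's definition or experimental setup: it assumes a mean-squared-error loss, takes the largest Hessian eigenvalue as one common sharpness proxy, and estimates a perturbation-based sharpness by sampling random directions of radius rho. All function names, the synthetic data, and the parameter values are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form minimizer of (1/n)||Xw - y||^2 + lam * ||w||^2."""
    n, d = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(d), X.T @ y / n)

def mse(X, y, w):
    """Mean squared error of the linear predictor w on (X, y)."""
    r = X @ w - y
    return float(r @ r / len(y))

def hessian_top_eig(X):
    """Largest eigenvalue of the MSE Hessian (2/n) X^T X (assumed sharpness proxy)."""
    H = 2.0 * X.T @ X / X.shape[0]
    return float(np.linalg.eigvalsh(H)[-1])

def sampled_sharpness(X, y, w, rho=0.1, n_dirs=200, seed=0):
    """Estimate max over ||e|| = rho of L(w + e) - L(w) from random directions."""
    rng = np.random.default_rng(seed)
    base = mse(X, y, w)
    dirs = rng.normal(size=(n_dirs, w.size))
    dirs *= rho / np.linalg.norm(dirs, axis=1, keepdims=True)
    return max(mse(X, y, w + e) - base for e in dirs)

# Hypothetical source domain and a covariate-shifted target domain.
rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X_src = rng.normal(size=(200, 3))
y_src = X_src @ w_true + 0.1 * rng.normal(size=200)
X_tgt = 1.5 * rng.normal(size=(200, 3)) + 0.5
y_tgt = X_tgt @ w_true + 0.1 * rng.normal(size=200)

w_hat = ridge_fit(X_src, y_src, lam=0.1)
print("top Hessian eigenvalue:", hessian_top_eig(X_src))
print("sampled sharpness:", sampled_sharpness(X_src, y_src, w_hat))
print("source MSE:", mse(X_src, y_src, w_hat), "target MSE:", mse(X_tgt, y_tgt, w_hat))
```

Comparing the printed source and target errors alongside the two sharpness values gives a rough feel for the quantities the abstract relates; it does not reproduce the bound itself.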
Related papers
- The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection [75.65876949930258]
Out-of-distribution (OOD) detection is essential for model trustworthiness.
We show that the superior OOD detection performance of state-of-the-art methods is achieved by secretly sacrificing the OOD generalization ability.
arXiv Detail & Related papers (2024-10-12T07:02:04Z)
- CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection [42.33618249731874]
We show that minimizing the magnitude of energy scores on training data leads to domain-consistent Hessians of classification loss.
We have developed a unified fine-tuning framework that allows for concurrent optimization of both tasks.
arXiv Detail & Related papers (2024-05-26T03:28:59Z)
- A Survey on Evaluation of Out-of-Distribution Generalization [41.39827887375374]
Out-of-Distribution (OOD) generalization is a complex and fundamental problem.
This paper serves as the first effort to conduct a comprehensive review of OOD evaluation.
We categorize existing research into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization.
arXiv Detail & Related papers (2024-03-04T09:30:35Z)
- Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision language models.
We show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms of ID data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z)
- Improved OOD Generalization via Conditional Invariant Regularizer [43.62211060412388]
We show that, given a class label, models that are conditionally independent of spurious attributes are OOD generalizable.
Based on this, a metric, Conditional Spurious Variation (CSV), which controls the OOD generalization error, is proposed to measure such conditional independence.
An algorithm with a provable convergence rate is proposed to solve the problem.
arXiv Detail & Related papers (2022-07-14T06:34:21Z)
- Towards a Theoretical Framework of Out-of-Distribution Generalization [28.490842160921805]
Generalization to out-of-distribution (OOD) data, or domain generalization, is one of the central problems in modern machine learning.
In this work, we take the first step towards rigorous and quantitative definitions of what is OOD; and what does it mean by saying an OOD problem is learnable.
arXiv Detail & Related papers (2021-06-08T16:32:23Z)
- Provably Robust Detection of Out-of-distribution Data (almost) for free [124.14121487542613]
Deep neural networks are known to produce highly overconfident predictions on out-of-distribution (OOD) data.
In this paper we propose a novel method where from first principles we combine a certifiable OOD detector with a standard classifier into an OOD aware classifier.
In this way we achieve the best of two worlds: certifiably adversarially robust OOD detection, even for OOD samples close to the in-distribution, without loss in prediction accuracy and close to state-of-the-art OOD detection performance for non-manipulated OOD data.
arXiv Detail & Related papers (2021-06-08T11:40:49Z)
- Improved OOD Generalization via Adversarial Training and Pre-training [49.08683910076778]
In this paper, we theoretically show that a model robust to input perturbations generalizes well on OOD data.
Inspired by previous findings that adversarial training helps improve input-robustness, we show that adversarially trained models have converged excess risk on OOD data.
arXiv Detail & Related papers (2021-05-24T08:06:35Z)
- ATOM: Robustifying Out-of-distribution Detection Using Outlier Mining [51.19164318924997]
Adversarial Training with informative Outlier Mining (ATOM) improves the robustness of OOD detection.
ATOM achieves state-of-the-art performance under a broad family of classic and adversarial OOD evaluation tasks.
arXiv Detail & Related papers (2020-06-26T20:58:05Z)