Improved OOD Generalization via Conditional Invariant Regularizer
- URL: http://arxiv.org/abs/2207.06687v1
- Date: Thu, 14 Jul 2022 06:34:21 GMT
- Title: Improved OOD Generalization via Conditional Invariant Regularizer
- Authors: Mingyang Yi and Ruoyu Wang and Jiachen Sun and Zhenguo Li and Zhi-Ming Ma
- Abstract summary: We show that given a class label, models that are conditionally independent of spurious attributes are OOD generalizable.
Based on this, a metric, Conditional Spurious Variation (CSV), which controls the OOD generalization error, is proposed to measure such conditional independence.
An algorithm with a provable convergence rate is proposed to solve the resulting problem.
- Score: 43.62211060412388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, generalization on out-of-distribution (OOD) data with correlation
shift has attracted great attention. The correlation shift is caused by the
spurious attributes that correlate to the class label, as the correlation
between them may vary in training and test data. For such a problem, we show
that given the class label, the conditionally independent models of spurious
attributes are OOD generalizable. Based on this, a metric, Conditional Spurious
Variation (CSV), which controls the OOD generalization error, is proposed to
measure such conditional independence. To improve OOD generalization, we regularize
the training process with the proposed CSV. Under mild assumptions, our
training objective can be formulated as a nonconvex-concave mini-max problem.
An algorithm with provable convergence rate is proposed to solve the problem.
Extensive empirical results verify our algorithm's efficacy in improving OOD
generalization.
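The abstract does not spell out the CSV formula, so the sketch below only illustrates the underlying idea of conditional invariance regularization: given the class label, penalize how much the model's predictions vary across groups defined by a spurious attribute. The penalty function, the spurious-attribute grouping `a`, and the weight `lam` are illustrative assumptions, not the paper's exact CSV or its mini-max training algorithm.
```python
# Minimal sketch of conditional invariance regularization (an assumption, not the
# paper's exact CSV definition or its nonconvex-concave mini-max solver).
# Given the class label y, it penalizes how much the model's softmax outputs vary
# across groups defined by a spurious attribute a within each class.
import torch
import torch.nn.functional as F

def conditional_variation_penalty(logits: torch.Tensor,
                                  y: torch.Tensor,
                                  a: torch.Tensor) -> torch.Tensor:
    """logits: (N, C) model outputs; y: (N,) class labels; a: (N,) spurious-attribute ids."""
    probs = F.softmax(logits, dim=1)
    penalty = torch.zeros((), device=logits.device)
    for cls in y.unique():
        cls_mask = y == cls
        groups = a[cls_mask].unique()
        if len(groups) < 2:
            continue  # variation is undefined with a single spurious group
        # Mean prediction of each spurious group within this class: (G, C).
        group_means = torch.stack(
            [probs[cls_mask & (a == g)].mean(dim=0) for g in groups])
        class_mean = group_means.mean(dim=0, keepdim=True)
        # Average squared deviation of the group means from the class mean.
        penalty = penalty + ((group_means - class_mean) ** 2).sum(dim=1).mean()
    return penalty

def regularized_loss(logits, y, a, lam=1.0):
    # ERM loss plus the conditional-variation penalty, mirroring at a high level
    # the "regularize the training process with CSV" idea; lam is a hypothetical
    # trade-off weight.
    return F.cross_entropy(logits, y) + lam * conditional_variation_penalty(logits, y, a)
```
In a training loop, `regularized_loss` would simply replace the plain cross-entropy term; the paper's actual method additionally formulates the regularized objective as a nonconvex-concave mini-max problem and solves it with an algorithm that has a provable convergence rate.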
Related papers
- CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection [42.33618249731874]
We show that minimizing the magnitude of energy scores on training data leads to domain-consistent Hessians of classification loss.
We have developed a unified fine-tuning framework that allows for concurrent optimization of both tasks.
arXiv Detail & Related papers (2024-05-26T03:28:59Z)
- Towards Robust Out-of-Distribution Generalization Bounds via Sharpness [41.65692353665847]
We study the effect of sharpness on how a model tolerates data change in domain shift.
We propose a sharpness-based OOD generalization bound by taking robustness into consideration.
arXiv Detail & Related papers (2024-03-11T02:57:27Z) - How Does Unlabeled Data Provably Help Out-of-Distribution Detection? [63.41681272937562]
Leveraging unlabeled in-the-wild data is non-trivial due to the heterogeneity of both in-distribution (ID) and out-of-distribution (OOD) data.
This paper introduces a new learning framework SAL (Separate And Learn) that offers both strong theoretical guarantees and empirical effectiveness.
arXiv Detail & Related papers (2024-02-05T20:36:33Z)
- Mixture Data for Training Cannot Ensure Out-of-distribution Generalization [21.801115344132114]
We show that increasing the size of training data does not always lead to a reduction in the test generalization error.
In this work, we quantitatively redefine OOD data as those situated outside the convex hull of mixed training data.
Our proof of the new risk bound agrees that the efficacy of well-trained models can be guaranteed for unseen data.
arXiv Detail & Related papers (2023-12-25T11:00:38Z)
- Towards Calibrated Robust Fine-Tuning of Vision-Language Models [97.19901765814431]
This work proposes a robust fine-tuning method that improves both OOD accuracy and confidence calibration simultaneously in vision language models.
We show that both OOD classification and OOD calibration errors have a shared upper bound consisting of two terms of ID data.
Based on this insight, we design a novel framework that conducts fine-tuning with a constrained multimodal contrastive loss enforcing a larger smallest singular value.
arXiv Detail & Related papers (2023-11-03T05:41:25Z)
- Improving Out-of-Distribution Generalization by Adversarial Training with Structured Priors [17.936426699670864]
We show that sample-wise Adversarial Training (AT) has limited improvement on Out-of-Distribution (OOD) generalization.
We propose two AT variants with low-rank structures to train OOD-robust models.
Our proposed approaches outperform Empirical Risk Minimization (ERM) and sample-wise AT.
arXiv Detail & Related papers (2022-10-13T07:37:42Z)
- Training OOD Detectors in their Natural Habitats [31.565635192716712]
Out-of-distribution (OOD) detection is important for machine learning models deployed in the wild.
Recent methods use auxiliary outlier data to regularize the model for improved OOD detection.
We propose a novel framework that leverages wild mixture data, which naturally consists of both ID and OOD samples.
arXiv Detail & Related papers (2022-02-07T15:38:39Z)
- Exploring Covariate and Concept Shift for Detection and Calibration of Out-of-Distribution Data [77.27338842609153]
Our characterization reveals that sensitivity to each type of shift is important for the detection and confidence calibration of OOD data.
We propose a geometrically-inspired method to improve OOD detection under both shifts with only in-distribution data.
We are the first to propose a method that works well across both OOD detection and calibration and under different types of shifts.
arXiv Detail & Related papers (2021-10-28T15:42:55Z)
- CASTLE: Regularization via Auxiliary Causal Graph Discovery [89.74800176981842]
We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables.
CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features.
arXiv Detail & Related papers (2020-09-28T09:49:38Z)
- MUTANT: A Training Paradigm for Out-of-Distribution Generalization in Visual Question Answering [58.30291671877342]
We present MUTANT, a training paradigm that exposes the model to perceptually similar, yet semantically distinct mutations of the input.
MUTANT establishes a new state-of-the-art accuracy on VQA-CP with a 10.57% improvement.
arXiv Detail & Related papers (2020-09-18T00:22:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.