Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal
Approach
- URL: http://arxiv.org/abs/2312.09758v1
- Date: Fri, 15 Dec 2023 12:58:05 GMT
- Title: Diagnosing and Rectifying Fake OOD Invariance: A Restructured Causal
Approach
- Authors: Ziliang Chen, Yongsen Zheng, Zhao-Rong Lai, Quanlong Guan, Liang Lin
- Abstract summary: Invariant representation learning (IRL) encourages the prediction from invariant causal features to labels de-confounded from the environments.
Recent theoretical results verify that some causal features recovered by IRL methods merely appear domain-invariant in the training environments but fail in unseen domains.
We develop an approach based on conditional mutual information with respect to the RS-SCM that rigorously rectifies the spurious and fake invariant effects.
- Score: 51.012396632595554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Invariant representation learning (IRL) encourages the prediction from
invariant causal features to labels de-confounded from the environments,
advancing the technical roadmap of out-of-distribution (OOD) generalization.
Despite the attention it has received, recent theoretical results verify that
some causal features recovered by IRL methods merely appear domain-invariant in
the training environments but fail in unseen domains. This \emph{fake
invariance} severely endangers OOD generalization, since no trustworthy
objective can diagnose it and existing causal surgeries cannot rectify it. In
this paper, we review an IRL family (InvRat) under the Partially and Fully
Informative Invariant Feature Structural Causal Models (PIIF SCM / FIIF SCM),
respectively, to certify their weaknesses in representing fake invariant
features; we then unify their causal diagrams to propose the ReStructured SCM
(RS-SCM). The RS-SCM can ideally rebuild the spurious and the fake invariant
features simultaneously. Given this, we further develop an approach based on
conditional mutual information with respect to the RS-SCM that rigorously
rectifies the spurious and fake invariant effects. It can be easily implemented
by a small feature-selection subnet introduced into the IRL family, which is
alternately optimized to achieve our goal. Experiments verify the superiority
of our approach in combating the fake invariance issue across a variety of OOD
generalization benchmarks.
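To make the mechanism concrete, here is a minimal PyTorch sketch of the stated idea: a small feature-selection subnet gates the encoder's features and is optimized in alternation with the main network. The variance-of-risk penalty below is only a simple stand-in for the paper's conditional-mutual-information objective, and all names, sizes, and hyperparameters are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureSelector(nn.Module):
    """Small subnet that learns a soft gate per feature dimension."""
    def __init__(self, dim):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(dim))

    def forward(self, z):
        return z * torch.sigmoid(self.logits)  # gate each feature in (0, 1)

def risk_variance_penalty(logits_per_env, labels_per_env):
    # Stand-in for the CMI-based rectification: penalize how much the
    # risk varies across environments (a V-REx-style surrogate).
    risks = torch.stack([F.cross_entropy(lo, y)
                         for lo, y in zip(logits_per_env, labels_per_env)])
    return risks.var()

# Toy setup: three synthetic "environments".
dim, n_cls = 64, 10
encoder = nn.Sequential(nn.Linear(32, dim), nn.ReLU())
selector = FeatureSelector(dim)
head = nn.Linear(dim, n_cls)
opt_main = torch.optim.Adam(list(encoder.parameters()) + list(head.parameters()))
opt_sel = torch.optim.Adam(selector.parameters())
envs = [(torch.randn(16, 32), torch.randint(0, n_cls, (16,))) for _ in range(3)]

for step in range(10):
    # Alternating optimization: even steps update encoder/head,
    # odd steps update the selection gates.
    opt = opt_main if step % 2 == 0 else opt_sel
    logits = [head(selector(encoder(x))) for x, _ in envs]
    labels = [y for _, y in envs]
    erm = torch.stack([F.cross_entropy(lo, y)
                       for lo, y in zip(logits, labels)]).mean()
    loss = erm + risk_variance_penalty(logits, labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
```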
Related papers
- Dissecting the Failure of Invariant Learning on Graphs [36.11431280689549]
We develop a Structural Causal Model (SCM) to theoretically dissect the performance of two prominent invariant learning methods.
We propose Cross-environment Intra-class Alignment (CIA), which explicitly eliminates spurious features by aligning cross-environment representations conditioned on the same class.
We further propose CIA-LRA (Localized Reweighting Alignment) that leverages the distribution of neighboring labels to selectively align node representations.
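A minimal sketch of the alignment idea, assuming CIA matches class-conditional feature means across environments (the paper's exact formulation may differ); all tensors and names here are illustrative:

```python
import torch

def cia_alignment_loss(feats, envs, labels, n_classes):
    """Align class-conditional feature means across environments
    (illustrative surrogate for CIA)."""
    loss, n_pairs = feats.new_zeros(()), 0
    for c in range(n_classes):
        means = []
        for e in envs.unique():
            mask = (labels == c) & (envs == e)
            if mask.any():
                means.append(feats[mask].mean(dim=0))
        # Penalize pairwise distances between per-environment class means.
        for i in range(len(means)):
            for j in range(i + 1, len(means)):
                loss = loss + (means[i] - means[j]).pow(2).sum()
                n_pairs += 1
    return loss / max(n_pairs, 1)

# Usage with random tensors standing in for node representations.
feats = torch.randn(64, 16)
envs = torch.randint(0, 3, (64,))
labels = torch.randint(0, 5, (64,))
print(cia_alignment_loss(feats, envs, labels, n_classes=5))
```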
arXiv Detail & Related papers (2024-11-05T06:36:48Z) - Enlarging Feature Support Overlap for Domain Generalization [9.227839292188346]
Invariant risk minimization (IRM) addresses this issue by learning invariant features and minimizing the risk across different domains.
We propose a novel method to enlarge feature support overlap for domain generalization.
Specifically, we introduce Bayesian random data augmentation to increase sample diversity and overcome the deficiency of IRM.
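For reference, here is a sketch of the standard IRMv1 penalty that this line of work builds on, plus a hypothetical Gaussian perturbation standing in for the paper's Bayesian random data augmentation (the actual augmentation scheme is not specified in the blurb):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def irm_penalty(logits, y):
    """Standard IRMv1 penalty: squared gradient of the per-environment
    risk with respect to a fixed dummy classifier scale of 1.0."""
    scale = torch.ones(1, requires_grad=True)
    loss = F.cross_entropy(logits * scale, y)
    (grad,) = torch.autograd.grad(loss, scale, create_graph=True)
    return grad.pow(2).sum()

def bayesian_augment(x, sigma=0.1):
    # Hypothetical stand-in for the paper's Bayesian random data
    # augmentation: perturb inputs with sampled Gaussian noise.
    return x + sigma * torch.randn_like(x)

def irm_objective(model, envs, lam=10.0):
    erm, pen = 0.0, 0.0
    for x, y in envs:
        logits = model(bayesian_augment(x))
        erm = erm + F.cross_entropy(logits, y)
        pen = pen + irm_penalty(logits, y)
    return (erm + lam * pen) / len(envs)

# Usage on toy data.
model = nn.Linear(32, 10)
envs = [(torch.randn(16, 32), torch.randint(0, 10, (16,))) for _ in range(3)]
print(irm_objective(model, envs))
```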
arXiv Detail & Related papers (2024-07-08T09:16:42Z) - Winning Prize Comes from Losing Tickets: Improve Invariant Learning by
Exploring Variant Parameters for Out-of-Distribution Generalization [76.27711056914168]
Out-of-Distribution (OOD) Generalization aims to learn robust models that generalize well to various environments without fitting to distribution-specific features.
Recent studies based on the Lottery Ticket Hypothesis (LTH) address this problem by minimizing the learning target to find the subset of parameters that are critical to the task.
We propose Exploring Variant parameters for Invariant Learning (EVIL), which also leverages distribution knowledge to find the parameters that are sensitive to distribution shift.
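A hedged sketch of the variant-parameter idea, assuming the variance of per-environment gradients as the sensitivity criterion (EVIL's actual criterion may differ):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def gradient_variance_scores(model, envs):
    """Score each weight by the variance of its gradient across
    environments; high variance is treated as 'variant'."""
    per_env = []
    for x, y in envs:
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        per_env.append([p.grad.detach().clone() for p in model.parameters()])
    return [torch.stack(g).var(dim=0) for g in zip(*per_env)]

def invariant_mask(scores, keep_ratio=0.8):
    """Keep the fraction of weights with the lowest gradient variance."""
    flat = torch.cat([s.flatten() for s in scores])
    k = max(1, int(keep_ratio * flat.numel()))
    thresh = flat.kthvalue(k).values
    return [(s <= thresh).float() for s in scores]

# Usage: mask out shift-sensitive weights of a toy model.
model = nn.Linear(32, 10)
envs = [(torch.randn(16, 32), torch.randint(0, 10, (16,))) for _ in range(3)]
masks = invariant_mask(gradient_variance_scores(model, envs))
with torch.no_grad():
    for p, m in zip(model.parameters(), masks):
        p.mul_(m)  # zero out the 'variant' parameters
```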
arXiv Detail & Related papers (2023-10-25T06:10:57Z) - Out-of-distribution Generalization with Causal Invariant Transformations [17.18953986654873]
In this work, we tackle the OOD problem without explicitly recovering the causal feature.
Under the setting of an invariant causal mechanism, we theoretically show that if all such causal invariant transformations are available, then we can learn a minimax optimal model.
Noticing that knowing a complete set of these causal invariant transformations may be impractical, we further show that it suffices to know only a subset of these transformations.
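A minimal sketch of how such transformations could enter training, assuming a simple prediction-consistency penalty under a user-supplied subset of transformations (the paper's construction may differ; which transformations qualify is domain knowledge and an assumption here):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def consistency_loss(model, x, transforms):
    """Penalize prediction changes under each supplied transformation,
    via a symmetrized KL between original and transformed predictions."""
    base = F.log_softmax(model(x), dim=-1)
    loss = x.new_zeros(())
    for t in transforms:
        out = F.log_softmax(model(t(x)), dim=-1)
        loss = loss + 0.5 * (
            F.kl_div(out, base.exp(), reduction="batchmean")
            + F.kl_div(base, out.exp(), reduction="batchmean"))
    return loss / len(transforms)

# Usage: a single hypothetical invariant transformation (feature flip).
model = nn.Linear(32, 10)
x = torch.randn(16, 32)
transforms = [lambda v: torch.flip(v, dims=[-1])]
print(consistency_loss(model, x, transforms))
```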
arXiv Detail & Related papers (2022-03-22T08:04:38Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
The Variational Autoencoder (VAE) approximates the posterior of latent variables via amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
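For context, a minimal VAE with amortized inference; DU-VAE's diversity- and uncertainty-aware regularizers are its contribution and are not reproduced here, since the blurb does not specify them:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    """Minimal VAE with amortized inference. DU-VAE's diversity- and
    uncertainty-aware regularizers would be added on top of this loss."""
    def __init__(self, d_in=32, d_z=8):
        super().__init__()
        self.enc = nn.Linear(d_in, 2 * d_z)
        self.dec = nn.Linear(d_z, d_in)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + (0.5 * logvar).exp() * torch.randn_like(mu)  # reparameterize
        recon = F.mse_loss(self.dec(z), x)
        kld = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        return recon + kld  # negative ELBO up to constants

print(TinyVAE()(torch.randn(16, 32)))
```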
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - CC-Cert: A Probabilistic Approach to Certify General Robustness of
Neural Networks [58.29502185344086]
In safety-critical machine learning applications, it is crucial to defend models against adversarial attacks.
It is important to provide provable guarantees for deep learning models against semantically meaningful input transformations.
We propose a new universal probabilistic certification approach based on Chernoff-Cramer bounds.
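A sketch of the underlying Chernoff bound, evaluated on samples of a robustness statistic; replacing the true moment-generating function with an empirical mean is an approximation that CC-Cert's actual certificate treats rigorously:

```python
import torch

def chernoff_upper_bound(samples, threshold):
    """Upper-bound P(X >= threshold) by min over t > 0 of
    exp(-t * threshold) * E[exp(t * X)], with the expectation
    replaced by an empirical mean over the samples."""
    bounds = [(torch.exp(t * samples).mean() * torch.exp(-t * threshold)).item()
              for t in torch.logspace(-3, 1, steps=50)]
    return min(min(bounds), 1.0)

# Usage: samples of some robustness statistic gathered under random
# semantic transformations of the input (placeholder values here).
samples = 0.2 * torch.randn(1000)
print(chernoff_upper_bound(samples, threshold=1.0))
```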
arXiv Detail & Related papers (2021-09-22T12:46:04Z) - Nonlinear Invariant Risk Minimization: A Causal Approach [5.63479133344366]
We propose a learning paradigm that enables out-of-distribution generalization in the nonlinear setting.
We show identifiability of the data representation up to very simple transformations.
Extensive experiments on both synthetic and real-world datasets show that our approach significantly outperforms a variety of baseline methods.
arXiv Detail & Related papers (2021-02-24T15:38:41Z) - The Risks of Invariant Risk Minimization [52.7137956951533]
Invariant Risk Minimization is an objective based on the idea of learning deep, invariant features of data.
We present the first analysis of classification under the IRM objective, as well as recently proposed alternatives, under a fairly natural and general model.
We show that IRM can fail catastrophically unless the test data are sufficiently similar to the training distribution, which is precisely the issue it was intended to solve.
arXiv Detail & Related papers (2020-10-12T14:54:32Z) - Learning Invariant Representations and Risks for Semi-supervised Domain
Adaptation [109.73983088432364]
We propose the first method that aims to simultaneously learn invariant representations and risks under the setting of semi-supervised domain adaptation (Semi-DA).
We introduce the LIRR algorithm for jointly Learning Invariant Representations and Risks.
arXiv Detail & Related papers (2020-10-09T15:42:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.