Generalizable Information Theoretic Causal Representation
- URL: http://arxiv.org/abs/2202.08388v1
- Date: Thu, 17 Feb 2022 00:38:35 GMT
- Title: Generalizable Information Theoretic Causal Representation
- Authors: Mengyue Yang, Xinyu Cai, Furui Liu, Xu Chen, Zhitang Chen, Jianye Hao,
Jun Wang
- Abstract summary: We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, based on which we deduce a theoretical guarantee that the causality-inspired learning has reduced sample complexity and better generalization ability.
- Score: 37.54158138447033
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is evidence that representation learning can improve a
model's performance on multiple downstream tasks in many real-world scenarios,
such as image classification and recommender systems. Existing learning
approaches rely on establishing the correlation (or its proxy) between
features and the downstream task (labels), which typically results in a
representation containing the cause, effect, and spuriously correlated
variables of the label. Its generalizability may deteriorate because of the
instability of the non-causal parts. In this paper, we propose to learn causal
representations from observational data by regularizing the learning procedure
with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, from which we deduce a
theoretical guarantee that the causality-inspired learning has reduced sample
complexity and better generalization ability. Extensive experiments show that
models trained on the causal representations learned by our approach are
robust under adversarial attacks and distribution shift.
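To make the abstract's objective concrete, here is a minimal sketch of mutual-information-regularized representation learning with a counterfactual consistency term, assuming a MINE-style lower bound as the mutual information measure. The encoder split, loss weights, and all names are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MILowerBound(nn.Module):
    """Donsker-Varadhan (MINE-style) lower bound on I(Z; Y)."""
    def __init__(self, z_dim, y_dim, hidden=128):
        super().__init__()
        self.critic = nn.Sequential(
            nn.Linear(z_dim + y_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, z, y):
        joint = self.critic(torch.cat([z, y], dim=1)).mean()
        y_perm = y[torch.randperm(y.size(0))]        # samples from the marginal
        log_n = torch.log(torch.tensor(float(y.size(0))))
        marginal = self.critic(torch.cat([z, y_perm], dim=1)).logsumexp(0) - log_n
        return joint - marginal                      # lower-bounds I(Z; Y)

def causal_rep_loss(encoder, head, mi_c, mi_s, x, y, n_cls,
                    lam_mi=0.1, lam_cf=0.1):
    z_c, z_s = encoder(x).chunk(2, dim=1)  # hypothesized causal/spurious split
    logits = head(torch.cat([z_c, z_s], dim=1))
    task = F.cross_entropy(logits, y)

    y1 = F.one_hot(y, n_cls).float()
    # keep label information in z_c, squeeze it out of z_s
    mi_term = -mi_c(z_c, y1) + mi_s(z_s, y1)

    # counterfactual consistency: swapping the non-causal factors across
    # samples (a surrogate intervention) should not change the prediction
    z_s_cf = z_s[torch.randperm(z_s.size(0))]
    logits_cf = head(torch.cat([z_c, z_s_cf], dim=1))
    cf = F.kl_div(F.log_softmax(logits_cf, dim=1),
                  F.softmax(logits, dim=1).detach(), reduction="batchmean")
    return task + lam_mi * mi_term + lam_cf * cf
```

The sketch treats swapping the non-causal half of the representation across samples as a surrogate intervention; the paper's actual counterfactual loss is defined by its hypothetical causal graph, which this sketch does not reproduce.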
Related papers
- Revisiting Spurious Correlation in Domain Generalization [12.745076668687748]
We build a structural causal model (SCM) to describe the causality within data generation process.
We further conduct a thorough analysis of the mechanisms underlying spurious correlation.
In this regard, we propose to control confounding bias in OOD generalization by introducing a propensity score weighted estimator.
arXiv Detail & Related papers (2024-06-17T13:22:00Z)
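For the propensity-score weighted estimator mentioned above, a textbook inverse-propensity-weighting (IPW) sketch conveys the idea: down-weight samples whose confounding attribute is well predicted by the covariates. This is generic IPW over an assumed binary attribute `a`, not the paper's exact estimator.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def ipw_weighted_loss(X, a, losses):
    """IPW-adjusted mean loss.

    X: covariates (n, d); a: binary confounding attribute (n,);
    losses: per-sample losses (n,).
    """
    # P(a = 1 | x), estimated with a simple logistic model
    propensity = LogisticRegression().fit(X, a).predict_proba(X)[:, 1]
    p_a = np.where(a == 1, propensity, 1.0 - propensity)  # P(a_i | x_i)
    w = 1.0 / np.clip(p_a, 1e-3, None)                    # clipped IPW weights
    return float(np.sum(w * losses) / np.sum(w))
```

Clipping the propensities is a common stabilization choice; without it, rare attribute-covariate combinations receive unboundedly large weights.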
- Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z)
- Specify Robust Causal Representation from Mixed Observations [35.387451486213344]
Learning representations purely from observations concerns the problem of learning a low-dimensional, compact representation which is beneficial to prediction models.
We develop a learning method to learn such representations from observational data by regularizing the learning procedure with mutual information measures.
We theoretically and empirically show that the models trained with the learned causal representations are more robust under adversarial attacks and distribution shifts.
arXiv Detail & Related papers (2023-10-21T02:18:35Z)
- Inducing Causal Structure for Abstractive Text Summarization [76.1000380429553]
We introduce a Structural Causal Model (SCM) to induce the underlying causal structure of the summarization data.
We propose a Causality Inspired Sequence-to-Sequence model (CI-Seq2Seq) to learn the causal representations that can mimic the causal factors.
Experimental results on two widely used text summarization datasets demonstrate the advantages of our approach.
arXiv Detail & Related papers (2023-08-24T16:06:36Z)
- A Causal Ordering Prior for Unsupervised Representation Learning [27.18951912984905]
Causal representation learning argues that factors of variation in a dataset are, in fact, causally related.
We propose a fully unsupervised representation learning method that considers a data generation process with a latent additive noise model.
arXiv Detail & Related papers (2023-07-11T18:12:05Z)
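The latent additive noise model referenced above can be illustrated with a toy data-generating process: each latent is a nonlinear function of its causal predecessors plus independent noise, and only a nonlinear mixture is observed. The functions, dimensions, and mixing below are arbitrary choices for the sketch, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# causal ordering z1 -> z2 -> z3; each latent = f(parents) + independent noise
z1 = rng.normal(size=n)
z2 = np.tanh(1.5 * z1) + 0.3 * rng.normal(size=n)      # z2 = f2(z1) + e2
z3 = np.sin(z2) + 0.5 * z1 + 0.3 * rng.normal(size=n)  # z3 = f3(z1, z2) + e3
latents = np.stack([z1, z2, z3], axis=1)

# the learner only observes a nonlinear mixture x of the latents
mixing = rng.normal(size=(3, 10))
x = np.tanh(latents @ mixing)
```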
- Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
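As a simplified stand-in for the neural quantile-regression approach above: fit conditional quantile models of the outcome over a grid of levels, find the level that reproduces the factual outcome, and evaluate the alternative treatment at that same level. The function name and gradient-boosting backend are assumptions, not the paper's implementation.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def quantile_counterfactual(X, t, y, x_star, t_fact, y_fact, t_cf,
                            taus=np.linspace(0.05, 0.95, 19)):
    """Counterfactual outcome for one unit via quantile preservation."""
    feats = np.column_stack([X, t])
    models = {tau: GradientBoostingRegressor(loss="quantile", alpha=tau,
                                             n_estimators=100).fit(feats, y)
              for tau in taus}
    f_fact = np.append(x_star, t_fact).reshape(1, -1)
    f_cf = np.append(x_star, t_cf).reshape(1, -1)
    # quantile level whose prediction best matches the observed factual outcome
    tau_star = min(taus,
                   key=lambda tau: abs(models[tau].predict(f_fact)[0] - y_fact))
    # read off the counterfactual input at that same quantile level
    return models[tau_star].predict(f_cf)[0]
```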
- Causal Transportability for Visual Recognition [70.13627281087325]
We show that standard classifiers fail because the association between images and labels is not transportable across settings.
We then show that the causal effect, which severs all sources of confounding, remains invariant across domains.
This motivates us to develop an algorithm to estimate the causal effect for image classification.
arXiv Detail & Related papers (2022-04-26T15:02:11Z)
- Towards Robust and Adaptive Motion Forecasting: A Causal Representation Perspective [72.55093886515824]
We introduce a causal formalism of motion forecasting, which casts the problem as a dynamic process with three groups of latent variables.
We devise a modular architecture that factorizes the representations of invariant mechanisms and style confounders to approximate a causal graph.
Experiment results on synthetic and real datasets show that our three proposed components significantly improve the robustness and reusability of the learned motion representations.
arXiv Detail & Related papers (2021-11-29T18:59:09Z)
- Counterfactual Adversarial Learning with Representation Interpolation [11.843735677432166]
We introduce a Counterfactual Adversarial Training (CAT) framework to tackle the problem from a causality perspective.
Experiments demonstrate that CAT achieves substantial performance improvement over SOTA across different downstream tasks.
arXiv Detail & Related papers (2021-09-10T09:23:08Z)
- A Meta Learning Approach to Discerning Causal Graph Structure [1.52292571922932]
We explore the usage of meta-learning to derive the causal direction between variables by optimizing over a measure of distribution simplicity.
We incorporate a graph representation which includes latent variables and allows for more generalizability and graph structure expression.
Our model is able to learn causal direction indicators for complex graph structures despite effects of latent confounders.
arXiv Detail & Related papers (2021-06-06T22:44:44Z)
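The meta-learning procedure above is not detailed in the summary; as a classic point of comparison, the additive-noise-model (ANM) direction test below fits regressions in both directions and prefers the one whose residuals look independent of the putative cause. This is a standard bivariate heuristic with a crude dependence proxy, not the paper's method.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def anm_direction(x, y):
    """Return 'x->y' or 'y->x' via a crude residual-dependence score."""
    def dep(a, b):
        # regress b on a, then measure (a simple proxy for) dependence
        # between a and the residuals; independent residuals favor a -> b
        pred = RandomForestRegressor(n_estimators=50, random_state=0).fit(
            a.reshape(-1, 1), b).predict(a.reshape(-1, 1))
        return abs(np.corrcoef(np.tanh(a), b - pred)[0, 1])
    return "x->y" if dep(x, y) < dep(y, x) else "y->x"
```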
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
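A hedged sketch of the gradient-supervision idea described above: for a minimally-different pair (x, x_cf) with labels (y, y_cf), penalize misalignment between the network's input-gradient and the factual-to-counterfactual direction. The loss weighting and the choice of which score to differentiate are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def gradient_supervision_loss(model, x, y, x_cf, y_cf, lam=0.1):
    x = x.requires_grad_(True)
    logits = model(x)
    task = F.cross_entropy(logits, y) + F.cross_entropy(model(x_cf), y_cf)

    # gradient of the counterfactual-class score with respect to the input
    score = logits.gather(1, y_cf.unsqueeze(1)).sum()
    grad = torch.autograd.grad(score, x, create_graph=True)[0]

    # align the input-gradient with the factual -> counterfactual direction
    direction = (x_cf - x).detach()
    cos = F.cosine_similarity(grad.flatten(1), direction.flatten(1), dim=1)
    return task + lam * (1.0 - cos).mean()
```

`create_graph=True` lets the alignment term backpropagate into the model parameters, which is what makes the input-gradient itself trainable.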