An Analysis of the Adaptation Speed of Causal Models
- URL: http://arxiv.org/abs/2005.09136v2
- Date: Thu, 25 Feb 2021 11:48:05 GMT
- Title: An Analysis of the Adaptation Speed of Causal Models
- Authors: Rémi Le Priol, Reza Babanezhad Harikandeh, Yoshua Bengio and Simon Lacoste-Julien
- Abstract summary: Recently, Bengio et al. conjectured that among all candidate models, $G$ is the fastest to adapt from one dataset to another.
We investigate the adaptation speed of cause-effect SCMs using convergence rates from optimization.
Surprisingly, we find situations where the anticausal model is advantaged, falsifying the initial hypothesis.
- Score: 80.77896315374747
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Consider a collection of datasets generated by unknown interventions on an
unknown structural causal model $G$. Recently, Bengio et al. (2020) conjectured
that among all candidate models, $G$ is the fastest to adapt from one dataset
to another, supported by promising experiments. Intuitively, $G$ has fewer
mechanisms to adapt, but this justification is incomplete. Our contribution is
a more thorough analysis of this hypothesis. We investigate the adaptation
speed of cause-effect SCMs. Using convergence rates from stochastic
optimization, we justify that a relevant proxy for adaptation speed is distance
in parameter space after intervention. Applying this proxy to categorical and
normal cause-effect models, we show two results. When the intervention is on
the cause variable, the SCM with the correct causal direction is advantaged by
a large factor. When the intervention is on the effect variable, we
characterize the relative adaptation speed. Surprisingly, we find situations
where the anticausal model is advantaged, falsifying the initial hypothesis.
Code to reproduce experiments is available at
https://github.com/remilepriol/causal-adaptation-speed
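To make the proxy concrete: standard convergence guarantees for (stochastic) gradient methods scale with the initial distance between the parameters and the optimum, so a model whose parameters move less under an intervention is expected to re-adapt faster. Below is a minimal sketch of that comparison for a categorical cause-effect model; it is not the authors' released code (linked above), and the Dirichlet sampling, the Euclidean distance between stacked probability tables (the paper's own parameterization may differ), and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 10  # number of categories for both cause A and effect B (illustrative choice)

# Reference SCM: cause A -> effect B, given by a marginal p(A) and a conditional p(B|A).
p_a = rng.dirichlet(np.ones(K))                  # p(A), shape (K,)
p_b_given_a = rng.dirichlet(np.ones(K), size=K)  # p(B|A), shape (K, K), rows sum to 1

def causal_params(p_a, p_b_given_a):
    """Stack the parameters of the causal factorization p(A) p(B|A) into one vector."""
    return np.concatenate([p_a, p_b_given_a.ravel()])

def anticausal_params(p_a, p_b_given_a):
    """Stack the parameters of the anticausal factorization p(B) p(A|B), via Bayes' rule."""
    joint = p_a[:, None] * p_b_given_a   # p(A, B), shape (K, K)
    p_b = joint.sum(axis=0)              # p(B), shape (K,)
    p_a_given_b = (joint / p_b).T        # p(A|B), rows indexed by B
    return np.concatenate([p_b, p_a_given_b.ravel()])

# Intervention on the cause: draw a fresh marginal p(A), keep the mechanism p(B|A) fixed.
p_a_new = rng.dirichlet(np.ones(K))

# Proxy for adaptation speed: how far each factorization's parameters move.
dist_causal = np.linalg.norm(
    causal_params(p_a_new, p_b_given_a) - causal_params(p_a, p_b_given_a))
dist_anticausal = np.linalg.norm(
    anticausal_params(p_a_new, p_b_given_a) - anticausal_params(p_a, p_b_given_a))

print(f"causal parameter distance:     {dist_causal:.3f}")
print(f"anticausal parameter distance: {dist_anticausal:.3f}")
# Only the p(A) block changes for the causal model, while both p(B) and p(A|B)
# change for the anticausal model, so the causal distance is typically smaller.
```

Consistent with the abstract, an intervention on the cause changes only the p(A) block of the causal factorization, while both blocks of the anticausal factorization change; interventions on the effect do not yield such a clean ordering, which is where the counterexamples to the initial hypothesis arise.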
Related papers
- Hypothesizing Missing Causal Variables with LLMs [55.28678224020973]
We formulate a novel task where the input is a partial causal graph with missing variables, and the output is a hypothesis about the missing variables to complete the partial graph.
We show the strong ability of LLMs to hypothesize the mediation variables between a cause and its effect.
We also observe surprising results where some of the open-source models outperform the closed GPT-4 model.
arXiv Detail & Related papers (2024-09-04T10:37:44Z)
- Adaptation Speed Analysis for Fairness-aware Causal Models [34.116613732724815]
In machine translation tasks, to achieve bidirectional translation between two languages, the source corpus is often also used as the target corpus, which involves training two models in opposite directions.
Which of these models can adapt most quickly to a domain shift is a question of significant importance in many fields.
arXiv Detail & Related papers (2023-08-31T17:36:57Z)
- Hawkes Processes with Delayed Granger Causality [9.664517084506718]
We explicitly model the delayed Granger causal effects based on multivariate Hawkes processes.
We infer the posterior distribution of the time lags and understand how this distribution varies across different scenarios.
We empirically evaluate our model's event prediction and time-lag inference accuracy on synthetic and real data.
arXiv Detail & Related papers (2023-08-11T12:43:43Z)
- Distinguishing Cause from Effect on Categorical Data: The Uniform Channel Model [0.0]
Distinguishing cause from effect using observations of a pair of random variables is a core problem in causal discovery.
We propose a criterion to address the cause-effect problem with categorical variables.
We select as the most likely causal direction the one in which the conditional probability mass function is closer to a uniform channel (UC).
arXiv Detail & Related papers (2023-03-14T13:54:11Z)
- Decoding Causality by Fictitious VAR Modeling [0.0]
We first set up an equilibrium for the cause-effect relations using a fictitious vector autoregressive model.
In the equilibrium, long-run relations are identified from noise, and spurious ones are negligibly close to zero.
We also apply the approach to estimating the causal factors' contribution to climate change.
arXiv Detail & Related papers (2021-11-14T22:43:02Z)
- Variance Minimization in the Wasserstein Space for Invariant Causal Prediction [72.13445677280792]
In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors.
Each of these tests relies on the minimization of a novel loss function that is derived from tools in optimal transport theory.
We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
arXiv Detail & Related papers (2021-10-13T22:30:47Z)
- On the Role of Optimization in Double Descent: A Least Squares Study [30.44215064390409]
We show an excess risk bound for the gradient descent solution of the least squares objective.
We find that in case of noiseless regression, double descent is explained solely by optimization-related quantities.
We empirically explore if our predictions hold for neural networks.
arXiv Detail & Related papers (2021-07-27T09:13:11Z)
- Variational Causal Networks: Approximate Bayesian Inference over Causal Structures [132.74509389517203]
We introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs.
In experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.
arXiv Detail & Related papers (2021-06-14T17:52:49Z)
- Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z)
- Counterfactual Invariance to Spurious Correlations: Why and How to Pass Stress Tests [87.60900567941428]
A 'spurious correlation' is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter.
In machine learning, these have a know-it-when-you-see-it character.
We study stress testing using the tools of causal inference.
arXiv Detail & Related papers (2021-05-31T14:39:38Z)