On the Generalization and Adaption Performance of Causal Models
- URL: http://arxiv.org/abs/2206.04620v1
- Date: Thu, 9 Jun 2022 17:12:32 GMT
- Title: On the Generalization and Adaption Performance of Causal Models
- Authors: Nino Scherrer, Anirudh Goyal, Stefan Bauer, Yoshua Bengio, Nan Rosemary Ke
- Abstract summary: Recent advances in differentiable causal discovery propose to factorize the data generating process into a set of modules.
We study the generalization and adaptation performance of such modular neural causal models.
Our analysis shows that modular neural causal models outperform other models on both zero- and few-shot adaptation in low data regimes.
- Score: 99.64022680811281
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning models that offer robust out-of-distribution generalization and fast adaptation is a key challenge in modern machine learning. Building causal structure into neural networks holds the promise of accomplishing robust zero- and few-shot adaptation. Recent advances in differentiable causal discovery have proposed to factorize the data generating process into a set of modules, i.e., one module for the conditional distribution of every variable, where only causal parents are used as predictors. Such a modular decomposition of knowledge enables adaptation to distribution shifts by updating only a subset of parameters. In this work, we systematically study the generalization and adaptation performance of such modular neural causal models by comparing them to monolithic models and to structured models where the set of predictors is not constrained to causal parents. Our analysis shows that modular neural causal models outperform the other models on both zero- and few-shot adaptation in low data regimes and offer robust generalization. We also find that the effects are more significant for sparser graphs than for denser graphs.
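To make the modular decomposition concrete, the following is a minimal sketch of the idea rather than the authors' implementation: one small network per variable, each conditioned only on that variable's causal parents, with adaptation to an intervention confined to the affected module. The three-variable chain graph, module sizes, Gaussian likelihood, and single-step update are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ModularCausalModel(nn.Module):
    """One conditional module per variable; each module sees only that
    variable's causal parents (a hypothetical 3-variable chain here)."""

    def __init__(self, parents, hidden=32):
        super().__init__()
        self.parents = parents  # e.g. {0: [], 1: [0], 2: [1]}
        self.modules_ = nn.ModuleList([
            nn.Sequential(nn.Linear(max(len(p), 1), hidden),
                          nn.ReLU(), nn.Linear(hidden, 1))
            for p in parents.values()
        ])

    def log_density_terms(self, x):
        # Gaussian pseudo-log-likelihood: each module predicts its
        # variable's mean from the parents; roots see a constant input.
        preds = []
        for i, pa in self.parents.items():
            inp = x[:, pa] if pa else torch.zeros(x.size(0), 1)
            preds.append(self.modules_[i](inp))
        pred = torch.cat(preds, dim=1)
        return -0.5 * (x - pred) ** 2  # per-variable terms

# Adaptation to an intervention on variable 2: update only module 2,
# leaving the unchanged mechanisms frozen.
model = ModularCausalModel({0: [], 1: [0], 2: [1]})
opt = torch.optim.Adam(model.modules_[2].parameters(), lr=1e-2)
x_shifted = torch.randn(64, 3)  # stand-in for post-intervention data
loss = -model.log_density_terms(x_shifted)[:, 2].mean()
loss.backward()
opt.step()
```

Because the unchanged mechanisms stay frozen, only a small fraction of the parameters needs data from the shifted distribution, which is the intuition behind the few-shot advantage the paper measures.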
Related papers
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
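For orientation, this is the estimator family such analyses concern: closed-form ridge regression in an overparameterized regime, with train and test error measured empirically. The dimensions, regularization strength, and noise level below are arbitrary choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 100, 200, 1e-1            # overparameterized: d > n
w_true = rng.normal(size=d) / np.sqrt(d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)

# Ridge estimator: w = (X^T X + lam * I)^{-1} X^T y
w_hat = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

X_test = rng.normal(size=(1000, d))
y_test = X_test @ w_true + 0.1 * rng.normal(size=1000)
print("train MSE:", np.mean((X @ w_hat - y) ** 2))
print("test  MSE:", np.mean((X_test @ w_hat - y_test) ** 2))
```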
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
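The paper's interpolating-regime bound is not reproduced here; as a reference point, a classical member of the PAC-Bayes family (Maurer's form of McAllester's bound, for losses in [0,1]) reads:

```latex
% With probability at least 1 - delta over an i.i.d. sample of size n,
% simultaneously for all posteriors rho over hypotheses h, given a prior pi:
\mathbb{E}_{h \sim \rho}\big[L(h)\big]
  \;\le\;
  \mathbb{E}_{h \sim \rho}\big[\hat{L}_n(h)\big]
  + \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

Bounds of this shape trade the empirical loss against the KL divergence between posterior and prior; the interpolating case studied above corresponds to driving the empirical term to effectively zero.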
- Robust Graph Representation Learning via Predictive Coding [46.22695915912123]
Predictive coding is a message-passing framework initially developed to model information processing in the brain.
In this work, we build models that rely on the message-passing rule of predictive coding.
We show that the proposed models are comparable to standard ones in terms of performance in both inductive and transductive tasks.
arXiv Detail & Related papers (2022-12-09T03:58:22Z)
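As a toy illustration of the message-passing rule in question (a single linear layer with made-up sizes, not the paper's graph model): inference relaxes node activities by descending the prediction error, and learning applies a local update from the settled errors.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 3)) * 0.1   # generative weights: latents -> data
y = rng.normal(size=5)              # observed node activities
x = np.zeros(3)                     # latent node activities

# Inference: relax latents by gradient descent on the prediction error,
# the core message-passing step of predictive coding.
for _ in range(100):
    eps = y - W @ x                 # prediction error at the data layer
    x += 0.1 * (W.T @ eps)          # error message passed back to latents

# Learning: a local Hebbian-style weight update from the settled errors.
eps = y - W @ x
W += 0.01 * np.outer(eps, x)
```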
- Hypothesis Testing using Causal and Causal Variational Generative Models [0.0]
Causal Gen and Causal Variational Gen can utilize nonparametric structural causal knowledge combined with a deep learning functional approximation.
We show how, using a deliberate (non-random) split of training and testing data, these models can generalize better to similar, but out-of-distribution data points.
We validate our methods on a synthetic pendulum dataset, as well as a trauma surgery ground level fall dataset.
arXiv Detail & Related papers (2022-10-20T13:46:15Z)
- When to Update Your Model: Constrained Model-based Reinforcement Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
The bounds we derive reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z)
- Deep Generative Modeling on Limited Data with Regularization by Nontransferable Pre-trained Models [32.52492468276371]
We propose the regularized deep generative model (Reg-DGM) to reduce the variance of generative modeling with limited data.
Reg-DGM uses a pre-trained model to optimize a weighted sum of a certain divergence and the expectation of an energy function.
Empirically, with various pre-trained feature extractors and a data-dependent energy function, Reg-DGM consistently improves the generation performance of strong DGMs with limited data.
arXiv Detail & Related papers (2022-08-30T10:28:50Z)
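A minimal sketch of the objective's shape, with stand-ins for the divergence and the energy function (the names and values below are illustrative, not the paper's API):

```python
import torch

def reg_dgm_objective(nll_real, sample_energies, lam=0.1):
    # Weighted sum: a divergence term (stood in for by the model's
    # negative log-likelihood on real data) plus lam times the
    # expected energy of model samples.
    return nll_real + lam * sample_energies.mean()

# Toy usage with stand-ins; the real method plugs in a DGM's training
# loss and an energy built from a pre-trained feature extractor.
nll_real = torch.tensor(2.3)                 # e.g. -log p_model(x_real)
sample_energies = torch.randn(64).abs()      # e.g. E(f(x_fake)) per sample
loss = reg_dgm_objective(nll_real, sample_energies)
print(loss)
```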
- Unifying Epidemic Models with Mixtures [28.771032745045428]
The COVID-19 pandemic has emphasized the need for a robust understanding of epidemic models.
Here, we introduce a simple mixture-based model which bridges the mechanistic and non-mechanistic approaches.
Although the model is non-mechanistic, we show that it arises as the natural outcome of a process based on a networked SIR framework.
arXiv Detail & Related papers (2022-01-07T19:42:05Z)
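For reference, this is the well-mixed SIR dynamic underlying the networked framework mentioned above, simulated by Euler steps; the rates and horizon are arbitrary choices.

```python
import numpy as np

def sir(beta=0.3, gamma=0.1, days=160, i0=1e-3):
    """Forward-simulate the classic SIR compartmental model with daily
    Euler steps over susceptible/infected/recovered fractions."""
    s, i, r = 1.0 - i0, i0, 0.0
    traj = []
    for _ in range(days):
        new_inf = beta * s * i
        new_rec = gamma * i
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        traj.append((s, i, r))
    return np.array(traj)

curve = sir()
print("peak infected fraction:", curve[:, 1].max())
```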
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting firearms via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Goal-directed Generation of Discrete Structures with Conditional Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short Python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
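A common way to directly optimize such an expected-reward objective over discrete samples is the score-function (REINFORCE) estimator; the sketch below uses a toy two-token "structure" and a made-up reward, not the paper's tasks.

```python
import torch

# Score-function (REINFORCE) estimator for maximizing an expected reward
# over discrete structures sampled from a generator.
logits = torch.zeros(2, 5, requires_grad=True)   # generator parameters
opt = torch.optim.Adam([logits], lr=0.1)

for _ in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    sample = dist.sample()                        # one discrete structure
    reward = (sample == 3).float().mean()         # stand-in property score
    loss = -dist.log_prob(sample).sum() * reward  # pushes up rewarded samples
    opt.zero_grad()
    loss.backward()
    opt.step()

print(torch.softmax(logits, dim=-1))  # mass concentrates on token 3
```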
- Amortized learning of neural causal representations [10.140457813764554]
Causal models can compactly and efficiently encode the data-generating process under all interventions.
These models are often represented as Bayesian networks and learning them scales poorly with the number of variables.
We present a novel algorithm called causal relational networks (CRN) for learning causal models using neural networks.
arXiv Detail & Related papers (2020-08-21T04:35:06Z)
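A heavily simplified sketch of the amortization idea in general, and explicitly not the paper's CRN architecture: a network consumes samples from an environment and emits edge beliefs in one forward pass, so structure inference does not restart per graph.

```python
import torch
import torch.nn as nn

# Generic amortized structure inference: pool per-sample features into a
# summary of the environment, then map it to a belief over each directed
# edge of a d-variable graph. All sizes and data here are toy choices.
d = 4
encoder = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, d * d))

samples = torch.randn(128, d)                    # data from one environment
summary = encoder(samples).mean(dim=0)           # permutation-invariant pooling
edge_probs = torch.sigmoid(summary).view(d, d)   # belief over each edge
print(edge_probs)
```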