Amortized learning of neural causal representations
- URL: http://arxiv.org/abs/2008.09301v1
- Date: Fri, 21 Aug 2020 04:35:06 GMT
- Title: Amortized learning of neural causal representations
- Authors: Nan Rosemary Ke, Jane X. Wang, Jovana Mitrovic, Martin Szummer,
Danilo J. Rezende
- Abstract summary: Causal models can compactly and efficiently encode the data-generating process under all interventions.
These models are often represented as Bayesian networks and learning them scales poorly with the number of variables.
We present a novel algorithm called causal relational networks (CRN) for learning causal models using neural networks.
- Score: 10.140457813764554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Causal models can compactly and efficiently encode the data-generating
process under all interventions and hence may generalize better under changes
in distribution. These models are often represented as Bayesian networks and
learning them scales poorly with the number of variables. Moreover, these
approaches cannot leverage previously learned knowledge to help with learning
new causal models. In order to tackle these challenges, we present a novel
algorithm called causal relational networks (CRN) for learning causal
models using neural networks. The CRN represents causal models using continuous
representations and hence can scale much better with the number of variables.
It also takes in previously learned information to facilitate learning
of new causal models. Finally, we propose a decoding-based metric for evaluating
causal models with continuous representations. We test our method on synthetic
data, achieving high accuracy and quick adaptation to previously unseen causal
models.
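The abstract gives only a high-level picture of how a CRN operates. Below is a minimal PyTorch sketch of that picture, not the authors' implementation: an LSTM encoder consumes interventional samples and produces a continuous representation of the causal model, and a separate decoder reads an adjacency matrix out of that representation so a decoding-based metric can be computed as edge accuracy. All module names, sizes, and the input format are illustrative assumptions.

```python
# Minimal sketch of the CRN idea (assumed architecture, not the paper's code).
import torch
import torch.nn as nn

N_VARS = 5      # number of causal variables (illustrative)
HIDDEN = 128    # size of the continuous causal representation (illustrative)

class CausalEncoder(nn.Module):
    """Encodes a stream of interventional samples into a continuous representation."""
    def __init__(self, n_vars: int, hidden: int):
        super().__init__()
        # Each step sees one sample plus a one-hot mask marking the intervened node.
        self.rnn = nn.LSTM(input_size=2 * n_vars, hidden_size=hidden, batch_first=True)

    def forward(self, samples: torch.Tensor, intervened: torch.Tensor) -> torch.Tensor:
        # samples, intervened: (batch, T, n_vars)
        _, (h, _) = self.rnn(torch.cat([samples, intervened], dim=-1))
        return h[-1]  # (batch, hidden): continuous representation of the causal model

class EdgeDecoder(nn.Module):
    """Reads an adjacency matrix out of the representation (decoding-based metric)."""
    def __init__(self, n_vars: int, hidden: int):
        super().__init__()
        self.n_vars = n_vars
        self.head = nn.Linear(hidden, n_vars * n_vars)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.head(z).view(-1, self.n_vars, self.n_vars)  # edge logits

def edge_accuracy(logits: torch.Tensor, adjacency: torch.Tensor) -> float:
    """Fraction of correctly decoded edges against a ground-truth adjacency."""
    return ((logits > 0).float() == adjacency).float().mean().item()

# Dummy forward pass: 4 episodes of 20 interventional samples each.
encoder, decoder = CausalEncoder(N_VARS, HIDDEN), EdgeDecoder(N_VARS, HIDDEN)
x = torch.randn(4, 20, N_VARS)
mask = torch.zeros(4, 20, N_VARS)
mask[..., 0] = 1.0  # pretend node 0 was intervened on at every step
z = encoder(x, mask)
print(edge_accuracy(decoder(z), torch.zeros(4, N_VARS, N_VARS)))
```

Under this reading, quick adaptation to unseen causal models corresponds to the encoder updating its hidden state as new interventional samples arrive, with no gradient steps on a discrete graph structure.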
Related papers
- Phantom Embeddings: Using Embedding Space for Model Regularization in
Deep Neural Networks [12.293294756969477]
The strength of machine learning models stems from their ability to learn complex function approximations from data.
Complex models tend to memorize the training data, which results in poor generalization performance on test data.
We present a novel approach to regularizing models by leveraging information-rich latent embeddings and their high intra-class correlation.
arXiv Detail & Related papers (2023-04-14T17:15:54Z)
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaptation performance of such modular neural causal models.
Our analysis shows that modular neural causal models outperform other models on both zero-shot and few-shot adaptation in low-data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Dynamically-Scaled Deep Canonical Correlation Analysis [77.34726150561087]
Canonical Correlation Analysis (CCA) extracts features from two views by finding maximally correlated linear projections of them (a classical linear CCA sketch follows this list).
We introduce a novel dynamic scaling method for training an input-dependent canonical correlation model.
arXiv Detail & Related papers (2022-03-23T12:52:49Z)
- Bayesian Active Learning for Discrete Latent Variable Models [19.852463786440122]
Active learning seeks to reduce the amount of data required to fit the parameters of a model.
Latent variable models play a vital role in neuroscience, psychology, and a variety of other engineering and scientific disciplines.
arXiv Detail & Related papers (2022-02-27T19:07:12Z)
- EINNs: Epidemiologically-Informed Neural Networks [75.34199997857341]
We introduce EINNs, a new class of physics-informed neural networks crafted for epidemic forecasting.
We investigate how to leverage both the theoretical flexibility provided by mechanistic models and the data-driven expressivity afforded by AI models.
arXiv Detail & Related papers (2022-02-21T18:59:03Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% fewer parameters than the original network, without loss of accuracy.
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
- Agree to Disagree: When Deep Learning Models With Identical Architectures Produce Distinct Explanations [0.0]
We introduce a measure of explanation consistency which we use to highlight the identified problems on the MIMIC-CXR dataset.
We find that explanations of identical models trained with different setups have low consistency: approximately 33% on average.
We conclude that current trends in model explanation are not sufficient to mitigate the risks of deploying models in real life healthcare applications.
arXiv Detail & Related papers (2021-05-14T12:16:47Z)
- Neural Closure Models for Dynamical Systems [35.000303827255024]
We develop a novel methodology to learn non-Markovian closure parameterizations for low-fidelity models.
New "neural closure models" augment low-fidelity models with neural delay differential equations (nDDEs)
We show that using non-Markovian over Markovian closures improves long-term accuracy and requires smaller networks.
arXiv Detail & Related papers (2020-12-27T05:55:33Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model against a previously proposed model based on an ensemble of simpler neural networks that detect firearms via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
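As referenced from the Dynamically-Scaled Deep CCA entry above, here is a minimal NumPy sketch of classical linear CCA, the objective that work builds on. It is the textbook whitening-plus-SVD formulation, not the paper's dynamically-scaled deep variant; the regularization constant and the toy data are assumptions.

```python
# Classical linear CCA (textbook formulation); eps and the toy data below are
# illustrative assumptions, not the paper's dynamically-scaled deep variant.
import numpy as np

def cca(X: np.ndarray, Y: np.ndarray, k: int = 1, eps: float = 1e-8):
    """Return top-k projection matrices (A, B) and canonical correlations."""
    X = X - X.mean(axis=0)          # center both views
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + eps * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + eps * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)

    def inv_sqrt(S):                # symmetric inverse square root
        w, V = np.linalg.eigh(S)
        return V @ np.diag(w ** -0.5) @ V.T

    Wx, Wy = inv_sqrt(Sxx), inv_sqrt(Syy)
    U, s, Vt = np.linalg.svd(Wx @ Sxy @ Wy)
    A = Wx @ U[:, :k]               # projections for view X
    B = Wy @ Vt.T[:, :k]            # projections for view Y
    return A, B, s[:k]              # s holds the canonical correlations

# Two noisy views sharing one latent signal should be highly correlated:
rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z, rng.normal(size=(500, 2))]) + 0.1 * rng.normal(size=(500, 3))
Y = np.hstack([z, rng.normal(size=(500, 1))]) + 0.1 * rng.normal(size=(500, 2))
A, B, corr = cca(X, Y)
print(round(float(corr[0]), 3))     # close to 1.0
```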