Modular Learning of Deep Causal Generative Models for High-dimensional
Causal Inference
- URL: http://arxiv.org/abs/2401.01426v1
- Date: Tue, 2 Jan 2024 20:31:15 GMT
- Title: Modular Learning of Deep Causal Generative Models for High-dimensional
Causal Inference
- Authors: Md Musfiqur Rahman and Murat Kocaoglu
- Abstract summary: We propose a sequential training algorithm that can train a deep causal generative model and can provably sample from identifiable interventional and counterfactual distributions.
Our algorithm, called Modular-DCM, uses adversarial training to learn the network weights, and to the best of our knowledge, is the first algorithm that can make use of pre-trained models and provably sample from any identifiable causal query.
- Score: 6.52423450125622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pearl's causal hierarchy establishes a clear separation between
observational, interventional, and counterfactual questions. Researchers
proposed sound and complete algorithms to compute identifiable causal queries
at a given level of the hierarchy using the causal structure and data from the
lower levels of the hierarchy. However, most of these algorithms assume that we
can accurately estimate the probability distribution of the data, which is an
impractical assumption for high-dimensional variables such as images. On the
other hand, modern generative deep learning architectures can be trained to
learn how to accurately sample from such high-dimensional distributions.
Especially with the recent rise of foundation models for images, it is
desirable to leverage pre-trained models to answer causal queries with such
high-dimensional data. To address this, we propose a sequential training
algorithm that, given the causal structure and a pre-trained conditional
generative model, can train a deep causal generative model, which utilizes the
pre-trained model and can provably sample from identifiable interventional and
counterfactual distributions. Our algorithm, called Modular-DCM, uses
adversarial training to learn the network weights, and to the best of our
knowledge, is the first algorithm that can make use of pre-trained models and
provably sample from any identifiable causal query in the presence of latent
confounders with high-dimensional data. We demonstrate the utility of our
algorithm using semi-synthetic and real-world datasets containing images as
variables in the causal structure.
Related papers
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP)
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z) - Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
arXiv Detail & Related papers (2024-06-16T17:09:24Z) - Validation Diagnostics for SBI algorithms based on Normalizing Flows [55.41644538483948]
This work proposes easy to interpret validation diagnostics for multi-dimensional conditional (posterior) density estimators based on NF.
It also offers theoretical guarantees based on results of local consistency.
This work should help the design of better specified models or drive the development of novel SBI-algorithms.
arXiv Detail & Related papers (2022-11-17T15:48:06Z) - Learning Single-Index Models with Shallow Neural Networks [43.6480804626033]
We introduce a natural class of shallow neural networks and study its ability to learn single-index models via gradient flow.
We show that the corresponding optimization landscape is benign, which in turn leads to generalization guarantees that match the near-optimal sample complexity of dedicated semi-parametric methods.
arXiv Detail & Related papers (2022-10-27T17:52:58Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Modeling Item Response Theory with Stochastic Variational Inference [8.369065078321215]
We introduce a variational Bayesian inference algorithm for Item Response Theory (IRT)
Applying this method to five large-scale item response datasets yields higher log likelihoods and higher accuracy in imputing missing data.
The algorithm implementation is open-source, and easily usable.
arXiv Detail & Related papers (2021-08-26T05:00:27Z) - Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z) - Testing for Typicality with Respect to an Ensemble of Learned
Distributions [5.850572971372637]
One-sample approaches to the goodness-of-fit problem offer significant computational advantages for online testing.
The ability to correctly reject anomalous data in this setting hinges on the accuracy of the model of the base distribution.
Existing methods for the one-sample goodness-of-fit problem do not account for the fact that a model of the base distribution is learned.
We propose training an ensemble of density models, considering data to be anomalous if the data is anomalous with respect to any member of the ensemble.
arXiv Detail & Related papers (2020-11-11T19:47:46Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z) - Learning Generative Models using Denoising Density Estimators [29.068491722778827]
We introduce a new generative model based on denoising density estimators (DDEs)
Our main contribution is a novel technique to obtain generative models by minimizing the KL-divergence directly.
Experimental results demonstrate substantial improvement in density estimation and competitive performance in generative model training.
arXiv Detail & Related papers (2020-01-08T20:30:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.