Causal Discovery from Incomplete Data using An Encoder and Reinforcement
Learning
- URL: http://arxiv.org/abs/2006.05554v1
- Date: Tue, 9 Jun 2020 23:33:47 GMT
- Title: Causal Discovery from Incomplete Data using An Encoder and Reinforcement
Learning
- Authors: Xiaoshui Huang, Fujin Zhu, Lois Holloway, Ali Haidar
- Abstract summary: We propose an approach to discover causal structures from incomplete data by using a novel encoder and reinforcement learning (RL).
The encoder is designed for missing data imputation as well as feature extraction.
Our method takes the incomplete observational data as input and generates a causal structure graph.
- Score: 2.4469484645516837
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Discovering causal structure among a set of variables is a fundamental
problem in many domains. However, state-of-the-art methods seldom consider the
possibility that the observational data has missing values (incomplete data),
which is ubiquitous in many real-world situations. Missing values can
significantly impair performance and even cause causal discovery
algorithms to fail. In this paper, we propose an approach to discover causal
structures from incomplete data by using a novel encoder and reinforcement
learning (RL). The encoder is designed for missing data imputation as well as
feature extraction. In particular, it learns to encode the currently available
information (with missing values) into a robust feature representation which is
then used to determine where to search for the best graph. The encoder is
integrated into an RL framework that can be optimized using the actor-critic
algorithm. Our method takes the incomplete observational data as input and
generates a causal structure graph. Experimental results on synthetic and real
data demonstrate that our method can robustly generate causal structures from
incomplete data. Compared with the direct combination of data imputation and
causal discovery methods, our method generally performs better and can even
obtain a performance gain of as much as 43.2%.
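As a rough illustration of the encoder's dual role described above (imputing missing entries and extracting a feature representation), the toy sketch below uses column-mean imputation and simple summary-statistic features. This is only an invented stand-in for the paper's learned encoder; the function name `encode_incomplete` is not from the paper.

```python
import numpy as np

def encode_incomplete(X):
    """Toy stand-in for an imputation-plus-feature-extraction encoder.

    Fills NaN entries with their column means, then returns the imputed
    matrix together with a per-variable feature vector (mean and std).
    A learned encoder would replace both steps.
    """
    X = np.asarray(X, dtype=float)
    imputed = X.copy()
    col_means = np.nanmean(X, axis=0)          # per-column mean over observed entries
    nan_mask = np.isnan(X)
    # Fill each missing cell with its column's observed mean.
    imputed[nan_mask] = np.take(col_means, np.where(nan_mask)[1])
    # One (mean, std) feature pair per variable.
    features = np.vstack([imputed.mean(axis=0), imputed.std(axis=0)]).T
    return imputed, features
```

In the paper, the resulting features would then drive a graph-search policy trained with actor-critic; here they are just returned for inspection.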
Related papers
- Optimal Transport for Structure Learning Under Missing Data [31.240965564055138]
We propose a score-based algorithm for learning causal structures from missing data based on optimal transport.
Our framework is shown to recover the true causal structure more effectively than competing methods in most simulations and real-data settings.
arXiv Detail & Related papers (2024-02-23T10:49:04Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the unidentifiability region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z)
- MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms [82.90843777097606]
We propose a causally-aware imputation algorithm (MIRACLE) for missing data.
MIRACLE iteratively refines a baseline imputation by simultaneously modeling the missingness-generating mechanism.
We conduct extensive experiments on synthetic data and a variety of publicly available datasets to show that MIRACLE consistently improves imputation.
arXiv Detail & Related papers (2021-11-04T22:38:18Z)
- Greedy structure learning from data that contains systematic missing values [13.088541054366527]
Learning from data that contain missing values represents a common phenomenon in many domains.
Relatively few Bayesian Network structure learning algorithms account for missing data.
This paper describes three variants of greedy search structure learning that utilise pairwise deletion and inverse probability weighting.
arXiv Detail & Related papers (2021-07-09T02:56:44Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Learning Causal Models Online [103.87959747047158]
Predictive models can rely on spurious correlations in the data for making predictions.
One solution for achieving strong generalization is to incorporate causal structures in the models.
We propose an online algorithm that continually detects and removes spurious features.
arXiv Detail & Related papers (2020-06-12T20:49:20Z)
- Establishing strong imputation performance of a denoising autoencoder in a wide range of missing data problems [0.0]
We develop a consistent framework for both training and imputation.
We benchmarked the results against state-of-the-art imputation methods.
The developed autoencoder obtained the smallest error for all ranges of initial data corruption.
arXiv Detail & Related papers (2020-04-06T12:00:30Z)
- Auto-Encoding Twin-Bottleneck Hashing [141.5378966676885]
This paper proposes an efficient and adaptive code-driven graph.
It is updated by decoding in the context of an auto-encoder.
Experiments on benchmarked datasets clearly show the superiority of our framework over the state-of-the-art hashing methods.
arXiv Detail & Related papers (2020-02-27T05:58:12Z)
- Multiple Imputation with Denoising Autoencoder using Metamorphic Truth and Imputation Feedback [0.0]
We propose a Multiple Imputation model using Denoising Autoencoders to learn the internal representation of data.
We use the novel mechanisms of Metamorphic Truth and Imputation Feedback to maintain statistical integrity of attributes.
Our approach explores the effects of imputation on various missingness mechanisms and patterns of missing data, outperforming other methods in many standard test cases.
arXiv Detail & Related papers (2020-02-19T18:26:59Z)
- Causal Discovery from Incomplete Data: A Deep Learning Approach [21.289342482087267]
Imputated Causal Learning (ICL) is proposed to perform iterative missing-data imputation and causal structure discovery.
We show that ICL can outperform state-of-the-art methods under different missing data mechanisms.
arXiv Detail & Related papers (2020-01-15T14:28:21Z)
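Several of the papers above (ICL, MIRACLE) alternate between imputing missing values and fitting a model of the variables. The sketch below is a toy version of that alternation, using least-squares regression of each column on all other columns as a stand-in for a learned causal model; the function `iterative_impute` is illustrative only, not any paper's actual algorithm.

```python
import numpy as np

def iterative_impute(X, n_iter=5):
    """Toy alternation between imputation and model fitting.

    Starts from column-mean imputation, then repeatedly re-fills each
    missing entry by least-squares regression on the other columns
    (a crude proxy for refitting a causal model at each round).
    """
    X = np.asarray(X, dtype=float)
    mask = np.isnan(X)
    filled = X.copy()
    col_means = np.nanmean(X, axis=0)
    filled[mask] = np.take(col_means, np.where(mask)[1])
    n, d = X.shape
    for _ in range(n_iter):
        for j in range(d):
            miss = mask[:, j]
            if not miss.any():
                continue
            others = np.delete(filled, j, axis=1)
            # Fit column j on the other columns using observed rows only.
            A = np.hstack([others[~miss], np.ones(((~miss).sum(), 1))])
            beta, *_ = np.linalg.lstsq(A, filled[~miss, j], rcond=None)
            # Re-predict the missing rows with the fitted coefficients.
            Am = np.hstack([others[miss], np.ones((miss.sum(), 1))])
            filled[miss, j] = Am @ beta
    return filled
```

When a missing column is (approximately) a linear function of the others, the regression step recovers it far better than the initial mean fill, which is the intuition behind coupling imputation with structure learning.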
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.