MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms
- URL: http://arxiv.org/abs/2111.03187v1
- Date: Thu, 4 Nov 2021 22:38:18 GMT
- Title: MIRACLE: Causally-Aware Imputation via Learning Missing Data Mechanisms
- Authors: Trent Kyono, Yao Zhang, Alexis Bellot, Mihaela van der Schaar
- Abstract summary: We propose a causally-aware imputation algorithm (MIRACLE) for missing data.
MIRACLE iteratively refines the imputation of a baseline by simultaneously modeling the missingness generating mechanism.
We conduct extensive experiments on synthetic and a variety of publicly available datasets to show that MIRACLE is able to consistently improve imputation.
- Score: 82.90843777097606
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Missing data is an important problem in machine learning practice. Starting
from the premise that imputation methods should preserve the causal structure
of the data, we develop a regularization scheme that encourages any baseline
imputation method to be causally consistent with the underlying data generating
mechanism. Our proposal is a causally-aware imputation algorithm (MIRACLE).
MIRACLE iteratively refines the imputation of a baseline by simultaneously
modeling the missingness generating mechanism, encouraging imputation to be
consistent with the causal structure of the data. We conduct extensive
experiments on synthetic and a variety of publicly available datasets to show
that MIRACLE is able to consistently improve imputation over a variety of
benchmark methods across all three missingness scenarios: at random, completely
at random, and not at random.
Related papers
- Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z) - Machine Learning Based Missing Values Imputation in Categorical Datasets [2.5611256859404983]
This research looked into the use of machine learning algorithms to fill in the gaps in categorical datasets.
The emphasis was on ensemble models constructed using the Error Correction Output Codes framework.
Deep learning for missing data imputation has obstacles despite these encouraging results, including the requirement for large amounts of labeled data.
arXiv Detail & Related papers (2023-06-10T03:29:48Z) - Learning to Bound Counterfactual Inference in Structural Causal Models
from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z) - RandomSCM: interpretable ensembles of sparse classifiers tailored for
omics data [59.4141628321618]
We propose an ensemble learning algorithm based on conjunctions or disjunctions of decision rules.
The interpretability of the models makes them useful for biomarker discovery and patterns discovery in high dimensional data.
arXiv Detail & Related papers (2022-08-11T13:55:04Z) - MissDAG: Causal Discovery in the Presence of Missing Data with
Continuous Additive Noise Models [78.72682320019737]
We develop a general method, which we call MissDAG, to perform causal discovery from data with incomplete observations.
MissDAG maximizes the expected likelihood of the visible part of observations under the expectation-maximization framework.
We demonstrate the flexibility of MissDAG for incorporating various causal discovery algorithms and its efficacy through extensive simulations and real data experiments.
arXiv Detail & Related papers (2022-05-27T09:59:46Z) - Multiple Imputation via Generative Adversarial Network for
High-dimensional Blockwise Missing Value Problems [6.123324869194195]
We propose Multiple Imputation via Generative Adversarial Network (MI-GAN), a deep learning-based (in specific, a GAN-based) multiple imputation method.
MI-GAN shows strong performance matching existing state-of-the-art imputation methods on high-dimensional datasets.
In particular, MI-GAN significantly outperforms other imputation methods in the sense of statistical inference and computational speed.
arXiv Detail & Related papers (2021-12-21T20:19:37Z) - Scalable Intervention Target Estimation in Linear Models [52.60799340056917]
Current approaches to causal structure learning either work with known intervention targets or use hypothesis testing to discover the unknown intervention targets.
This paper proposes a scalable and efficient algorithm that consistently identifies all intervention targets.
The proposed algorithm can be used to also update a given observational Markov equivalence class into the interventional Markov equivalence class.
arXiv Detail & Related papers (2021-11-15T03:16:56Z) - Multiple Imputation with Denoising Autoencoder using Metamorphic Truth
and Imputation Feedback [0.0]
We propose a Multiple Imputation model using Denoising Autoencoders to learn the internal representation of data.
We use the novel mechanisms of Metamorphic Truth and Imputation Feedback to maintain statistical integrity of attributes.
Our approach explores the effects of imputation on various missingness mechanisms and patterns of missing data, outperforming other methods in many standard test cases.
arXiv Detail & Related papers (2020-02-19T18:26:59Z) - Causal Discovery from Incomplete Data: A Deep Learning Approach [21.289342482087267]
Imputated Causal Learning is proposed to perform iterative missing data imputation and causal structure discovery.
We show that ICL can outperform state-of-the-art methods under different missing data mechanisms.
arXiv Detail & Related papers (2020-01-15T14:28:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.