Related papers: Causal Explainability of Machine Learning in Heart Failure Prediction from Electronic Health Records

URL: http://arxiv.org/abs/2506.03068v1
Date: Tue, 03 Jun 2025 16:46:13 GMT
Title: Causal Explainability of Machine Learning in Heart Failure Prediction from Electronic Health Records
Authors: Yina Hou, Shourav B. Rabbani, Liang Hong, Norou Diawara, Manar D. Samad
Abstract summary: The importance of clinical variables in the prognosis of the disease is explained using statistical correlation or machine learning (ML). This paper uses clinical variables from a heart failure (HF) patient cohort to investigate the causal explainability of important variables obtained in statistical and ML contexts.
Score: 1.1068280788997429
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: The importance of clinical variables in the prognosis of the disease is explained using statistical correlation or machine learning (ML). However, the predictive importance of these variables may not represent their causal relationships with diseases. This paper uses clinical variables from a heart failure (HF) patient cohort to investigate the causal explainability of important variables obtained in statistical and ML contexts. Due to inherent regression modeling, popular causal discovery methods strictly assume that the cause and effect variables are numerical and continuous. This paper proposes a new computational framework to enable causal structure discovery (CSD) and score the causal strength of mixed-type (categorical, numerical, binary) clinical variables for binary disease outcomes. In HF classification, we investigate the association between the importance rank order of three feature types: correlated features, features important for ML predictions, and causal features. Our results demonstrate that CSD modeling for nonlinear causal relationships is more meaningful than its linear counterparts. Feature importance obtained from nonlinear classifiers (e.g., gradient-boosting trees) strongly correlates with the causal strength of variables without differentiating cause and effect variables. Correlated variables can be causal for HF, but they are rarely identified as effect variables. These results can be used to add the causal explanation of variables important for ML-based prediction modeling.
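The rank-order comparison described in the abstract can be illustrated with a small sketch. The abstract does not say which rank statistic the authors use; Spearman's rank correlation is assumed here. The function names and the three score lists are hypothetical toy values made up for illustration, not data or results from the paper.

```python
from statistics import mean


def average_ranks(scores):
    """Rank scores in descending order (1 = most important).

    Tied scores share the average of the rank positions they occupy.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next score ties with the current one.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman(a, b):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = mean(ra), mean(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical importance scores for five placeholder features (toy values only).
abs_correlation = [0.42, 0.31, 0.12, 0.27, 0.05]  # |correlation| with the outcome
ml_importance = [0.35, 0.10, 0.30, 0.20, 0.05]    # e.g. a tree-based importance
causal_strength = [0.50, 0.08, 0.21, 0.33, 0.02]  # e.g. a causal-strength score

print("correlation vs ML:", spearman(abs_correlation, ml_importance))
print("ML vs causal:     ", spearman(ml_importance, causal_strength))
print("correlation vs causal:", spearman(abs_correlation, causal_strength))
```

A value near 1 means two scoring methods order the features almost identically, and a value near -1 means the orders are reversed. Agreement in rank alone says nothing about whether a variable is a cause or an effect, which is the distinction the abstract draws.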

Related papers

Correlation vs causation in Alzheimer's disease: an interpretability-driven study [0.0]
This experiment investigates the relationships among clinical, cognitive, genetic, and biomarker features using a combination of correlation analysis, machine learning classification, and model interpretability techniques. We identify key features influencing Alzheimer's disease classification, including cognitive scores and genetic risk factors. Our results highlight that strong correlations do not necessarily imply causation, emphasizing the need for careful interpretation of associative data.
arXiv Detail & Related papers (2025-06-11T21:10:57Z)
Challenges in Variable Importance Ranking Under Correlation [6.718144470265263]
We present a comprehensive simulation study investigating the impact of feature correlation on the assessment of variable importance. While there is never any correlation between knockoff variables and their corresponding predictor variables, we prove that the correlation increases linearly beyond a certain correlation threshold between the predictor variables.
arXiv Detail & Related papers (2024-02-05T19:02:13Z)
Identifiable Latent Polynomial Causal Models Through the Lens of Change [82.14087963690561]
Causal representation learning aims to unveil latent high-level causal representations from observed low-level data. One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability.
arXiv Detail & Related papers (2023-10-24T07:46:10Z)
A Causal Framework for Decomposing Spurious Variations [68.12191782657437]
We develop tools for decomposing spurious variations in Markovian and Semi-Markovian models. We prove the first results that allow a non-parametric decomposition of spurious effects. The described approach has several applications, ranging from explainable and fair AI to questions in epidemiology and medicine.
arXiv Detail & Related papers (2023-06-08T09:40:28Z)
Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables. Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
Identifying Weight-Variant Latent Causal Models [82.14087963690561]
We find that transitivity plays a key role in impeding the identifiability of latent causal representations. Under some mild assumptions, we can show that the latent causal representations can be identified up to trivial permutation and scaling. We propose a novel method, termed Structural caUsAl Variational autoEncoder, which directly learns latent causal representations and causal relationships among them.
arXiv Detail & Related papers (2022-08-30T11:12:59Z)
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning [76.00395335702572]
A central goal for AI and causality is the joint discovery of abstract representations and causal structure. Existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs. In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them.
arXiv Detail & Related papers (2021-07-02T05:44:56Z)
Discovery of Causal Additive Models in the Presence of Unobserved Variables [6.670414650224422]
Causal discovery from data affected by unobserved variables is an important but difficult problem to solve. We propose a method to identify all the causal relationships that are theoretically possible to identify without being biased by unobserved variables.
arXiv Detail & Related papers (2021-06-04T03:28:27Z)
Estimating Causal Effects with the Neural Autoregressive Density Estimator [6.59529078336196]
We use neural autoregressive density estimators to estimate causal effects within Pearl's do-calculus framework. We show that the approach can retrieve causal effects from non-linear systems without explicitly modeling the interactions between the variables.
arXiv Detail & Related papers (2020-08-17T13:12:38Z)
CausalVAE: Structured Causal Disentanglement in Variational Autoencoder [52.139696854386976]
The framework of variational autoencoder (VAE) is commonly used to disentangle independent factors from observations. We propose a new VAE based framework named CausalVAE, which includes a Causal Layer to transform independent factors into causal endogenous ones. Results show that the causal representations learned by CausalVAE are semantically interpretable, and their causal relationship as a Directed Acyclic Graph (DAG) is identified with good accuracy.
arXiv Detail & Related papers (2020-04-18T20:09:34Z)
