Learning interpretable causal networks from very large datasets,
application to 400,000 medical records of breast cancer patients
- URL: http://arxiv.org/abs/2303.06423v1
- Date: Sat, 11 Mar 2023 15:18:19 GMT
- Title: Learning interpretable causal networks from very large datasets,
application to 400,000 medical records of breast cancer patients
- Authors: Marcel da Câmara Ribeiro-Dantas, Honghao Li, Vincent Cabeli, Louise
Dupuis, Franck Simon, Liza Hettal, Anne-Sophie Hamy, and Hervé Isambert
- Abstract summary: We report a more reliable and scalable causal discovery method (iMIIC) based on a general mutual information supremum principle.
We showcase iMIIC on synthetic and real-life healthcare data from 396,179 breast cancer patients from the US Surveillance, Epidemiology, and End Results program.
- Score: 1.2647816797166165
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Discovering causal effects is at the core of scientific investigation but
remains challenging when only observational data is available. In practice,
causal networks are difficult to learn and interpret, and limited to relatively
small datasets. We report a more reliable and scalable causal discovery method
(iMIIC), based on a general mutual information supremum principle, which
greatly improves the precision of inferred causal relations while
distinguishing genuine causes from putative and latent causal effects. We
showcase iMIIC on synthetic and real-life healthcare data from 396,179 breast
cancer patients from the US Surveillance, Epidemiology, and End Results
program. More than 90% of predicted causal effects appear correct, while the
remaining unexpected direct and indirect causal effects can be interpreted in
terms of diagnostic procedures, therapeutic timing, patient preference or
socio-economic disparity. iMIIC's unique capabilities open up new avenues to
discover reliable and interpretable causal networks across a range of research
fields.
Related papers
- Smoke and Mirrors in Causal Downstream Tasks [59.90654397037007]
This paper looks at the causal inference task of treatment effect estimation, where the outcome of interest is recorded in high-dimensional observations.
We compare 6 480 models fine-tuned from state-of-the-art visual backbones, and find that the sampling and modeling choices significantly affect the accuracy of the causal estimate.
Our results suggest that future benchmarks should carefully consider real downstream scientific questions, especially causal ones.
arXiv Detail & Related papers (2024-05-27T13:26:34Z)
- Understanding Breast Cancer Survival: Using Causality and Language Models on Multi-omics Data [23.850817918011863]
We exploit causal discovery algorithms to investigate how perturbations in the genome can affect the survival of patients diagnosed with breast cancer.
Our findings reveal important factors related to the vital status of patients using causal discovery algorithms.
Results are validated through language models trained on biomedical literature.
arXiv Detail & Related papers (2023-05-28T17:07:46Z)
- The Impact of Missing Data on Causal Discovery: A Multicentric Clinical Study [1.173358409934101]
We use data from a multi-centric study on endometrial cancer to analyze the impact of different missingness mechanisms on the recovered causal graph.
We validate the recovered graph with expert physicians, showing that our approach finds clinically-relevant solutions.
arXiv Detail & Related papers (2023-05-17T08:46:30Z)
- Optimizing Data-driven Causal Discovery Using Knowledge-guided Search [3.7489744097107316]
This study introduces a knowledge-guided causal structure search (KGS) approach that utilizes observational data and structural priors as constraints to learn the causal graph.
We extensively evaluate KGS in multiple settings using synthetic and benchmark real-world datasets, as well as in a real-life healthcare application related to oxygen therapy treatment.
arXiv Detail & Related papers (2023-04-11T20:56:33Z)
- DOMINO: Visual Causal Reasoning with Time-Dependent Phenomena [59.291745595756346]
We propose a set of visual analytics methods that allow humans to participate in the discovery of causal relations associated with windows of time delay.
Specifically, we leverage a well-established method, logic-based causality, to enable analysts to test the significance of potential causes.
Since an effect can be a cause of other effects, we allow users to aggregate different temporal cause-effect relations found with our method into a visual flow diagram.
arXiv Detail & Related papers (2023-03-12T03:40:21Z)
- Intelligent Sight and Sound: A Chronic Cancer Pain Dataset [74.77784420691937]
This paper introduces the first chronic cancer pain dataset, collected as part of the Intelligent Sight and Sound (ISS) clinical trial.
The data collected to date consists of 29 patients, 509 smartphone videos, 189,999 frames, and self-reported affective and activity pain scores.
Using static images and multi-modal data to predict self-reported pain levels, early models show significant gaps between current methods available to predict pain.
arXiv Detail & Related papers (2022-04-07T22:14:37Z)
- Why Interpretable Causal Inference is Important for High-Stakes Decision Making for Critically Ill Patients and How To Do It [80.24494623756839]
We present a framework for interpretable estimation of causal effects for critically ill patients.
We apply this framework to the effect of seizures and other potentially harmful electrical events in the brain on outcomes.
arXiv Detail & Related papers (2022-03-09T18:03:35Z)
- An introduction to causal reasoning in health analytics [2.199093822766999]
We will try to highlight some of the drawbacks that may arise in traditional machine learning and statistical approaches to analyze the observational data.
We will demonstrate the applications of causal inference in tackling some common machine learning issues.
arXiv Detail & Related papers (2021-05-10T20:25:56Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- HINT: Hierarchical Interaction Network for Trial Outcome Prediction Leveraging Web Data [56.53715632642495]
Clinical trials face uncertain outcomes due to issues with efficacy, safety, or problems with patient recruitment.
In this paper, we propose Hierarchical INteraction Network (HINT) for more general, clinical trial outcome predictions.
arXiv Detail & Related papers (2021-02-08T15:09:07Z)
- Estimation of Causal Effects in the Presence of Unobserved Confounding in the Alzheimer's Continuum [3.2489082010225494]
We derive a causal graph from the current clinical knowledge on cause and effect in the Alzheimer's disease continuum.
We show that identifiability of the causal effect requires all confounders to be known and measured.
In our theoretical analysis, we prove that using the substitute confounder enables identifiability of the causal effect of neuroanatomy on cognition.
arXiv Detail & Related papers (2020-06-23T16:29:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.