Optimizing Data-driven Causal Discovery Using Knowledge-guided Search
- URL: http://arxiv.org/abs/2304.05493v2
- Date: Mon, 8 Jul 2024 14:54:48 GMT
- Title: Optimizing Data-driven Causal Discovery Using Knowledge-guided Search
- Authors: Uzma Hasan, Md Osman Gani,
- Abstract summary: This study introduces a knowledge-guided causal structure search (KGS) approach that utilizes observational data and structural priors as constraints to learn the causal graph.
We extensively evaluate KGS in multiple settings using synthetic and benchmark real-world datasets, as well as in a real-life healthcare application related to oxygen therapy treatment.
- Score: 3.7489744097107316
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning causal relationships solely from observational data often fails to reveal the underlying causal mechanisms due to the vast search space of possible causal graphs, which can grow exponentially, especially for greedy algorithms using score-based approaches. Leveraging prior causal information, such as the presence or absence of causal edges, can help restrict and guide the score-based discovery process, leading to a more accurate search. In the healthcare domain, prior knowledge is abundant from sources like medical journals, electronic health records (EHRs), and clinical intervention outcomes. This study introduces a knowledge-guided causal structure search (KGS) approach that utilizes observational data and structural priors (such as causal edges) as constraints to learn the causal graph. KGS leverages prior edge information between variables, including the presence of a directed edge, the absence of an edge, and the presence of an undirected edge. We extensively evaluate KGS in multiple settings using synthetic and benchmark real-world datasets, as well as in a real-life healthcare application related to oxygen therapy treatment. To obtain causal priors, we use GPT-4 to retrieve relevant literature information. Our results show that structural priors of any type and amount enhance the search process, improving performance and optimizing causal discovery. This guided strategy ensures that the discovered edges align with established causal knowledge, enhancing the trustworthiness of findings while expediting the search process. It also enables a more focused exploration of causal mechanisms, potentially leading to more effective and personalized healthcare solutions.
Related papers
- CAnDOIT: Causal Discovery with Observational and Interventional Data from Time-Series [4.008958683836471]
CAnDOIT is a causal discovery method to reconstruct causal models using both observational and interventional data.
The use of interventional data in the causal analysis is crucial for real-world applications, such as robotics.
A Python implementation of CAnDOIT has also been developed and is publicly available on GitHub.
arXiv Detail & Related papers (2024-10-03T13:57:08Z) - Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization.
We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data.
We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z) - CORE: Towards Scalable and Efficient Causal Discovery with Reinforcement
Learning [2.7446241148152253]
CORE is a reinforcement learning-based approach for causal discovery and intervention planning.
Our results demonstrate that CORE generalizes to unseen graphs and efficiently uncovers causal structures.
CORE scales to larger graphs with up to 10 variables and outperforms existing approaches in structure estimation accuracy and sample efficiency.
arXiv Detail & Related papers (2024-01-30T12:57:52Z) - Applying Large Language Models for Causal Structure Learning in Non
Small Cell Lung Cancer [8.248361703850774]
Causal discovery is becoming a key part in medical AI research.
In this paper, we investigate applying Large Language Models to the problem of determining the directionality of edges in causal discovery.
Our result shows that LLMs can accurately predict the directionality of edges in causal graphs, outperforming existing state-of-the-art methods.
arXiv Detail & Related papers (2023-11-13T09:31:14Z) - Discovering Dynamic Causal Space for DAG Structure Learning [64.763763417533]
We propose a dynamic causal space for DAG structure learning, coined CASPER.
It integrates the graph structure into the score function as a new measure in the causal space to faithfully reflect the causal distance between estimated and ground truth DAG.
arXiv Detail & Related papers (2023-06-05T12:20:40Z) - Deep Reinforcement Learning Framework for Thoracic Diseases
Classification via Prior Knowledge Guidance [49.87607548975686]
The scarcity of labeled data for related diseases poses a huge challenge to an accurate diagnosis.
We propose a novel deep reinforcement learning framework, which introduces prior knowledge to direct the learning of diagnostic agents.
Our approach's performance was demonstrated using the well-known NIHX-ray 14 and CheXpert datasets.
arXiv Detail & Related papers (2023-06-02T01:46:31Z) - The Impact of Missing Data on Causal Discovery: A Multicentric Clinical
Study [1.173358409934101]
We use data from a multi-centric study on endometrial cancer to analyze the impact of different missingness mechanisms on the recovered causal graph.
We validate the recovered graph with expert physicians, showing that our approach finds clinically-relevant solutions.
arXiv Detail & Related papers (2023-05-17T08:46:30Z) - Learning domain-specific causal discovery from time series [7.298647409503783]
Causal discovery from time-varying data is important in neuroscience, medicine, and machine learning.
Human expertise is often not entirely accurate and tends to be outperformed in domains with abundant data.
In this study, we examine whether we can enhance domain-specific causal discovery for time series using a data-driven approach.
arXiv Detail & Related papers (2022-09-12T20:32:39Z) - Effect Identification in Cluster Causal Diagrams [51.42809552422494]
We introduce a new type of graphical model called cluster causal diagrams (for short, C-DAGs)
C-DAGs allow for the partial specification of relationships among variables based on limited prior knowledge.
We develop the foundations and machinery for valid causal inferences over C-DAGs.
arXiv Detail & Related papers (2022-02-22T21:27:31Z) - Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z) - Exploring and Distilling Posterior and Prior Knowledge for Radiology
Report Generation [55.00308939833555]
The PPKED includes three modules: Posterior Knowledge Explorer (PoKE), Prior Knowledge Explorer (PrKE) and Multi-domain Knowledge Distiller (MKD)
PoKE explores the posterior knowledge, which provides explicit abnormal visual regions to alleviate visual data bias.
PrKE explores the prior knowledge from the prior medical knowledge graph (medical knowledge) and prior radiology reports (working experience) to alleviate textual data bias.
arXiv Detail & Related papers (2021-06-13T11:10:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.