KG-TREAT: Pre-training for Treatment Effect Estimation by Synergizing
Patient Data with Knowledge Graphs
- URL: http://arxiv.org/abs/2403.03791v1
- Date: Wed, 6 Mar 2024 15:37:22 GMT
- Title: KG-TREAT: Pre-training for Treatment Effect Estimation by Synergizing
Patient Data with Knowledge Graphs
- Authors: Ruoqi Liu, Lingfei Wu, Ping Zhang
- Abstract summary: KG-TREAT synergizes large-scale observational patient data with biomedical knowledge graphs to enhance treatment effect estimation.
KG-TREAT incorporates two pre-training tasks to ensure a thorough grounding and contextualization of patient data and KGs.
Evaluation on four downstream TEE tasks shows KG-TREAT's superiority over existing methods, with an average improvement of 7% in Area under the ROC Curve (AUC) and 9% in Influence Function-based Precision of Estimating Heterogeneous Effects (IF-PEHE).
- Score: 46.24838619931438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Treatment effect estimation (TEE) is the task of determining the impact of
various treatments on patient outcomes. Current TEE methods fall short due to
reliance on limited labeled data and challenges posed by sparse and
high-dimensional observational patient data. To address the challenges, we
introduce a novel pre-training and fine-tuning framework, KG-TREAT, which
synergizes large-scale observational patient data with biomedical knowledge
graphs (KGs) to enhance TEE. Unlike previous approaches, KG-TREAT constructs
dual-focus KGs and integrates a deep bi-level attention synergy method for
in-depth information fusion, enabling distinct encoding of treatment-covariate
and outcome-covariate relationships. KG-TREAT also incorporates two
pre-training tasks to ensure a thorough grounding and contextualization of
patient data and KGs. Evaluation on four downstream TEE tasks shows KG-TREAT's
superiority over existing methods, with an average improvement of 7% in Area
under the ROC Curve (AUC) and 9% in Influence Function-based Precision of
Estimating Heterogeneous Effects (IF-PEHE). The effectiveness of our estimated
treatment effects is further affirmed by alignment with established randomized
clinical trial findings.
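The two metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. The function names `roc_auc` and `pehe` are hypothetical, and the `pehe` shown here is the plain squared-error form, which needs known true effects (normally available only for synthetic data). It does not implement the influence function-based variant (IF-PEHE) reported in the paper, and some papers report the square root of this quantity instead.

```python
import numpy as np


def roc_auc(y_true, scores):
    """AUC as the probability that a positive example outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    # Compare every positive score with every negative score.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (len(pos) * len(neg)))


def pehe(tau_true, tau_pred):
    """Mean squared error between true and estimated individual effects."""
    tau_true = np.asarray(tau_true, dtype=float)
    tau_pred = np.asarray(tau_pred, dtype=float)
    return float(np.mean((tau_true - tau_pred) ** 2))


if __name__ == "__main__":
    # Outcome prediction quality: higher AUC is better.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    # Treatment effect estimation quality: lower PEHE is better.
    print(pehe([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

AUC scores the factual outcome predictions, while PEHE-style metrics score the estimated effect itself, which is why the abstract reports both.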
Related papers
- Evaluation of the impact of expert knowledge: How decision support scores impact the effectiveness of automatic knowledge-driven feature engineering (aKDFE) [0.8272083537040182]
Adverse Drug Events (ADEs) pose significant healthcare challenges, impacting patient safety and costs.
This study evaluates automatic Knowledge-Driven Feature Engineering (aKDFE) for improved ADE prediction from Electronic Health Record (EHR) data.
We investigated how incorporating domain-specific ADE risk scores for prolonged heart QT interval affects prediction performance using EHR data and medication handling events.
arXiv Detail & Related papers (2025-04-08T11:34:38Z)
- Individualised Treatment Effects Estimation with Composite Treatments and Composite Outcomes [13.925793826373706]
Estimating individualised treatment effect (ITE) remains a fundamental problem in causal inference.
Previous work in causal machine learning for ITE estimation is limited to simple settings, like single treatments and single outcomes.
We propose a novel and innovative hypernetwork-based approach, called H-Learner, to solve ITE estimation under composite treatments and composite outcomes.
arXiv Detail & Related papers (2025-02-12T10:41:21Z)
- Explainable AI for Classifying UTI Risk Groups Using a Real-World Linked EHR and Pathology Lab Dataset [0.47517735516852333]
We leverage a linked EHR dataset to characterize urinary tract infections (UTIs) in Bristol, North Somerset, and South Gloucestershire, UK.
A comprehensive data pre-processing and curation pipeline transforms the raw EHR data into a structured format suitable for AI modeling.
We build a UTI risk estimation framework informed by clinical expertise to estimate UTI risk across individual patient timelines.
arXiv Detail & Related papers (2024-11-26T18:10:51Z)
- Learning to Denoise Biomedical Knowledge Graph for Robust Molecular Interaction Prediction [50.7901190642594]
We propose BioKDN (Biomedical Knowledge Graph Denoising Network) for robust molecular interaction prediction.
BioKDN refines the reliable structure of local subgraphs by denoising noisy links in a learnable manner.
It maintains consistent and robust semantics by smoothing relations around the target interaction.
arXiv Detail & Related papers (2023-12-09T07:08:00Z)
- LATTE: Label-efficient Incident Phenotyping from Longitudinal Electronic Health Records [11.408950540503112]
We propose a LAbel-efficienT incidenT phEnotyping algorithm to accurately annotate the timing of clinical events from longitudinal EHR data.
LATTE is evaluated on three analyses: the onset of type-2 diabetes, heart failure, and the onset and relapses of multiple sclerosis.
arXiv Detail & Related papers (2023-05-19T03:28:51Z)
- FineEHR: Refine Clinical Note Representations to Improve Mortality Prediction [3.9026461169566673]
Large-scale electronic health records provide machine learning models with an abundance of clinical text and vital sign data.
Despite the emergence of advanced Natural Language Processing (NLP) algorithms for clinical note analysis, the complex textual structure and noise present in raw clinical data have posed significant challenges.
We propose FINEEHR, a system that utilizes two representation learning techniques, namely metric learning and fine-tuning, to refine clinical note embeddings.
arXiv Detail & Related papers (2023-04-24T02:42:52Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Assessment of Treatment Effect Estimators for Heavy-Tailed Data [70.72363097550483]
A central obstacle in the objective assessment of treatment effect (TE) estimators in randomized control trials (RCTs) is the lack of ground truth (or validation set) to test their performance.
We provide a novel cross-validation-like methodology to address this challenge.
We evaluate our methodology across 709 RCTs implemented in the Amazon supply chain.
arXiv Detail & Related papers (2021-12-14T17:53:01Z)
- A causal learning framework for the analysis and interpretation of COVID-19 clinical data [7.256237785391623]
The workflow is a multi-step approach that starts by identifying the main causes of a patient's outcome through BSL.
We evaluate our approach on a feature-rich COVID-19 dataset, showing that the proposed framework provides a schematic overview of the multi-factorial processes that jointly contribute to the outcome.
Our approach yields a highly interpretable tool that correctly predicts the outcome of 85% of subjects based exclusively on 3 features.
arXiv Detail & Related papers (2021-05-14T15:58:18Z)
- HINT: Hierarchical Interaction Network for Trial Outcome Prediction Leveraging Web Data [56.53715632642495]
Clinical trials face uncertain outcomes due to issues with efficacy, safety, or problems with patient recruitment.
In this paper, we propose Hierarchical INteraction Network (HINT) for more general, clinical trial outcome predictions.
arXiv Detail & Related papers (2021-02-08T15:09:07Z)
- Estimating Individual Treatment Effects with Time-Varying Confounders [9.784193264717098]
Estimating individual treatment effect (ITE) from observational data is meaningful and practical in healthcare.
Existing work mainly relies on the strong ignorability assumption that no hidden confounders exist.
We propose Deep Sequential Weighting (DSW) for estimating ITE with time-varying confounders.
arXiv Detail & Related papers (2020-08-27T02:21:56Z)
- Learning Decomposed Representation for Counterfactual Inference [53.36586760485262]
The fundamental problem in treatment effect estimation from observational data is confounder identification and balancing.
Most previous methods achieve confounder balancing by treating all observed pre-treatment variables as confounders, without further distinguishing confounders from non-confounders.
We propose a synergistic learning framework to 1) identify confounders by learning representations of both confounders and non-confounders, 2) balance confounders with a sample re-weighting technique, and simultaneously 3) estimate the treatment effect in observational studies via counterfactual inference.
arXiv Detail & Related papers (2020-06-12T09:50:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.