Interpretable Disease Prediction based on Reinforcement Path Reasoning
over Knowledge Graphs
- URL: http://arxiv.org/abs/2010.08300v1
- Date: Fri, 16 Oct 2020 10:46:28 GMT
- Authors: Zhoujian Sun, Wei Dong, Jinlong Shi and Zhengxing Huang
- Abstract summary: We formulated the disease prediction task as a random walk along a knowledge graph (KG).
We build a KG to record relationships between diseases and risk factors according to validated medical knowledge.
The trajectory generated by the object represents an interpretable disease progression path of the given patient.
- Score: 15.339137501579087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Objective: To combine medical knowledge and medical data to interpretably
predict the risk of disease. Methods: We formulated the disease prediction task
as a random walk along a knowledge graph (KG). Specifically, we build a KG to
record relationships between diseases and risk factors according to validated
medical knowledge. Then, a mathematical object walks along the KG. It starts
walking at a patient entity, which is connected to the KG based on the patient's current
diseases or risk factors, and stops at a disease entity, which represents the
predicted disease. The trajectory generated by the object represents an
interpretable disease progression path of the given patient. The dynamics of
the object are controlled by a policy-based reinforcement learning (RL) module,
which is trained on electronic health records (EHRs). Experiments: We utilized
two real-world EHR datasets to evaluate the performance of our model. In the
disease prediction task, our model achieves macro areas under the curve (AUC) of
0.743 and 0.639 in predicting 53 circulatory system diseases in the
two datasets, respectively. This performance is comparable to that of commonly used
machine learning (ML) models in medical research. In qualitative analysis, our
clinical collaborator reviewed the disease progression paths generated by our
model and endorsed their interpretability and reliability. Conclusion:
Experimental results validate the proposed model in interpretably evaluating
and optimizing disease prediction. Significance: Our work contributes to
leveraging the potential of medical knowledge and medical data jointly for
interpretable prediction tasks.
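The walk described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy graph, the entity names, the `TARGETS` set, the tabular softmax policy, and the REINFORCE-style update are all assumptions chosen to keep the example small. It only shows the general idea of a policy-controlled walk that starts at a patient entity, stops at a disease entity, and is rewarded when the predicted disease matches a label.

```python
import math
import random

# Toy knowledge graph (hypothetical entities, not taken from the paper).
# Each key is an entity; each value lists the entities reachable from it.
KG = {
    "hypertension": ["atrial_fibrillation", "heart_failure"],
    "diabetes": ["coronary_artery_disease"],
    "atrial_fibrillation": ["stroke"],
    "coronary_artery_disease": ["heart_failure"],
}
# Assumption: the walk stops as soon as it reaches one of these entities.
TARGETS = {"stroke", "heart_failure"}


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def candidates_from(node, patient_factors):
    """The patient entity links to its current risk factors; others use the KG."""
    return patient_factors if node == "patient" else KG.get(node, [])


def walk(patient_factors, logits, rng, max_steps=5):
    """Sample one path from the patient entity using the softmax policy."""
    node, path, steps = "patient", ["patient"], []
    for _ in range(max_steps):
        cands = candidates_from(node, patient_factors)
        if not cands:
            break
        probs = softmax([logits.get((node, c), 0.0) for c in cands])
        idx = rng.choices(range(len(cands)), weights=probs)[0]
        steps.append((node, cands, idx))  # remembered for the policy update
        node = cands[idx]
        path.append(node)
        if node in TARGETS:
            break
    return path, steps


def reinforce_update(logits, steps, reward, lr=0.5):
    """REINFORCE-style update: reward * gradient of log-probability of each choice."""
    for node, cands, idx in steps:
        probs = softmax([logits.get((node, c), 0.0) for c in cands])
        for j, c in enumerate(cands):
            grad = (1.0 if j == idx else 0.0) - probs[j]
            logits[(node, c)] = logits.get((node, c), 0.0) + lr * reward * grad


def greedy_path(patient_factors, logits, max_steps=5):
    """Follow the highest-scoring edge at every step; the result is the explanation path."""
    node, path = "patient", ["patient"]
    for _ in range(max_steps):
        cands = candidates_from(node, patient_factors)
        if not cands:
            break
        node = max(cands, key=lambda c: logits.get((node, c), 0.0))
        path.append(node)
        if node in TARGETS:
            break
    return path


# One hypothetical patient whose recorded outcome is "stroke".
rng = random.Random(0)
logits = {}
factors = ["hypertension", "diabetes"]
for _ in range(200):
    path, steps = walk(factors, logits, rng)
    reward = 1.0 if path[-1] == "stroke" else 0.0
    reinforce_update(logits, steps, reward)

print(" -> ".join(greedy_path(factors, logits)))
```

After training, the greedy path doubles as the prediction (its last entity) and as the explanation (the entities visited on the way). A tabular policy is used here only for brevity; nothing in the abstract specifies how the policy is parameterized.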
Related papers
- Reasoning-Enhanced Healthcare Predictions with Knowledge Graph Community Retrieval [61.70489848327436]
KARE is a novel framework that integrates knowledge graph (KG) community-level retrieval with large language models (LLMs) reasoning.
Extensive experiments demonstrate that KARE outperforms leading models by up to 10.8-15.0% on MIMIC-III and 12.6-12.7% on MIMIC-IV for mortality and readmission predictions.
arXiv Detail & Related papers (2024-10-06T18:46:28Z) - TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data [42.96821770394798]
TACCO is a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data.
We conduct experiments on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction.
In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by TACCO.
arXiv Detail & Related papers (2024-06-14T14:18:38Z) - MedDiffusion: Boosting Health Risk Prediction via Diffusion-based Data
Augmentation [58.93221876843639]
This paper introduces a novel, end-to-end diffusion-based risk prediction model, named MedDiffusion.
It enhances risk prediction performance by creating synthetic patient data during training to enlarge sample space.
It discerns hidden relationships between patient visits using a step-wise attention mechanism, enabling the model to automatically retain the most vital information for generating high-quality data.
arXiv Detail & Related papers (2023-10-04T01:36:30Z) - Label Dependent Attention Model for Disease Risk Prediction Using
Multimodal Electronic Health Records [8.854691034104071]
Disease risk prediction has attracted increasing attention in the field of modern healthcare.
One challenge of applying AI models for risk prediction lies in generating interpretable evidence.
We propose the method of jointly embedding words and labels.
arXiv Detail & Related papers (2022-01-18T07:21:20Z) - Towards Trustworthy Cross-patient Model Development [3.109478324371548]
We study differences in model performance and explainability when trained for all patients and one patient at a time.
The results show that patients' demographics have a large impact on performance and explainability, and thus on trustworthiness.
arXiv Detail & Related papers (2021-12-20T10:51:04Z) - MIMO: Mutual Integration of Patient Journey and Medical Ontology for
Healthcare Representation Learning [49.57261599776167]
We propose an end-to-end robust Transformer-based solution, Mutual Integration of patient journey and Medical Ontology (MIMO), for healthcare representation learning and predictive analytics.
arXiv Detail & Related papers (2021-07-20T07:04:52Z) - Medical Profile Model: Scientific and Practical Applications in
Healthcare [1.718235998156457]
We present the patient histories as temporal sequences of diseases for which embeddings are learned in an unsupervised setup.
The embedding space includes demographic parameters which allow the creation of generalized patient profiles.
The training of such a medical profile model has been performed on a dataset of more than one million patients.
arXiv Detail & Related papers (2021-06-21T13:30:43Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised
Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced
Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, and outperforms the best of various state-of-the-art baselines by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z) - Individualized Prediction of COVID-19 Adverse outcomes with MLHO [9.197411456718708]
We developed an end-to-end Machine Learning framework that leverages iterative feature and algorithm selection to predict Health outcomes.
We modeled the four adverse outcomes utilizing about 600 features representing patients' pre-COVID health records and demographics.
Our results demonstrated that while demographic variables are important predictors of adverse outcomes after a COVID-19 infection, the incorporation of past clinical records is vital for a reliable prediction model.
arXiv Detail & Related papers (2020-08-10T02:44:52Z) - Trajectories, bifurcations and pseudotime in large clinical datasets:
applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)