Phenotype Detection in Real World Data via Online MixEHR Algorithm
- URL: http://arxiv.org/abs/2211.07549v2
- Date: Tue, 15 Nov 2022 14:19:28 GMT
- Title: Phenotype Detection in Real World Data via Online MixEHR Algorithm
- Authors: Ying Xu, Romane Gauriau, Anna Decker, Jacob Oppenheim
- Abstract summary: We extended an unsupervised phenotyping algorithm, mixEHR, to an online version, allowing us to use it on datasets an order of magnitude larger.
In addition to recapitulating previously observed disease groups, we discovered clinically meaningful disease subtypes and comorbidities.
This work scaled up an effective unsupervised learning method, reinforced existing clinical knowledge, and is a promising approach for efficient collaboration with clinicians.
- Score: 9.385112439570412
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Understanding patterns of diagnoses, medications, procedures, and laboratory
tests from electronic health records (EHRs) and health insurer claims is
important for understanding disease risk and for efficient clinical
development, which often require rules-based curation in collaboration with
clinicians. We extended an unsupervised phenotyping algorithm, mixEHR, to an
online version, allowing us to use it on datasets an order of magnitude larger,
including a large, US-based claims dataset and a rich regional EHR dataset. In
addition to recapitulating previously observed disease groups, we discovered
clinically meaningful disease subtypes and comorbidities. This work scaled up
an effective unsupervised learning method, reinforced existing clinical
knowledge, and is a promising approach for efficient collaboration with
clinicians.
Related papers
- TrialBench: Multi-Modal Artificial Intelligence-Ready Clinical Trial Datasets [57.067409211231244]
This paper presents meticulously curated AI-ready datasets covering multi-modal data (e.g., drug molecule, disease code, text, categorical/numerical features) and 8 crucial prediction challenges in clinical trial design.
We provide basic validation methods for each task to ensure the datasets' usability and reliability.
We anticipate that the availability of such open-access datasets will catalyze the development of advanced AI approaches for clinical trial design.
arXiv Detail & Related papers (2024-06-30T09:13:10Z)
- medIKAL: Integrating Knowledge Graphs as Assistants of LLMs for Enhanced Clinical Diagnosis on EMRs [13.806201934732321]
medIKAL combines Large Language Models (LLMs) with knowledge graphs (KGs) to enhance diagnostic capabilities.
medIKAL assigns weighted importance to entities in medical records based on their type, enabling precise localization of candidate diseases within KGs.
We validated medIKAL's effectiveness through extensive experiments on a newly introduced open-sourced Chinese EMR dataset.
arXiv Detail & Related papers (2024-06-20T13:56:52Z)
- TACCO: Task-guided Co-clustering of Clinical Concepts and Patient Visits for Disease Subtyping based on EHR Data [42.96821770394798]
TACCO is a novel framework that jointly discovers clusters of clinical concepts and patient visits based on a hypergraph modeling of EHR data.
We conduct experiments on the public MIMIC-III dataset and Emory internal CRADLE dataset over the downstream clinical tasks of phenotype classification and cardiovascular risk prediction.
In-depth model analysis, clustering results analysis, and clinical case studies further validate the improved utilities and insightful interpretations delivered by TACCO.
arXiv Detail & Related papers (2024-06-14T14:18:38Z)
- REALM: RAG-Driven Enhancement of Multimodal Electronic Health Records Analysis via Large Language Models [19.62552013839689]
Existing models often lack the medical context relevant to clinical tasks, prompting the incorporation of external knowledge.
We propose REALM, a Retrieval-Augmented Generation (RAG) driven framework to enhance multimodal EHR representations.
Our experiments on MIMIC-III mortality and readmission tasks showcase the superior performance of our REALM framework over baselines.
arXiv Detail & Related papers (2024-02-10T18:27:28Z)
- Enabling Collaborative Clinical Diagnosis of Infectious Keratitis by Integrating Expert Knowledge and Interpretable Data-driven Intelligence [28.144658552047975]
This study investigates the performance, interpretability, and clinical utility of a knowledge-guided diagnosis model (KGDM) in the diagnosis of infectious keratitis (IK).
The diagnostic odds ratios (DOR) of the interpreted AI-based biomarkers are effective, ranging from 3.011 to 35.233.
Participants working in collaboration with the AI achieved performance exceeding that of both humans and AI alone.
arXiv Detail & Related papers (2024-01-14T02:10:54Z)
- Polar-Net: A Clinical-Friendly Model for Alzheimer's Disease Detection in OCTA Images [53.235117594102675]
Optical Coherence Tomography Angiography is a promising tool for detecting Alzheimer's disease (AD) by imaging the retinal microvasculature.
We propose a novel deep-learning framework called Polar-Net to provide interpretable results and leverage clinical prior knowledge.
We show that Polar-Net outperforms existing state-of-the-art methods and provides more valuable pathological evidence for the association between retinal vascular changes and AD.
arXiv Detail & Related papers (2023-11-10T11:49:49Z)
- Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching [49.78442796596806]
We propose an innovative privacy-aware data augmentation approach for patient-trial matching (LLM-PTM).
Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and the generalizability to new data is improved by 12.12%.
arXiv Detail & Related papers (2023-03-24T03:14:00Z)
- sEHR-CE: Language modelling of structured EHR data for efficient and generalizable patient cohort expansion [0.0]
sEHR-CE is a novel framework based on transformers to enable integrated phenotyping and analyses of heterogeneous clinical datasets.
We validate our approach using primary and secondary care data from the UK Biobank, a large-scale research study.
arXiv Detail & Related papers (2022-11-30T16:00:43Z)
- FLOP: Federated Learning on Medical Datasets using Partial Networks [84.54663831520853]
COVID-19 Disease due to the novel coronavirus has caused a shortage of medical resources.
Different data-driven deep learning models have been developed to mitigate the diagnosis of COVID-19.
The data itself is still scarce due to patient privacy concerns.
We propose a simple yet effective algorithm, named Federated Learning on Medical Datasets using Partial Networks (FLOP).
arXiv Detail & Related papers (2021-02-10T01:56:58Z)
- Inheritance-guided Hierarchical Assignment for Clinical Automatic Diagnosis [50.15205065710629]
Clinical diagnosis, which aims to assign diagnosis codes for a patient based on the clinical note, plays an essential role in clinical decision-making.
We propose a novel framework to combine the inheritance-guided hierarchical assignment and co-occurrence graph propagation for clinical automatic diagnosis.
arXiv Detail & Related papers (2021-01-27T13:16:51Z)
- Trajectories, bifurcations and pseudotime in large clinical datasets: applications to myocardial infarction and diabetes data [94.37521840642141]
We suggest a semi-supervised methodology for the analysis of large clinical datasets, characterized by mixed data types and missing values.
The methodology is based on application of elastic principal graphs which can address simultaneously the tasks of dimensionality reduction, data visualization, clustering, feature selection and quantifying the geodesic distances (pseudotime) in partially ordered sequences of observations.
arXiv Detail & Related papers (2020-07-07T21:04:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.