Hybrid deep learning methods for phenotype prediction from clinical
notes
- URL: http://arxiv.org/abs/2108.10682v1
- Date: Mon, 16 Aug 2021 05:57:28 GMT
- Authors: Sahar Khalafi, Nasser Ghadiri and Milad Moradi
- Abstract summary: This paper proposes a novel hybrid model for automatically extracting patient phenotypes using natural language processing and deep learning models.
The proposed hybrid model is based on a neural bidirectional sequence model (BiLSTM or BiGRU) and a Convolutional Neural Network (CNN) for identifying patient phenotypes in discharge reports.
- Score: 4.866431869728018
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Identifying patient cohorts from clinical notes in secondary electronic
health records is a fundamental task in clinical information management.
Cohort identification, in turn, requires identifying patient phenotypes.
However, with the growing number of clinical notes, it becomes challenging to
analyze the data manually. Therefore, automatic extraction of clinical concepts
would be an essential task to identify the patient phenotypes correctly. This
paper proposes a novel hybrid model that uses natural language processing and
deep learning to extract patient phenotypes automatically, without
dictionaries or human intervention.
The proposed hybrid model is based on a neural bidirectional sequence model
(BiLSTM or BiGRU) and a Convolutional Neural Network (CNN) for identifying
patient phenotypes in discharge reports. Furthermore, to extract more
features related to each phenotype, an extra CNN layer is run in parallel with
the proposed hybrid model. We used pre-trained embeddings such as FastText and
Word2vec separately as the input layers to evaluate each embedding's
performance in identifying patient phenotypes. We also measured how applying
additional data cleaning steps to discharge reports affects phenotype
identification by deep learning models. We used discharge reports from the
Medical Information Mart for Intensive Care III (MIMIC III) database.
Experimental results in internal comparison demonstrate significant performance
improvement over existing models. The enhanced model with an extra CNN layer
obtained a relatively higher F1-score than the original hybrid model.
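The abstract reports its comparison in terms of F1-score. As a reminder of what that metric measures, the following is a minimal sketch of F1 for a single binary phenotype label, computed as the harmonic mean of precision and recall. The function name `f1_score` and the example labels are illustrative assumptions and are not taken from the paper; how the authors aggregate scores across phenotypes is not stated in this summary.

```python
def f1_score(y_true, y_pred):
    """F1 for one binary label (1 = phenotype present, 0 = absent).

    Illustrative helper only; not the paper's evaluation code.
    """
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # With no true positives, precision and recall are 0 or undefined;
    # return 0.0 by convention.
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for six discharge reports and one phenotype.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(f1_score(y_true, y_pred))  # 3 TP, 1 FP, 1 FN -> 0.75
```

Because F1 ignores true negatives, it is a common choice when a phenotype is present in only a small share of the notes.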
Related papers
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- PheME: A deep ensemble framework for improving phenotype prediction from multi-modal data [42.56953523499849]
We present PheME, an Ensemble framework using Multi-modality data of structured EHRs and unstructured clinical notes for accurate Phenotype prediction.
We leverage ensemble learning to combine outputs from single-modal models and multi-modal models to improve phenotype predictions.
arXiv Detail & Related papers (2023-03-19T23:41:04Z)
- Textual Data Augmentation for Patient Outcomes Prediction [67.72545656557858]
We propose a novel data augmentation method to generate artificial clinical notes in patients' Electronic Health Records.
We fine-tune the generative language model GPT-2 to synthesize labeled text with the original training data.
We evaluate our method on the most common patient outcome, i.e., the 30-day readmission rate.
arXiv Detail & Related papers (2022-11-13T01:07:23Z)
- A cost-based multi-layer network approach for the discovery of patient phenotypes [2.816539638885011]
We propose a cost-based layer selector model for detecting phenotypes using a community detection approach.
Our goal is to minimize the number of features used to build these phenotypes while preserving their quality.
For some post-treatment variables, predictors using phenotypes from COBALT as features outperformed those using phenotypes detected by traditional clustering methods.
arXiv Detail & Related papers (2022-09-19T14:07:10Z)
- Bridging the Gap Between Patient-specific and Patient-independent Seizure Prediction via Knowledge Distillation [7.2666838978096875]
Existing approaches typically train models in a patient-specific fashion due to the highly personalized characteristics of epileptic signals.
A patient-specific model can then be obtained with the help of distilled knowledge and additional personalized data.
Five state-of-the-art seizure prediction methods are trained on the CHB-MIT sEEG database with our proposed scheme.
arXiv Detail & Related papers (2022-02-25T10:30:29Z)
- A multi-stage machine learning model on diagnosis of esophageal manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z)
- Neural Language Models with Distant Supervision to Identify Major Depressive Disorder from Clinical Notes [2.1060613825447407]
Major depressive disorder (MDD) is a prevalent psychiatric disorder that is associated with significant healthcare burden worldwide.
Recent advancements in neural language models, such as the Bidirectional Encoder Representations from Transformers (BERT) model, have resulted in state-of-the-art neural language models.
We propose to leverage the neural language models in a distant supervision paradigm to identify MDD phenotypes from clinical notes.
arXiv Detail & Related papers (2021-04-19T21:11:41Z)
- G-MIND: An End-to-End Multimodal Imaging-Genetics Framework for Biomarker Identification and Disease Classification [49.53651166356737]
We propose a novel deep neural network architecture to integrate imaging and genetics data, as guided by diagnosis, that provides interpretable biomarkers.
We have evaluated our model on a population study of schizophrenia that includes two functional MRI (fMRI) paradigms and Single Nucleotide Polymorphism (SNP) data.
arXiv Detail & Related papers (2021-01-27T19:28:04Z)
- Select-ProtoNet: Learning to Select for Few-Shot Disease Subtype Prediction [55.94378672172967]
We focus on the few-shot disease subtype prediction problem, identifying subgroups of similar patients.
We introduce meta learning techniques to develop a new model, which can extract the common experience or knowledge from interrelated clinical tasks.
Our new model is built upon a carefully designed meta-learner, called Prototypical Network, that is a simple yet effective meta learning machine for few-shot image classification.
arXiv Detail & Related papers (2020-09-02T02:50:30Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
- Towards Patient Record Summarization Through Joint Phenotype Learning in HIV Patients [1.598617270887469]
We propose an unsupervised phenotyping approach that jointly learns a large number of phenotypes/problems across structured and unstructured data.
We ground our experiments in phenotyping patients from an HIV clinic in a large urban care institution.
We find that the learned phenotypes and their relatedness are clinically valid when assessed by clinical experts.
arXiv Detail & Related papers (2020-03-09T15:41:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.