Related papers: Investigating Alternative Feature Extraction Pipelines For Clinical Note Phenotyping

Investigating Alternative Feature Extraction Pipelines For Clinical Note Phenotyping

URL: http://arxiv.org/abs/2310.03772v1
Date: Thu, 5 Oct 2023 02:51:51 GMT
Title: Investigating Alternative Feature Extraction Pipelines For Clinical Note Phenotyping
Authors: Neil Daniel
Abstract summary: Using computational systems for the extraction of medical attributes offers many applications. BERT-based models can be used to transform clinical notes into a series of representations. We propose an alternative pipeline utilizing ScispaCyNeumann for extraction of common diseases.
Score: 0.0
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: A common practice in the medical industry is the use of clinical notes, which consist of detailed patient observations. However, electronic health record systems frequently do not contain these observations in a structured format, rendering patient information challenging to assess and evaluate automatically. Using computational systems for the extraction of medical attributes offers many applications, including longitudinal analysis of patients, risk assessment, and hospital evaluation. Recent work has constructed successful methods for phenotyping: extracting medical attributes from clinical notes. BERT-based models can be used to transform clinical notes into a series of representations, which are then condensed into a single document representation based on their CLS embeddings and passed into an LSTM (Mulyar et al., 2020). Though this pipeline yields a considerable performance improvement over previous results, it requires extensive convergence time. This method also does not allow for predicting attributes not yet identified in clinical notes. Considering the wide variety of medical attributes that may be present in a clinical note, we propose an alternative pipeline utilizing ScispaCy (Neumann et al., 2019) for the extraction of common diseases. We then train various supervised learning models to associate the presence of these conditions with patient attributes. Finally, we replicate a ClinicalBERT (Alsentzer et al., 2019) and LSTM-based approach for purposes of comparison. We find that alternative methods moderately underperform the replicated LSTM approach. Yet, considering a complex tradeoff between accuracy and runtime, in addition to the fact that the alternative approach also allows for the detection of medical conditions that are not already present in a clinical note, its usage may be considered as a supplement to established methods.

Related papers

TrialMatchAI: An End-to-End AI-powered Clinical Trial Recommendation System to Streamline Patient-to-Trial Matching [0.0]
We present TrialMatchAI, an AI-powered recommendation system that automates patient-to-trial matching.<n>Built on fine-tuned, open-source large language models, TrialMatchAI ensures transparency and maintains a lightweight deployment footprint.<n>In real-world validation, 92 percent of oncology patients had at least one relevant trial retrieved within the top 20 recommendations.
arXiv Detail & Related papers (2025-05-13T12:39:06Z)
SNOBERT: A Benchmark for clinical notes entity linking in the SNOMED CT clinical terminology [43.89160296332471]
We propose a method for linking text spans in clinical notes to specific concepts in the SNOMED CT using BERT-based models. The method consists of two stages: candidate selection and candidate matching. The models were trained on one of the largest publicly available dataset of labeled clinical notes.
arXiv Detail & Related papers (2024-05-25T08:00:44Z)
TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment. In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials. We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
Hierarchical Pretraining for Biomedical Term Embeddings [4.69793648771741]
We propose HiPrBERT, a novel biomedical term representation model trained on hierarchical data. We show that HiPrBERT effectively learns the pair-wise distance from hierarchical information, resulting in a substantially more informative embeddings for further biomedical applications.
arXiv Detail & Related papers (2023-07-01T08:16:00Z)
SPeC: A Soft Prompt-Based Calibration on Performance Variability of Large Language Model in Clinical Notes Summarization [50.01382938451978]
We introduce a model-agnostic pipeline that employs soft prompts to diminish variance while preserving the advantages of prompt-based summarization. Experimental findings indicate that our method not only bolsters performance but also effectively curbs variance for various language models.
arXiv Detail & Related papers (2023-03-23T04:47:46Z)
Time Associated Meta Learning for Clinical Prediction [78.99422473394029]
We propose a novel time associated meta learning (TAML) method to make effective predictions at multiple future time points. To address the sparsity problem after task splitting, TAML employs a temporal information sharing strategy to augment the number of positive samples. We demonstrate the effectiveness of TAML on multiple clinical datasets, where it consistently outperforms a range of strong baselines.
arXiv Detail & Related papers (2023-03-05T03:54:54Z)
Informing clinical assessment by contextualizing post-hoc explanations of risk prediction models in type-2 diabetes [50.8044927215346]
We consider a comorbidity risk prediction scenario and focus on contexts regarding the patients clinical state. We employ several state-of-the-art LLMs to present contexts around risk prediction model inferences and evaluate their acceptability. Our paper is one of the first end-to-end analyses identifying the feasibility and benefits of contextual explanations in a real-world clinical use case.
arXiv Detail & Related papers (2023-02-11T18:07:11Z)
This Patient Looks Like That Patient: Prototypical Networks for Interpretable Diagnosis Prediction from Clinical Text [56.32427751440426]
In clinical practice such models must not only be accurate, but provide doctors with interpretable and helpful results. We introduce ProtoPatient, a novel method based on prototypical networks and label-wise attention. We evaluate the model on two publicly available clinical datasets and show that it outperforms existing baselines.
arXiv Detail & Related papers (2022-10-16T10:12:07Z)
A Multimodal Transformer: Fusing Clinical Notes with Structured EHR Data for Interpretable In-Hospital Mortality Prediction [8.625186194860696]
We provide a novel multimodal transformer to fuse clinical notes and structured EHR data for better prediction of in-hospital mortality. To improve interpretability, we propose an integrated gradients (IG) method to select important words in clinical notes. We also investigate the significance of domain adaptive pretraining and task adaptive fine-tuning on the Clinical BERT.
arXiv Detail & Related papers (2022-08-09T03:49:52Z)
Cross-Lingual Knowledge Transfer for Clinical Phenotyping [55.92262310716537]
We investigate cross-lingual knowledge transfer strategies to execute this task for clinics that do not use the English language. We evaluate these strategies for a Greek and a Spanish clinic leveraging clinical notes from different clinical domains. Our results show that using multilingual data overall improves clinical phenotyping models and can compensate for data sparseness.
arXiv Detail & Related papers (2022-08-03T08:33:21Z)
Assessing mortality prediction through different representation models based on concepts extracted from clinical notes [2.707154152696381]
Learning of embedding is a method for converting notes into a format that makes them comparable. Transformer-based representation models have recently made a great leap forward. We performed experiments to measure the usefulness of the learned embedding vectors in the task of hospital mortality prediction.
arXiv Detail & Related papers (2022-07-22T04:34:33Z)
Literature-Augmented Clinical Outcome Prediction [10.46990394710927]
We introduce techniques to help bridge this gap between EBM and AI-based clinical models. We propose a novel system that automatically retrieves patient-specific literature based on intensive care (ICU) patient information. Our model is able to substantially boost predictive accuracy on three challenging tasks in comparison to strong recent baselines.
arXiv Detail & Related papers (2021-11-16T11:19:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.