Natural Language Processing Methods to Identify Oncology Patients at
High Risk for Acute Care with Clinical Notes
- URL: http://arxiv.org/abs/2209.13860v1
- Date: Wed, 28 Sep 2022 06:31:19 GMT
- Title: Natural Language Processing Methods to Identify Oncology Patients at
High Risk for Acute Care with Clinical Notes
- Authors: Claudio Fanconi, Marieke van Buchem, Tina Hernandez-Boussard
- Abstract summary: This paper evaluates how natural language processing can be used to identify the risk of acute care use (ACU) in oncology patients.
Risk prediction using structured health data (SHD) is now standard, but predictions using free-text formats are complex.
- Score: 9.49721872804122
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Clinical notes are an essential component of a health record. This paper
evaluates how natural language processing (NLP) can be used to identify the
risk of acute care use (ACU) in oncology patients, once chemotherapy starts.
Risk prediction using structured health data (SHD) is now standard, but
predictions using free-text formats are complex. This paper explores the use of
free-text notes for the prediction of ACU instead of SHD. Deep Learning models
were compared to manually engineered language features. Results show that SHD
models minimally outperform NLP models; an l1-penalised logistic regression
with SHD achieved a C-statistic of 0.748 (95%-CI: 0.735, 0.762), while the same
model with language features achieved 0.730 (95%-CI: 0.717, 0.745) and a
transformer-based model achieved 0.702 (95%-CI: 0.688, 0.717). This paper shows
how language models can be used in clinical applications and underlines how
risk bias is different for diverse patient groups, even using only free-text
data.
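The C-statistics and 95% confidence intervals quoted in the abstract can be illustrated with a small sketch. The abstract does not say how the intervals were computed, so the percentile bootstrap below is an assumption, and the function names `c_statistic` and `bootstrap_ci` are hypothetical helpers, not code from the paper.

```python
import random


def c_statistic(labels, scores):
    """Fraction of (positive, negative) pairs where the positive case
    gets the higher score; ties count as half. Equivalent to ROC AUC."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus a percentile-bootstrap interval (assumed method)."""
    point = c_statistic(labels, scores)  # also validates that both classes exist
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a resample with one class has no defined C-statistic
        stats.append(c_statistic(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper
```

Calling `bootstrap_ci(y_true, y_score)` on held-out labels and predicted risks returns a triple in the same shape as the figures reported above (estimate, lower bound, upper bound). The pairwise loop is quadratic in the number of patients, so a rank-based implementation would be preferable for large cohorts.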
Related papers
- Leveraging Prompt-Learning for Structured Information Extraction from Crohn's Disease Radiology Reports in a Low-Resource Language [11.688665498310405]
SMP-BERT is a novel prompt learning method for automatically converting free-text radiology reports into structured data.
In our studies, SMP-BERT greatly surpassed traditional fine-tuning methods in performance, notably in detecting infrequent conditions.
arXiv Detail & Related papers (2024-05-02T19:11:54Z)
- Leveraging deep active learning to identify low-resource mobility functioning information in public clinical notes [0.157286095422595]
This work presents the first public annotated dataset specifically on the Mobility domain of the International Classification of Functioning, Disability and Health (ICF).
We utilize the National NLP Clinical Challenges (n2c2) research dataset to construct a pool of candidate sentences using keyword expansion.
Our final dataset consists of 4,265 sentences with a total of 11,784 entities, including 5,511 Action entities, 5,328 Mobility entities, 306 Assistance entities, and 639 Quantification entities.
arXiv Detail & Related papers (2023-11-27T15:53:11Z)
- Large Language Models to Identify Social Determinants of Health in Electronic Health Records [2.168737004368243]
Social determinants of health (SDoH) have an important impact on patient outcomes but are incompletely collected from electronic health records (EHRs).
This study researched the ability of large language models to extract SDoH from free text in EHRs, where they are most commonly documented.
800 patient notes were annotated for SDoH categories, and several transformer-based models were evaluated.
arXiv Detail & Related papers (2023-08-11T19:18:35Z)
- TREEMENT: Interpretable Patient-Trial Matching via Personalized Dynamic Tree-Based Memory Network [54.332862955411656]
Clinical trials are critical for drug development but often suffer from expensive and inefficient patient recruitment.
In recent years, machine learning models have been proposed for speeding up patient recruitment via automatically matching patients with clinical trials.
We introduce a dynamic tree-based memory network model named TREEMENT to provide accurate and interpretable patient trial matching.
arXiv Detail & Related papers (2023-07-19T12:35:09Z)
- Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD).
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z)
- Self-supervised contrastive learning of echocardiogram videos enables label-efficient cardiac disease diagnosis [48.64462717254158]
We developed a self-supervised contrastive learning approach, EchoCLR, catered to echocardiogram videos.
When fine-tuned on small portions of labeled data, EchoCLR pretraining significantly improved classification performance for left ventricular hypertrophy (LVH) and aortic stenosis (AS).
EchoCLR is unique in its ability to learn representations of medical videos and demonstrates that SSL can enable label-efficient disease classification from small, labeled datasets.
arXiv Detail & Related papers (2022-07-23T19:17:26Z)
- Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts [56.72488923420374]
Pre-trained language models (LMs) have shown great potential for cross-lingual transfer in low-resource settings.
We show the few-shot cross-lingual transfer property of LMs for named entity recognition (NER) and apply it to a low-resource, real-world challenge: de-identifying code-mixed (Spanish-Catalan) clinical notes in the stroke domain.
arXiv Detail & Related papers (2022-04-10T21:46:52Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- UNITE: Uncertainty-based Health Risk Prediction Leveraging Multi-sourced Data [81.00385374948125]
We present the UNcertaInTy-based hEalth risk prediction (UNITE) model.
UNITE provides accurate disease risk prediction and uncertainty estimation leveraging multi-sourced health data.
We evaluate UNITE on real-world disease risk prediction tasks: nonalcoholic fatty liver disease (NASH) and Alzheimer's disease (AD).
UNITE achieves up to 0.841 in F1 score for AD detection and up to 0.609 in PR-AUC for NASH detection, outperforming the best state-of-the-art baseline by up to 19%.
arXiv Detail & Related papers (2020-10-22T02:28:11Z)
- Natural Language Processing with Deep Learning for Medical Adverse Event Detection from Free-Text Medical Narratives: A Case Study of Detecting Total Hip Replacement Dislocation [0.0]
We propose deep learning based NLP (DL-NLP) models for efficient and accurate hip dislocation AE detection following total hip replacement.
We benchmarked these proposed models with a wide variety of traditional machine learning based NLP (ML-NLP) models.
All DL-NLP models out-performed all of the ML-NLP models, with a convolutional neural network (CNN) model achieving the best overall performance.
arXiv Detail & Related papers (2020-04-17T16:25:36Z)
- Med7: a transferable clinical natural language processing model for electronic health records [6.935142529928062]
We introduce a named-entity recognition model for clinical natural language processing.
The model is trained to recognise seven categories: drug names, route, frequency, dosage, strength, form, duration.
We evaluate the transferability of the developed model using the data from the Intensive Care Unit in the US to secondary care mental health records (CRIS) in the UK.
arXiv Detail & Related papers (2020-03-03T00:55:43Z)