A Comparative Evaluation Of Transformer Models For De-Identification Of
Clinical Text Data
- URL: http://arxiv.org/abs/2204.07056v1
- Date: Fri, 25 Mar 2022 19:42:03 GMT
- Title: A Comparative Evaluation Of Transformer Models For De-Identification Of
Clinical Text Data
- Authors: Christopher Meaney, Wali Hakimpour, Sumeet Kalia, Rahim Moineddin
- Abstract summary: The i2b2/UTHealth 2014 clinical text de-identification challenge corpus contains N=1304 clinical notes.
We fine-tune several transformer model architectures on the corpus, including BERT-base, BERT-large, RoBERTa-base, RoBERTa-large, ALBERT-base and ALBERT-xxlarge.
We assess model performance in terms of accuracy, precision (positive predictive value), recall (sensitivity) and F1 score.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Objective: To comparatively evaluate several transformer model architectures
at identifying protected health information (PHI) in the i2b2/UTHealth 2014
clinical text de-identification challenge corpus.
Methods: The i2b2/UTHealth 2014 corpus contains N=1304 clinical notes
obtained from N=296 patients. Using a transfer learning framework, we fine-tune
several transformer model architectures on the corpus, including BERT-base,
BERT-large, RoBERTa-base, RoBERTa-large, ALBERT-base and ALBERT-xxlarge. During
fine-tuning we vary the following model hyper-parameters: batch size, number of
training epochs, learning rate and weight decay. We fine-tune models on a
training dataset, evaluate and select the best-performing models on an
independent validation dataset, and lastly assess generalization performance on
a held-out test dataset. We assess model performance in terms of accuracy,
precision (positive predictive value), recall (sensitivity) and F1 score
(harmonic mean of precision and recall). We are interested in overall model
performance (PHI identified vs. PHI not identified), as well as PHI-specific
model performance.
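The overall metrics described above can be sketched in a few lines of Python. This is an illustrative toy, not the paper's evaluation code: the label sequences are invented, and treating every non-"O" tag as PHI is a simplifying assumption for the overall (PHI vs. non-PHI) view.

```python
# Toy sketch of token-level accuracy, precision (PPV), recall (sensitivity),
# and F1 (harmonic mean of precision and recall) for PHI tagging.
# Any tag other than "O" is counted as PHI (an assumption for illustration).

def phi_metrics(gold, pred, negative_label="O"):
    """Token-level binary PHI metrics over parallel tag sequences."""
    assert len(gold) == len(pred)
    tp = fp = fn = correct = 0
    for g, p in zip(gold, pred):
        if g == p:
            correct += 1
        g_phi, p_phi = g != negative_label, p != negative_label
        if g_phi and p_phi:
            tp += 1        # PHI token correctly flagged as PHI
        elif p_phi:
            fp += 1        # non-PHI token flagged as PHI
        elif g_phi:
            fn += 1        # PHI token missed
    accuracy = correct / len(gold)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented example sequences (not from the i2b2/UTHealth corpus):
gold = ["O", "NAME", "NAME", "O", "DATE", "O"]
pred = ["O", "NAME", "O",    "O", "DATE", "DATE"]
print(phi_metrics(gold, pred))
```

In practice a sequence-labeling toolkit would be used instead of hand-rolled counts, but the arithmetic it performs is exactly this.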
Results: We observe that the RoBERTa-large models perform best at identifying
PHI in the i2b2/UTHealth 2014 corpus, achieving >99% overall accuracy and 96.7%
recall/precision on the held-out test corpus. Performance was good across many
PHI classes; however, accuracy/precision/recall decreased for identification of
the following entity classes: professions, organizations, ages, and certain
locations.
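The PHI-class-specific results above come from per-entity precision/recall of the kind sketched below. Again a hedged toy: the tag sequences are invented, and only a few of the paper's entity classes (names, dates, professions, etc.) appear.

```python
# Toy sketch of per-PHI-class precision and recall from token-level tags.
# A token whose gold and predicted classes differ counts as a false
# negative for the gold class and a false positive for the predicted class.
from collections import Counter

def per_class_metrics(gold, pred, negative_label="O"):
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p and g != negative_label:
            tp[g] += 1
        else:
            if p != negative_label:
                fp[p] += 1
            if g != negative_label:
                fn[g] += 1
    out = {}
    for label in set(tp) | set(fp) | set(fn):
        prec = tp[label] / (tp[label] + fp[label]) if tp[label] + fp[label] else 0.0
        rec = tp[label] / (tp[label] + fn[label]) if tp[label] + fn[label] else 0.0
        out[label] = {"precision": prec, "recall": rec}
    return out

# Invented example: the model finds names and dates but misses a profession,
# mirroring the kind of per-class degradation reported for rarer entities.
gold = ["NAME", "O", "PROFESSION", "DATE", "O"]
pred = ["NAME", "O", "O",          "DATE", "O"]
print(per_class_metrics(gold, pred))
```

Rare classes such as professions contribute few training tokens, so a handful of misses drives their per-class recall down sharply even while overall accuracy stays above 99%.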
Conclusions: Transformers are a promising model class/architecture for
clinical text de-identification. With minimal hyper-parameter tuning
transformers afford researchers/clinicians the opportunity to obtain (near)
state-of-the-art performance.
Related papers
- The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z)
- Large Language Models to Identify Social Determinants of Health in Electronic Health Records [2.168737004368243]
Social determinants of health (SDoH) have an important impact on patient outcomes but are incompletely collected in electronic health records (EHRs).
This study investigated the ability of large language models to extract SDoH from free text in EHRs, where they are most commonly documented.
800 patient notes were annotated for SDoH categories, and several transformer-based models were evaluated.
arXiv Detail & Related papers (2023-08-11T19:18:35Z)
- Comparative Analysis of Epileptic Seizure Prediction: Exploring Diverse Pre-Processing Techniques and Machine Learning Models [0.0]
We present a comparative analysis of five machine learning models for the prediction of epileptic seizures using EEG data.
The results of our analysis demonstrate the performance of each model in terms of accuracy.
The ET model exhibited the best performance with an accuracy of 99.29%.
arXiv Detail & Related papers (2023-08-06T08:50:08Z)
- Preserving Knowledge Invariance: Rethinking Robustness Evaluation of Open Information Extraction [50.62245481416744]
We present the first benchmark that simulates the evaluation of open information extraction models in the real world.
We design and annotate a large-scale testbed in which each example is a knowledge-invariant clique.
By further elaborating the robustness metric, a model is judged to be robust if its performance is consistently accurate across the cliques.
arXiv Detail & Related papers (2023-05-23T12:05:09Z)
- Revisiting the Evaluation of Image Synthesis with GANs [55.72247435112475]
This study presents an empirical investigation into the evaluation of synthesis performance, with generative adversarial networks (GANs) as a representative of generative models.
In particular, we make in-depth analyses of various factors, including how to represent a data point in the representation space, how to calculate a fair distance using selected samples, and how many instances to use from each set.
arXiv Detail & Related papers (2023-04-04T17:54:32Z)
- Clinical Deterioration Prediction in Brazilian Hospitals Based on Artificial Neural Networks and Tree Decision Models [56.93322937189087]
An extremely boosted neural network (XBNet) is used to predict clinical deterioration (CD).
The XGBoost model obtained the best results in predicting CD among Brazilian hospitals' data.
arXiv Detail & Related papers (2022-12-17T23:29:14Z)
- Clinical-Longformer and Clinical-BigBird: Transformers for long clinical sequences [4.196346055173027]
Transformers-based models, such as BERT, have dramatically improved the performance for various natural language processing tasks.
One of the core limitations of these transformers is the substantial memory consumption due to their full self-attention mechanism.
We introduce two domain enriched language models, namely Clinical-Longformer and Clinical-BigBird, which are pre-trained from large-scale clinical corpora.
arXiv Detail & Related papers (2022-01-27T22:51:58Z)
- Deep Learning Models for Knowledge Tracing: Review and Empirical Evaluation [2.423547527175807]
We review and evaluate a body of deep learning knowledge tracing (DLKT) models with openly available and widely-used data sets.
The evaluated DLKT models were reimplemented to assess the replicability of previously reported results.
arXiv Detail & Related papers (2021-12-30T14:19:27Z)
- Development of patients triage algorithm from nationwide COVID-19 registry data based on machine learning [1.0323063834827415]
This paper presents the development process of a severity assessment model using machine learning techniques.
The model requires only patients' basic personal data, allowing them to judge their own severity.
We aim to establish a medical system that allows patients to check their own severity and informs them to visit the appropriate clinic center based on the past treatment details of other patients with similar severity.
arXiv Detail & Related papers (2021-09-18T19:56:27Z)
- Comparing Test Sets with Item Response Theory [53.755064720563]
We evaluate 29 datasets using predictions from 18 pretrained Transformer models on individual test examples.
We find that Quoref, HellaSwag, and MC-TACO are best suited for distinguishing among state-of-the-art models.
We also observe that the span-selection task format, used for QA datasets like QAMR or SQuAD2.0, is effective in differentiating between strong and weak models.
arXiv Detail & Related papers (2021-06-01T22:33:53Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.