Clinical Relation Extraction Using Transformer-based Models
- URL: http://arxiv.org/abs/2107.08957v1
- Date: Mon, 19 Jul 2021 15:15:51 GMT
- Title: Clinical Relation Extraction Using Transformer-based Models
- Authors: Xi Yang, Zehao Yu, Yi Guo, Jiang Bian and Yonghui Wu
- Abstract summary: We developed a series of clinical RE models based on three transformer architectures, namely BERT, RoBERTa, and XLNet.
We demonstrated that the RoBERTa-clinical RE model achieved the best performance on the 2018 MADE1.0 dataset with an F1-score of 0.8958.
Our results indicated that the binary classification strategy consistently outperformed the multi-class classification strategy for clinical relation extraction.
- Score: 28.237302721228435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The newly emerged transformer technology has a tremendous impact on NLP
research. In the general English domain, transformer-based models have achieved
state-of-the-art performances on various NLP benchmarks. In the clinical
domain, researchers also have investigated transformer models for clinical
applications. The goal of this study is to systematically explore three widely
used transformer-based models (i.e., BERT, RoBERTa, and XLNet) for clinical
relation extraction and develop an open-source package with clinical
pre-trained transformer-based models to facilitate information extraction in
the clinical domain. We developed a series of clinical RE models based on three
transformer architectures, namely BERT, RoBERTa, and XLNet. We evaluated these
models using two publicly available datasets from the 2018 MADE1.0 and 2018 n2c2
challenges. We compared two classification strategies (binary vs. multi-class
classification) and investigated two approaches to generate candidate relations
in different experimental settings. In this study, we compared three
transformer-based (BERT, RoBERTa, and XLNet) models for relation extraction. We
demonstrated that the RoBERTa-clinical RE model achieved the best performance
on the 2018 MADE1.0 dataset with an F1-score of 0.8958. On the 2018 n2c2
dataset, the XLNet-clinical model achieved the best F1-score of 0.9610. Our
results indicated that the binary classification strategy consistently
outperformed the multi-class classification strategy for clinical relation
extraction. Our methods and models are publicly available at
https://github.com/uf-hobi-informatics-lab/ClinicalTransformerRelationExtraction.
We believe this work will improve current practice on clinical relation
extraction and other related NLP tasks in the biomedical domain.
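The pipeline the abstract describes (generate candidate entity pairs, then classify each pair with a transformer) can be sketched in plain Python. The entity types, the `[S1]`/`[E1]` marker tokens, and the distance filter below are illustrative assumptions, not the paper's exact implementation:

```python
# Hypothetical sketch of candidate relation generation for clinical RE.
# Entities are dicts with a type and token-span boundaries; candidate pairs
# link a head entity (e.g. Drug) to each attribute entity (e.g. ADE, Dosage).

def generate_candidates(entities, max_distance=None):
    """Pair every Drug entity with every non-Drug entity; optionally
    drop pairs farther apart than max_distance tokens (one of the two
    candidate-generation settings the abstract alludes to)."""
    heads = [e for e in entities if e["type"] == "Drug"]
    tails = [e for e in entities if e["type"] != "Drug"]
    candidates = []
    for h in heads:
        for t in tails:
            dist = abs(h["start"] - t["start"])
            if max_distance is None or dist <= max_distance:
                candidates.append((h, t))
    return candidates

def mark_pair(tokens, head, tail):
    """Render one candidate pair as a marked token sequence, a common
    input format for transformer-based relation classifiers."""
    out = []
    for i, tok in enumerate(tokens):
        if i == head["start"]:
            out.append("[S1]")
        if i == tail["start"]:
            out.append("[S2]")
        out.append(tok)
        if i == head["end"]:
            out.append("[E1]")
        if i == tail["end"]:
            out.append("[E2]")
    return " ".join(out)
```

Under the binary strategy the abstract favors, each marked sequence is scored by a per-relation-type yes/no classifier; under the multi-class strategy, a single classifier assigns one label from all relation types (plus "no relation") at once.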
Related papers
- Medical-GAT: Cancer Document Classification Leveraging Graph-Based Residual Network for Scenarios with Limited Data [2.913761513290171]
We present a curated dataset of 1,874 biomedical abstracts, categorized into thyroid cancer, colon cancer, lung cancer, and generic topics.
Our research focuses on leveraging this dataset to improve classification performance, particularly in data-scarce scenarios.
We introduce a Residual Graph Attention Network (R-GAT) with multiple graph attention layers that capture the semantic information and structural relationships within cancer-related documents.
arXiv Detail & Related papers (2024-10-19T20:07:40Z) - The Languini Kitchen: Enabling Language Modelling Research at Different
Scales of Compute [66.84421705029624]
We introduce an experimental protocol that enables model comparisons based on equivalent compute, measured in accelerator hours.
We pre-process an existing large, diverse, and high-quality dataset of books that surpasses existing academic benchmarks in quality, diversity, and document length.
This work also provides two baseline models: a feed-forward model derived from the GPT-2 architecture and a recurrent model in the form of a novel LSTM with ten-fold throughput.
arXiv Detail & Related papers (2023-09-20T10:31:17Z) - The effect of data augmentation and 3D-CNN depth on Alzheimer's Disease
detection [51.697248252191265]
This work summarizes and strictly observes best practices regarding data handling, experimental design, and model evaluation.
We focus on Alzheimer's Disease (AD) detection, which serves as a paradigmatic example of a challenging problem in healthcare.
Within this framework, we train 15 predictive models, considering three different data augmentation strategies and five distinct 3D CNN architectures.
arXiv Detail & Related papers (2023-09-13T10:40:41Z) - LegoNet: Alternating Model Blocks for Medical Image Segmentation [0.7550390281305251]
We propose to alternate structurally different types of blocks to generate a new architecture, mimicking how Lego blocks can be assembled together.
Using two CNN-based blocks and one SwinViT-based block, we investigate three variations of the so-called LegoNet, which applies the new concept of block alternation to the segmentation task in medical imaging.
arXiv Detail & Related papers (2023-06-06T08:22:47Z) - Lightweight Transformers for Clinical Natural Language Processing [9.532776962985828]
This study focuses on the development of compact language models for processing clinical texts.
We developed a number of efficient lightweight clinical transformers using knowledge distillation and continual learning.
Our evaluation was done across several standard datasets and covered a wide range of clinical text-mining tasks.
arXiv Detail & Related papers (2023-02-09T16:07:31Z) - Application of Deep Learning in Generating Structured Radiology Reports:
A Transformer-Based Technique [0.4549831511476247]
Natural language processing techniques can facilitate automatic information extraction and transformation of free-text formats to structured data.
Deep learning (DL)-based models have been adapted for NLP experiments with promising results.
In this study, we propose a transformer-based fine-grained named entity recognition architecture for clinical information extraction.
arXiv Detail & Related papers (2022-09-25T08:03:15Z) - Deeper Clinical Document Understanding Using Relation Extraction [0.0]
We propose a text mining framework comprising Named Entity Recognition (NER) and Relation Extraction (RE) models.
We introduce two new RE model architectures -- an accuracy-optimized one based on BioBERT and a speed-optimized one utilizing crafted features over a Fully Connected Neural Network (FCNN).
We show two practical applications of this framework -- for building a biomedical knowledge graph and for improving the accuracy of mapping entities to clinical codes.
arXiv Detail & Related papers (2021-12-25T17:14:13Z) - A multi-stage machine learning model on diagnosis of esophageal
manometry [50.591267188664666]
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict CC diagnosis of HRM study from raw multi-swallow data.
arXiv Detail & Related papers (2021-06-25T20:09:23Z) - Few-Shot Named Entity Recognition: A Comprehensive Study [92.40991050806544]
We investigate three schemes to improve the model generalization ability for few-shot settings.
We perform empirical comparisons on 10 public NER datasets with various proportions of labeled data.
We create new state-of-the-art results on both few-shot and training-free settings.
arXiv Detail & Related papers (2020-12-29T23:43:16Z) - Predicting Clinical Diagnosis from Patients Electronic Health Records
Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem in the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
arXiv Detail & Related papers (2020-07-15T09:22:55Z) - Ensemble Transfer Learning for the Prediction of Anti-Cancer Drug
Response [49.86828302591469]
In this paper, we apply transfer learning to the prediction of anti-cancer drug response.
We apply the classic transfer learning framework that trains a prediction model on the source dataset and refines it on the target dataset.
The ensemble transfer learning pipeline is implemented using LightGBM and two deep neural network (DNN) models with different architectures.
arXiv Detail & Related papers (2020-05-13T20:29:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.