The Utility of General Domain Transfer Learning for Medical Language Tasks
- URL: http://arxiv.org/abs/2002.06670v1
- Date: Sun, 16 Feb 2020 20:20:38 GMT
- Title: The Utility of General Domain Transfer Learning for Medical Language Tasks
- Authors: Daniel Ranti, Katie Hanss, Shan Zhao, Varun Arvind, Joseph Titano, Anthony Costa, Eric Oermann
- Abstract summary: The purpose of this study is to analyze the efficacy of transfer learning techniques and transformer-based models as applied to medical natural language processing (NLP) tasks.
General text transfer learning may be a viable technique to generate state-of-the-art results within medical NLP tasks on radiological corpora.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The purpose of this study is to analyze the efficacy of transfer learning
techniques and transformer-based models as applied to medical natural language
processing (NLP) tasks, specifically radiological text classification. We used
1,977 labeled head CT reports, from a corpus of 96,303 total reports, to
evaluate the efficacy of pretraining using general domain corpora and a
combined general and medical domain corpus with a Bidirectional Encoder
Representations from Transformers (BERT) model for radiological text
classification. Model performance was benchmarked to a logistic regression
using bag-of-words vectorization and a long short-term memory (LSTM)
multi-label multi-class classification model, and compared to the published
literature in medical text classification. The BERT models using either set of
pretrained checkpoints outperformed the logistic regression model, achieving
a sample-weighted average F1-score of 0.87 for both the general-domain model
and the combined general- and biomedical-domain model. General text transfer
learning may be a viable technique to generate state-of-the-art results within
medical NLP tasks on radiological corpora, outperforming other deep models such
as LSTMs. The efficacy of pretraining and transformer-based models could serve
to facilitate the creation of groundbreaking NLP models in the uniquely
challenging data environment of medical text.
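As a rough sketch of the setup the abstract describes, the snippet below fine-tunes a pretrained BERT checkpoint for multi-label report classification with Hugging Face transformers; the checkpoint name, label set, and hyperparameters are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch: fine-tuning a pretrained BERT checkpoint for multi-label
# classification of radiology reports. Checkpoint, labels, and hyperparameters
# are illustrative stand-ins, not the paper's actual configuration.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

LABELS = ["hemorrhage", "fracture", "mass", "edema"]  # hypothetical label set

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=len(LABELS),
    problem_type="multi_label_classification",  # sigmoid logits + BCE loss
)

reports = ["No acute intracranial hemorrhage. Chronic microvascular changes."]
targets = torch.tensor([[0.0, 0.0, 0.0, 0.0]])  # multi-hot labels per report

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
batch = tokenizer(reports, truncation=True, padding=True, return_tensors="pt")
loss = model(**batch, labels=targets).loss  # BCEWithLogitsLoss under the hood
loss.backward()
optimizer.step()
```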
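The bag-of-words logistic regression baseline and the sample-weighted average F1 metric can likewise be sketched in a few lines of scikit-learn; the toy corpus here is purely illustrative.

```python
# Sketch of the bag-of-words logistic regression baseline and the
# sample-weighted average F1 metric; the toy corpus is illustrative only.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score

train_texts = ["acute hemorrhage present", "no acute findings"]
train_labels = [1, 0]
test_texts = ["small acute hemorrhage"]
test_labels = [1]

vectorizer = CountVectorizer()          # bag-of-words vectorization
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

clf = LogisticRegression().fit(X_train, train_labels)
pred = clf.predict(X_test)

# 'weighted' averages per-class F1 by class support (sample-weighted average)
print(f1_score(test_labels, pred, average="weighted"))
```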
Related papers
- LLMs-in-the-loop Part-1: Expert Small AI Models for Bio-Medical Text Translation (arXiv, 2024-07-16)
This study introduces a novel "LLMs-in-the-loop" approach to develop supervised neural machine translation models optimized for medical texts.
Custom parallel corpora in six languages were compiled from scientific articles, synthetically generated clinical documents, and medical texts.
Our MarianMT-based models outperform Google Translate, DeepL, and GPT-4-Turbo.
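MarianMT models are exposed through Hugging Face transformers; the sketch below shows generic usage with a public Helsinki-NLP checkpoint as a stand-in, since the study's own medical models are not assumed here.

```python
# Generic MarianMT usage via Hugging Face transformers. The public
# Helsinki-NLP checkpoint is a stand-in; it is not the study's medical model.
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-en-de"  # English -> German, illustrative choice
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

batch = tokenizer(["The patient presented with acute chest pain."],
                  return_tensors="pt", padding=True)
generated = model.generate(**batch)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```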
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation (arXiv, 2024-03-12)
We train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
The inference of LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
- Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling (arXiv, 2023-12-24)
This paper proposes a hybrid approach that integrates the strengths of multiple models.
BERT provides contextualized word embeddings, a pre-trained multi-channel CNN captures character-level information, and a BiLSTM + CRF performs sequence labelling, modelling dependencies between the words in the text.
We evaluate our model on the benchmark i2b2/2010 dataset, achieving an F1-score of 90.11.
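A minimal schematic of such a hybrid tagger, assuming arbitrary dimensions and the third-party pytorch-crf package, might look as follows; it is a sketch of the general pattern, not the paper's implementation.

```python
# Schematic of the hybrid tagger: BERT word embeddings concatenated with
# character-level CNN features, fed to a BiLSTM, decoded with a CRF.
# Dimensions are arbitrary; pytorch-crf is a third-party package.
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf

class HybridNER(nn.Module):
    def __init__(self, bert_dim=768, char_dim=50, hidden=256, num_tags=7):
        super().__init__()
        self.char_cnn = nn.Conv1d(char_dim, char_dim, kernel_size=3, padding=1)
        self.bilstm = nn.LSTM(bert_dim + char_dim, hidden,
                              bidirectional=True, batch_first=True)
        self.emit = nn.Linear(2 * hidden, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def forward(self, bert_emb, char_emb, tags=None):
        # char_emb: (batch, seq_len, char_dim), pooled per token
        chars = self.char_cnn(char_emb.transpose(1, 2)).transpose(1, 2)
        feats, _ = self.bilstm(torch.cat([bert_emb, chars], dim=-1))
        emissions = self.emit(feats)
        if tags is not None:
            return -self.crf(emissions, tags)   # negative log-likelihood loss
        return self.crf.decode(emissions)       # best tag sequence per sample
```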
- PathLDM: Text conditioned Latent Diffusion Model for Histopathology (arXiv, 2023-09-01)
We introduce PathLDM, the first text-conditioned Latent Diffusion Model tailored for generating high-quality histopathology images.
Our approach fuses image and textual data to enhance the generation process.
We achieved a SoTA FID score of 7.64 for text-to-image generation on the TCGA-BRCA dataset, significantly outperforming the closest text-conditioned competitor with FID 30.1.
- UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition (arXiv, 2023-07-20)
This work contributes a data-centric paradigm for enriching the language representations of biomedical transformer-encoder LMs by extracting text sequences from the UMLS.
Preliminary results from experiments in the extension of pre-trained LMs as well as training from scratch show that this framework improves downstream performance on multiple biomedical and clinical Named Entity Recognition (NER) tasks.
- Customizing General-Purpose Foundation Models for Medical Report Generation (arXiv, 2023-06-09)
The scarcity of labelled medical image-report pairs presents great challenges in the development of deep and large-scale neural networks.
We propose customizing off-the-shelf general-purpose large-scale pre-trained models, i.e., foundation models (FMs), in computer vision and natural language processing.
- BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks (arXiv, 2023-05-26)
Generalist AI holds the potential to address the limitations of task-specific models, given its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
- Application of Deep Learning in Generating Structured Radiology Reports: A Transformer-Based Technique (arXiv, 2022-09-25)
Natural language processing techniques can facilitate automatic information extraction and transformation of free-text formats to structured data.
Deep learning (DL)-based models have been adapted for NLP experiments with promising results.
In this study, we propose a transformer-based fine-grained named entity recognition architecture for clinical information extraction.
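As a generic illustration of transformer-based clinical NER (not the paper's specific fine-grained architecture), a token-classification head can be attached to a pretrained encoder; the checkpoint and label set below are assumptions.

```python
# Generic transformer token-classification setup for clinical NER; the
# checkpoint and label set are illustrative, not the paper's configuration.
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-FINDING", "I-FINDING"]  # hypothetical fine-grained tags
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=len(labels)
)

enc = tokenizer("Mild cardiomegaly without effusion.", return_tensors="pt")
logits = model(**enc).logits          # (1, seq_len, num_labels)
pred = logits.argmax(-1)              # per-token label indices
```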
- A multi-stage machine learning model on diagnosis of esophageal manometry (arXiv, 2021-06-25)
The framework includes deep-learning models at the swallow-level stage and feature-based machine learning models at the study-level stage.
This is the first artificial-intelligence-style model to automatically predict a Chicago Classification (CC) diagnosis for a high-resolution manometry (HRM) study from raw multi-swallow data.
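Read schematically, the two stages could be wired as per-swallow deep-model outputs aggregated into study-level features for a classical classifier; every component below is a hypothetical stand-in for the paper's models.

```python
# Schematic of a two-stage pipeline: a deep model scores each swallow, and
# aggregated per-study statistics feed a feature-based classifier. All
# components here are hypothetical stand-ins for the paper's models.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def swallow_model(swallow):            # stand-in for the deep swallow-level model
    return np.random.rand(5)           # e.g., class probabilities per swallow

def study_features(swallows):
    probs = np.stack([swallow_model(s) for s in swallows])
    return np.concatenate([probs.mean(0), probs.max(0)])  # study-level features

studies = [[np.zeros(100)] * 10 for _ in range(20)]       # toy raw data
X = np.stack([study_features(s) for s in studies])
y = np.random.randint(0, 3, size=20)                      # toy diagnoses

clf = RandomForestClassifier().fit(X, y)                  # study-level stage
```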
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks (arXiv, 2020-07-15)
We show the importance of this problem to the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
- Med7: a transferable clinical natural language processing model for electronic health records (arXiv, 2020-03-03)
We introduce a named-entity recognition model for clinical natural language processing.
The model is trained to recognise seven categories: drug names, route, frequency, dosage, strength, form, duration.
We evaluate the transferability of the developed model using the data from the Intensive Care Unit in the US to secondary care mental health records (CRIS) in the UK.
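A hedged usage sketch, assuming the released en_core_med7_lg spaCy package is installed in the environment:

```python
# Usage sketch for Med7 as a spaCy pipeline, assuming the released
# en_core_med7_lg package has been installed.
import spacy

nlp = spacy.load("en_core_med7_lg")
doc = nlp("Magnesium hydroxide 400mg/5ml suspension PO of total 30ml bid "
          "for the next 5 days.")
for ent in doc.ents:
    print(ent.text, ent.label_)  # e.g., DRUG, DOSAGE, FREQUENCY, DURATION
```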