An Experimental Evaluation of Transformer-based Language Models in the
Biomedical Domain
- URL: http://arxiv.org/abs/2012.15419v1
- Date: Thu, 31 Dec 2020 03:09:38 GMT
- Title: An Experimental Evaluation of Transformer-based Language Models in the
Biomedical Domain
- Authors: Paul Grouchy, Shobhit Jain, Michael Liu, Kuhan Wang, Max Tian, Nidhi
Arora, Hillary Ngai, Faiza Khan Khattak, Elham Dolatabadi, Sedef Akinli Kocak
- Abstract summary: This paper summarizes experiments conducted in replicating BioBERT and further pre-training and fine-tuning in the biomedical domain.
We also investigate the effectiveness of domain-specific and domain-agnostic pre-trained models across downstream biomedical NLP tasks.
- Score: 0.984441002699829
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the growing amount of text in health data, there have been rapid
advances in large pre-trained models that can be applied to a wide variety of
biomedical tasks with minimal task-specific modifications. Given the high
training cost of these models, which makes technical replication challenging,
this paper summarizes experiments conducted in replicating BioBERT and in
further pre-training and careful fine-tuning in the biomedical domain. We also
investigate the effectiveness of domain-specific and domain-agnostic
pre-trained models across downstream biomedical NLP tasks. Our findings confirm
that pre-trained models can be impactful in some downstream NLP tasks (QA and
NER) in the biomedical domain; however, this improvement may not justify the
high cost of domain-specific pre-training.
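As a concrete illustration of the downstream fine-tuning discussed above, the following is a minimal sketch using the Hugging Face Transformers API; the BioBERT checkpoint name, tag set, toy data, and hyperparameters are illustrative assumptions rather than the paper's exact configuration.

```python
# Minimal sketch: fine-tuning a pre-trained biomedical checkpoint for NER.
# Checkpoint, tag set, toy data, and hyperparameters are assumptions.
from datasets import Dataset
from transformers import (AutoModelForTokenClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

checkpoint = "dmis-lab/biobert-base-cased-v1.1"  # assumed BioBERT weights
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForTokenClassification.from_pretrained(checkpoint, num_labels=3)

# Toy example standing in for a real annotated corpus (O / B-Drug / B-Disease).
words, tags = ["Metformin", "treats", "diabetes"], [1, 0, 2]
enc = tokenizer(words, is_split_into_words=True, truncation=True)
# Align word-level tags to subword tokens; special tokens get -100 (ignored).
enc["labels"] = [-100 if w is None else tags[w] for w in enc.word_ids()]
train_dataset = Dataset.from_list([dict(enc)])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="biobert-ner", num_train_epochs=3,
                           learning_rate=3e-5),  # small LR, typical for BERT
    train_dataset=train_dataset,
)
trainer.train()
```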
Related papers
- Towards a clinically accessible radiology foundation model: open-access and lightweight, with automated evaluation [113.5002649181103]
The authors train open-source small multimodal models (SMMs) to bridge competency gaps for unmet clinical needs in radiology.
For training, we assemble a large dataset of over 697 thousand radiology image-text pairs.
For evaluation, we propose CheXprompt, a GPT-4-based metric for factuality evaluation, and demonstrate its parity with expert evaluation.
Inference with LLaVA-Rad is fast and can be performed on a single V100 GPU in private settings, offering a promising state-of-the-art tool for real-world clinical applications.
arXiv Detail & Related papers (2024-03-12T18:12:02Z)
- Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling [3.8599767910528917]
This paper proposes a hybrid approach that integrates the strengths of multiple models.
BERT provides contextualized word embeddings, a pre-trained multi-channel CNN captures character-level information, and a BiLSTM followed by a CRF performs sequence labelling and models dependencies between the words in the text.
We evaluate our model on the benchmark i2b2/2010 dataset, achieving an F1-score of 90.11.
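A simplified sketch of this kind of hybrid architecture, assuming PyTorch and the transformers library; all dimensions are illustrative, and the CRF decoding layer is replaced by a plain linear emission layer for brevity (a library such as pytorch-crf would supply the CRF).

```python
# Simplified sketch of the hybrid NER architecture described above:
# BERT token embeddings + a character-level CNN + a BiLSTM encoder.
# Dimensions are illustrative; the CRF used for structured decoding is
# replaced here by a plain linear emission layer for brevity.
import torch
import torch.nn as nn
from transformers import AutoModel

class HybridNER(nn.Module):
    def __init__(self, num_tags, char_vocab=100, char_dim=25, char_channels=50):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-cased")
        self.char_emb = nn.Embedding(char_vocab, char_dim)
        # Multi-channel character CNN: kernel widths 3 and 5 over characters.
        self.char_convs = nn.ModuleList(
            nn.Conv1d(char_dim, char_channels, k, padding=k // 2) for k in (3, 5)
        )
        hidden = self.bert.config.hidden_size + 2 * char_channels
        self.bilstm = nn.LSTM(hidden, 128, batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(256, num_tags)  # a CRF would consume these

    def forward(self, input_ids, attention_mask, char_ids):
        # char_ids: (batch, seq_len, max_word_len)
        tok = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        b, s, w = char_ids.shape
        c = self.char_emb(char_ids.view(b * s, w)).transpose(1, 2)
        # Max-pool each conv channel over the characters of each word.
        pooled = [conv(c).max(dim=2).values for conv in self.char_convs]
        char_feat = torch.cat(pooled, dim=1).view(b, s, -1)
        seq, _ = self.bilstm(torch.cat([tok, char_feat], dim=-1))
        return self.emissions(seq)  # per-token tag scores
```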
arXiv Detail & Related papers (2023-12-24T21:45:36Z)
- BIOptimus: Pre-training an Optimal Biomedical Language Model with Curriculum Learning for Named Entity Recognition [0.0]
Using language models (LMs) pre-trained in a self-supervised setting on large corpora has helped to deal with the problem of limited labeled data.
Recent research in biomedical language processing has produced a number of pre-trained biomedical LMs.
This paper investigates different pre-training methods, such as pre-training the biomedical LM from scratch and continuing pre-training from an existing model.
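The "continued" variant can be pictured as follows: resume masked-language-model training of an existing general-domain checkpoint on a biomedical corpus. A hedged sketch; the corpus file and hyperparameters are assumptions.

```python
# Hedged sketch of continued pre-training: keep training an existing
# checkpoint's masked-LM objective on a biomedical corpus.
from datasets import load_dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

checkpoint = "bert-base-uncased"  # assumed general-domain starting point
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Assumed: a plain-text biomedical corpus, one document per line.
corpus = load_dataset("text", data_files={"train": "pubmed_abstracts.txt"})
tokenized = corpus["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"],
)

collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="biomed-mlm", num_train_epochs=1),
    train_dataset=tokenized,
    data_collator=collator,  # randomly masks 15% of tokens each batch
)
trainer.train()
```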
arXiv Detail & Related papers (2023-08-16T18:48:01Z)
- BiomedGPT: A Generalist Vision-Language Foundation Model for Diverse Biomedical Tasks [68.39821375903591]
Generalist AI holds the potential to address the limitations of specialist models, owing to its versatility in interpreting different data types.
Here, we propose BiomedGPT, the first open-source and lightweight vision-language foundation model.
arXiv Detail & Related papers (2023-05-26T17:14:43Z)
- Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pre-training settings, especially in low-resource domains.
We show that targeted optimization techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications.
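One stabilization technique studied in this line of work is re-initializing the top encoder layers before fine-tuning; the sketch below shows how that might look, with the checkpoint and layer count as illustrative assumptions.

```python
# Hedged sketch of one fine-tuning stabilization technique: re-initialize
# the top encoder layers so the task head does not latch onto overly
# specialized pre-trained features. Checkpoint and layer count are assumed.
import torch.nn as nn
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

def reinit_layer(module: nn.Module) -> None:
    """Re-initialize Linear and LayerNorm parameters in-place."""
    if isinstance(module, nn.Linear):
        module.weight.data.normal_(mean=0.0, std=0.02)
        if module.bias is not None:
            module.bias.data.zero_()
    elif isinstance(module, nn.LayerNorm):
        module.weight.data.fill_(1.0)
        module.bias.data.zero_()

# Re-initialize the top 2 of 12 encoder layers before fine-tuning.
for layer in model.bert.encoder.layer[-2:]:
    layer.apply(reinit_layer)
```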
arXiv Detail & Related papers (2021-12-15T04:20:35Z)
- Multi-Task Prediction of Clinical Outcomes in the Intensive Care Unit using Flexible Multimodal Transformers [4.836546574465437]
We propose a flexible Transformer-based EHR embedding pipeline and predictive model framework.
We showcase the feasibility of our flexible design in a case study in the intensive care unit.
arXiv Detail & Related papers (2021-11-09T21:46:11Z)
- Recognising Biomedical Names: Challenges and Solutions [9.51284672475743]
We propose a transition-based NER model which can recognise discontinuous mentions.
We also develop a cost-effective approach that selects suitable pre-training data.
Our contributions have clear practical implications, especially when new biomedical applications are developed.
arXiv Detail & Related papers (2021-06-23T08:20:13Z)
- Domain Generalization on Medical Imaging Classification using Episodic Training with Task Augmentation [62.49837463676111]
We propose a novel scheme of episodic training with task augmentation on medical imaging classification.
Motivated by the limited number of source domains in real-world medical deployments, we address the unique problem of task-level overfitting.
arXiv Detail & Related papers (2021-06-13T03:56:59Z)
- Unsupervised Pre-training for Biomedical Question Answering [32.525495687236194]
We introduce a new pre-training task over unlabeled data, designed to reason about biomedical entities in context.
Our experiments show that pre-training BioBERT on the proposed pre-training task significantly boosts performance and outperforms the previous best model from the 7th BioASQ Task 7b-Phase B challenge.
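The general idea can be illustrated by turning unlabeled sentences with known entity spans into extractive, cloze-style pseudo-QA examples; the helper and record format below are hypothetical, not the paper's exact recipe.

```python
# Illustrative (hypothetical) construction of an extractive pseudo-QA example
# from unlabeled text: blank out a known biomedical entity mention to form a
# cloze-style "question", with the original mention span as the "answer".
# The SQuAD-style record format is an assumption, not the paper's recipe.
def make_pseudo_qa(sentence: str, start: int, end: int) -> dict:
    mention = sentence[start:end]
    question = sentence[:start] + "[MASK]" + sentence[end:]
    return {
        "question": question,     # cloze query with the entity blanked out
        "context": sentence,      # original sentence serves as the context
        "answer_text": mention,
        "answer_start": start,
    }

example = make_pseudo_qa(
    "Metformin is commonly prescribed for type 2 diabetes.", 0, 9
)
# {'question': '[MASK] is commonly prescribed for type 2 diabetes.', ...}
```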
arXiv Detail & Related papers (2020-09-27T21:07:51Z)
- Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
- Predicting Clinical Diagnosis from Patients Electronic Health Records Using BERT-based Neural Networks [62.9447303059342]
We show the importance of this problem for the medical community.
We present a modification of the Bidirectional Encoder Representations from Transformers (BERT) model for sequence classification.
We use a large-scale Russian EHR dataset consisting of about 4 million unique patient visits.
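A minimal sketch of such a setup, framed as multi-label sequence classification over visit text; the multilingual checkpoint, label-set size, and example input are assumptions (the paper works with Russian EHRs).

```python
# Minimal sketch of BERT-based diagnosis prediction as multi-label sequence
# classification. Checkpoint, label count, and input text are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "bert-base-multilingual-cased"  # assumed multilingual base
num_diagnoses = 50                           # assumed size of the ICD label set
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint,
    num_labels=num_diagnoses,
    problem_type="multi_label_classification",  # sigmoid + BCE loss per label
)

visit_note = "Patient reports chest pain and shortness of breath."
inputs = tokenizer(visit_note, return_tensors="pt", truncation=True)
with torch.no_grad():
    probs = torch.sigmoid(model(**inputs).logits)  # per-diagnosis probabilities
```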
arXiv Detail & Related papers (2020-07-15T09:22:55Z)