On the Effectiveness of Compact Biomedical Transformers
- URL: http://arxiv.org/abs/2209.03182v1
- Date: Wed, 7 Sep 2022 14:24:04 GMT
- Title: On the Effectiveness of Compact Biomedical Transformers
- Authors: Omid Rohanian, Mohammadmahdi Nouriborji, Samaneh Kouchaki, David A.
Clifton
- Abstract summary: Language models pre-trained on biomedical corpora have recently shown promising results on downstream biomedical tasks.
Many existing pre-trained models are resource-intensive and computationally heavy owing to factors such as embedding size, hidden dimension, and number of layers.
We introduce six lightweight models, namely, BioDistilBERT, BioTinyBERT, BioMobileBERT, DistilBioBERT, TinyBioBERT, and CompactBioBERT.
We evaluate all of our models on three biomedical tasks and compare them with BioBERT-v1.1, with the aim of creating efficient lightweight models that perform on par with their larger counterparts.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Language models pre-trained on biomedical corpora, such as BioBERT, have
recently shown promising results on downstream biomedical tasks. Many existing
pre-trained models, on the other hand, are resource-intensive and
computationally heavy owing to factors such as embedding size, hidden
dimension, and number of layers. The natural language processing (NLP)
community has developed numerous strategies to compress these models utilising
techniques such as pruning, quantisation, and knowledge distillation, resulting
in models that are considerably faster, smaller, and subsequently easier to use
in practice. By the same token, in this paper we introduce six lightweight
models, namely, BioDistilBERT, BioTinyBERT, BioMobileBERT, DistilBioBERT,
TinyBioBERT, and CompactBioBERT, which are obtained either by knowledge
distillation from a biomedical teacher or by continual learning on the PubMed
dataset via the Masked Language Modelling (MLM) objective. We evaluate all of
our models on three biomedical tasks and compare them with BioBERT-v1.1, with
the aim of creating efficient lightweight models that perform on par with their
larger counterparts. All the models will be publicly available on our Hugging
Face profile at https://huggingface.co/nlpie and the code used to run the
experiments will be available at
https://github.com/nlpie-research/Compact-Biomedical-Transformers.
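As a rough illustration of the continual-learning route mentioned in the abstract, the sketch below performs further MLM pre-training of a compact general-domain checkpoint on PubMed-style text with the Hugging Face transformers library. It is not the authors' released training script: the starting checkpoint, the corpus file name, and all hyperparameters are assumptions chosen only for illustration.

    # Minimal sketch of the continual-learning route described above: further
    # masked-language-model (MLM) pre-training of a compact general-domain
    # checkpoint on PubMed-style text. Checkpoint name, corpus file, and
    # hyperparameters are illustrative assumptions, not the released setup.
    from datasets import load_dataset
    from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    student = "distilbert-base-cased"                  # general-domain student
    tokenizer = AutoTokenizer.from_pretrained(student)
    model = AutoModelForMaskedLM.from_pretrained(student)

    # Hypothetical plain-text corpus: one PubMed abstract per line.
    corpus = load_dataset("text", data_files={"train": "pubmed_abstracts.txt"})
    tokenized = corpus.map(
        lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
        batched=True, remove_columns=["text"])

    # Standard MLM objective: randomly mask 15% of tokens and predict them.
    collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="bio-distilbert-mlm",
                               per_device_train_batch_size=16,
                               learning_rate=5e-5,
                               num_train_epochs=1),
        train_dataset=tokenized["train"],
        data_collator=collator,
    ).train()

The distillation-based variants instead train the student against a biomedical teacher; in either case, the finished checkpoints are the ones the authors state will be published on the Hugging Face profile linked above.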
Related papers
- BMRetriever: Tuning Large Language Models as Better Biomedical Text Retrievers [48.21255861863282]
BMRetriever is a series of dense retrievers for enhancing biomedical retrieval.
BMRetriever exhibits strong parameter efficiency, with the 410M variant outperforming baselines up to 11.7 times larger.
arXiv Detail & Related papers (2024-04-29T05:40:08Z)
- Automated Text Mining of Experimental Methodologies from Biomedical Literature [0.087024326813104]
DistilBERT is a methodology-specific, pre-trained classification language model for mining biomedical texts.
It has proven effective at language understanding while reducing the size of BERT models by 40% and running 60% faster.
Our aim is to integrate this highly specialised model into different research domains.
arXiv Detail & Related papers (2024-04-21T21:19:36Z)
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text [82.7001841679981]
BioMedLM is a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles.
When fine-tuned, BioMedLM can produce strong multiple-choice biomedical question-answering results competitive with larger models.
BioMedLM can also be fine-tuned to produce useful answers to patient questions on medical topics.
arXiv Detail & Related papers (2024-03-27T10:18:21Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical ontology OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- Lightweight Transformers for Clinical Natural Language Processing [9.532776962985828]
This study focuses on the development of compact language models for processing clinical texts.
We developed a number of efficient lightweight clinical transformers using knowledge distillation and continual learning.
Our evaluation was done across several standard datasets and covered a wide range of clinical text-mining tasks.
arXiv Detail & Related papers (2023-02-09T16:07:31Z)
- Bioformer: an efficient transformer language model for biomedical text mining [8.961510810015643]
We present Bioformer, a compact BERT model for biomedical text mining.
We pretrained two Bioformer models, which reduce the model size by 60% compared to BERT-Base.
With 60% fewer parameters, Bioformer16L is only 0.1% less accurate than PubMedBERT.
arXiv Detail & Related papers (2023-02-03T08:04:59Z)
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining [140.61707108174247]
We propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature.
We get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks respectively, and 78.2% accuracy on PubMedQA.
arXiv Detail & Related papers (2022-10-19T07:17:39Z)
- Sparse*BERT: Sparse Models Generalize To New tasks and Domains [79.42527716035879]
This paper studies how models pruned using Gradual Unstructured Magnitude Pruning can transfer between domains and tasks.
We demonstrate that our general sparse model Sparse*BERT can become SparseBioBERT simply by pretraining the compressed architecture on unstructured biomedical text.
arXiv Detail & Related papers (2022-05-25T02:51:12Z)
- Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pre-training settings, especially in low-resource domains.
We show that fine-tuning stabilisation techniques can substantially improve performance for low-resource biomedical NLP applications.
arXiv Detail & Related papers (2021-12-15T04:20:35Z)
- Evaluating Biomedical BERT Models for Vocabulary Alignment at Scale in the UMLS Metathesaurus [8.961270657070942]
The current UMLS (Unified Medical Language System) Metathesaurus construction process is expensive and error-prone.
Recent advances in Natural Language Processing, notably BERT-based models, have achieved state-of-the-art (SOTA) performance on downstream tasks.
We aim to validate if approaches using the BERT models can actually outperform the existing approaches for predicting synonymy in the UMLS Metathesaurus.
arXiv Detail & Related papers (2021-09-14T16:52:16Z)
- BioALBERT: A Simple and Effective Pre-trained Language Model for Biomedical Named Entity Recognition [9.05154470433578]
Existing BioNER approaches often neglect the domain-specific challenges of biomedical text and directly adopt state-of-the-art (SOTA) models.
We propose biomedical ALBERT, an effective domain-specific language model trained on large-scale biomedical corpora.
arXiv Detail & Related papers (2020-09-19T12:58:47Z)