How Important is Domain Specificity in Language Models and Instruction
Finetuning for Biomedical Relation Extraction?
- URL: http://arxiv.org/abs/2402.13470v1
- Date: Wed, 21 Feb 2024 01:57:58 GMT
- Title: How Important is Domain Specificity in Language Models and Instruction
Finetuning for Biomedical Relation Extraction?
- Authors: Aviv Brokman and Ramakanth Kavuluru
- Abstract summary: General-domain models typically outperformed biomedical-domain models.
However, biomedical instruction finetuning improved performance to a similar degree as general instruction finetuning.
Our findings suggest it may be more fruitful to focus research effort on larger-scale biomedical instruction finetuning of general LMs.
- Score: 1.7555695340815782
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Cutting-edge techniques developed in the general NLP domain are often
subsequently applied to the high-value, data-rich biomedical domain. The past
few years have seen generative language models (LMs), instruction finetuning,
and few-shot learning become foci of NLP research. As such, generative LMs
pretrained on biomedical corpora have proliferated and biomedical instruction
finetuning has been attempted as well, all with the hope that domain
specificity improves performance on downstream tasks. Given the nontrivial
effort in training such models, we investigate what, if any, benefits they have
in the key biomedical NLP task of relation extraction. Specifically, we address
two questions: (1) Do LMs trained on biomedical corpora outperform those
trained on general domain corpora? (2) Do models instruction finetuned on
biomedical datasets outperform those finetuned on assorted datasets or those
simply pretrained? We tackle these questions using existing LMs, testing across
four datasets. In a surprising result, general-domain models typically
outperformed biomedical-domain models. However, biomedical instruction
finetuning improved performance to a similar degree as general instruction
finetuning, despite having orders of magnitude fewer instructions. Our findings
suggest it may be more fruitful to focus research effort on larger-scale
biomedical instruction finetuning of general LMs rather than on building
domain-specific biomedical LMs.
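To make the setup concrete, below is a minimal sketch (not the authors' code) of casting a relation-extraction instance as an instruction prompt for a general, instruction-finetuned LM; the checkpoint, prompt template, and example sentence are illustrative assumptions.

```python
# A minimal sketch, assuming a general instruction-finetuned seq2seq LM from
# Hugging Face; the checkpoint, prompt template, and example are placeholders.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "google/flan-t5-small"  # general-domain, generally instruction-finetuned

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def build_prompt(sentence: str, head: str, tail: str) -> str:
    """Cast one relation-extraction instance as an instruction prompt."""
    return (
        "Identify the relation between the two entities in the sentence.\n"
        f"Sentence: {sentence}\n"
        f"Entity 1: {head}\n"
        f"Entity 2: {tail}\n"
        "Relation:"
    )

prompt = build_prompt(
    "Tamoxifen reduced the expression of ERBB2 in breast cancer cells.",
    "Tamoxifen",
    "ERBB2",
)
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Under the paper's second question, biomedical instruction finetuning would further train such a general model on (prompt, gold relation) pairs drawn from biomedical RE datasets, rather than pretraining a biomedical LM from scratch.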
Related papers
- BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text [82.7001841679981]
BioMedLM is a 2.7 billion parameter GPT-style autoregressive model trained exclusively on PubMed abstracts and full articles.
When fine-tuned, BioMedLM can produce strong multiple-choice biomedical question-answering results competitive with larger models.
BioMedLM can also be fine-tuned to produce useful answers to patient questions on medical topics.
arXiv Detail & Related papers (2024-03-27T10:18:21Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping computational requirements low (a generic adapter sketch appears after this list).
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- BIOptimus: Pre-training an Optimal Biomedical Language Model with Curriculum Learning for Named Entity Recognition [0.0]
Using language models (LMs) pre-trained in a self-supervised setting on large corpora has helped to deal with the problem of limited labeled data.
Recent research in biomedical language processing has produced a number of pre-trained biomedical LMs.
This paper aims to investigate different pre-training methods, such as pre-training the biomedical LM from scratch and pre-training it in a continued fashion.
arXiv Detail & Related papers (2023-08-16T18:48:01Z)
- Biomedical Language Models are Robust to Sub-optimal Tokenization [30.175714262031253]
Most modern biomedical language models (LMs) are pre-trained using standard domain-specific tokenizers.
We find that pre-training a biomedical LM using a more accurate biomedical tokenizer does not improve the entity representation quality of a language model.
arXiv Detail & Related papers (2023-06-30T13:35:24Z)
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining [140.61707108174247]
We propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature.
We get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks respectively, and 78.2% accuracy on PubMedQA.
arXiv Detail & Related papers (2022-10-19T07:17:39Z)
- Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that finetuning performance may be sensitive to pretraining settings, especially in low-resource domains.
We show that these stabilization techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications.
arXiv Detail & Related papers (2021-12-15T04:20:35Z)
- Pre-trained Language Models in Biomedical Domain: A Systematic Survey [33.572502204216256]
Pre-trained language models (PLMs) have been the de facto paradigm for most natural language processing (NLP) tasks.
This paper summarizes the recent progress of pre-trained language models in the biomedical domain and their applications in biomedical downstream tasks.
arXiv Detail & Related papers (2021-10-11T05:30:30Z)
- Can Language Models be Biomedical Knowledge Bases? [18.28724653601921]
We create the BioLAMA benchmark, comprising 49K biomedical factual knowledge triples for probing biomedical LMs.
We find that biomedical LMs with recently proposed probing methods can achieve up to 18.51% Acc@5 on retrieving biomedical knowledge.
arXiv Detail & Related papers (2021-09-15T08:34:56Z)
- Recognising Biomedical Names: Challenges and Solutions [9.51284672475743]
We propose a transition-based NER model which can recognise discontinuous mentions.
We also develop a cost-effective approach for nominating suitable pre-training data.
Our contributions have clear practical implications, especially for new biomedical applications.
arXiv Detail & Related papers (2021-06-23T08:20:13Z)
- Scientific Language Models for Biomedical Knowledge Base Completion: An Empirical Study [62.376800537374024]
We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction.
We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
arXiv Detail & Related papers (2021-06-17T17:55:33Z)
- Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
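As referenced in the adapter-modules entry above, here is a generic bottleneck-adapter sketch in PyTorch. It illustrates the general technique of training a small residual module while the host transformer stays frozen; it is an assumption about the approach rather than the cited paper's implementation, and the class name, bottleneck size, and tensor shapes are made up.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Generic adapter: down-project, non-linearity, up-project, plus residual.

    Only these few parameters are trained; the host transformer's weights stay
    frozen, which keeps the added compute and memory requirements low.
    """

    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck_size)
        self.up = nn.Linear(bottleneck_size, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Example: wrap the hidden states produced by one (hypothetical) transformer
# sublayer of a biomedical PLM such as PubMedBERT.
adapter = BottleneckAdapter(hidden_size=768)
dummy_hidden = torch.randn(2, 16, 768)   # (batch, sequence length, hidden size)
print(adapter(dummy_hidden).shape)       # torch.Size([2, 16, 768])
```

Knowledge injection in this style typically trains only the adapters on objectives derived from the knowledge graph (e.g., UMLS triples), leaving the pre-trained LM weights intact.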