SciFive: a text-to-text transformer model for biomedical literature
- URL: http://arxiv.org/abs/2106.03598v1
- Date: Fri, 28 May 2021 06:09:23 GMT
- Title: SciFive: a text-to-text transformer model for biomedical literature
- Authors: Long N. Phan, James T. Anibal, Hieu Tran, Shaurya Chanana, Erol Bahadroglu, Alec Peltekian, Grégoire Altan-Bonnet
- Abstract summary: We introduce SciFive, a domain-specific T5 model that has been pre-trained on large biomedical corpora.
Our results support the exploration of more difficult text generation tasks and the development of new methods in this area.
- Score: 0.9482369543628087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this report, we introduce SciFive, a domain-specific T5 model that has
been pre-trained on large biomedical corpora. Our model outperforms the current
SOTA methods (i.e., BERT, BioBERT, Base T5) on tasks in named entity recognition,
relation extraction, natural language inference, and question-answering. We
show that text-generation methods have significant potential in a broad array
of biomedical NLP tasks, particularly those requiring longer, more complex
outputs. Our results support the exploration of more difficult text generation
tasks and the development of new methods in this area.
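Because SciFive casts every task into a text-to-text format, it can be queried like any other T5 checkpoint. The sketch below is illustrative only: the Hugging Face checkpoint name and the task-prefix prompt format are assumptions not stated in this abstract, so consult the SciFive repository for the exact conventions.

```python
# Minimal sketch: querying a SciFive-style T5 checkpoint as a text-to-text model.
# The checkpoint name and the task-prefix prompt format are assumed for illustration.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "razent/SciFive-base-Pubmed_PMC"  # assumed checkpoint identifier

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

# Frame a natural language inference example as plain text (task prefix assumed).
prompt = (
    "mednli: sentence1: The patient denies chest pain. "
    "sentence2: The patient reports chest pain."
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```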
Related papers
- Leveraging Biomolecule and Natural Language through Multi-Modal Learning: A Survey [75.47055414002571]
The integration of biomolecular modeling with natural language (BL) has emerged as a promising interdisciplinary area at the intersection of artificial intelligence, chemistry and biology.
We provide an analysis of recent advancements achieved through cross-modeling of biomolecules and natural language.
arXiv Detail & Related papers (2024-03-03T14:59:47Z)
- BioT5+: Towards Generalized Biological Understanding with IUPAC Integration and Multi-task Tuning [77.90250740041411]
This paper introduces BioT5+, an extension of the BioT5 framework, tailored to enhance biological research and drug discovery.
BioT5+ incorporates several novel features: integration of IUPAC names for molecular understanding, inclusion of extensive bio-text and molecule data from sources like bioRxiv and PubChem, multi-task instruction tuning for generality across tasks, and a numerical tokenization technique for improved processing of numerical data.
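The summary names a numerical tokenization technique without describing it. One common scheme in this line of work is to split numbers into individual digit tokens so that magnitudes are composed from digits rather than memorized as opaque subwords; the snippet below illustrates that generic idea and is not a reproduction of the BioT5+ tokenizer.

```python
import re

def split_numbers_into_digits(text: str) -> str:
    """Illustrative numerical tokenization: rewrite every number as
    space-separated digit tokens (e.g. "12.5" -> "1 2 . 5") so a subword
    tokenizer sees digits individually. Generic sketch, not the exact
    scheme used by BioT5+."""
    return re.sub(r"\d+(?:\.\d+)?", lambda m: " ".join(m.group(0)), text)

print(split_numbers_into_digits("a dose of 12.5 mg at pH 7.4"))
# -> "a dose of 1 2 . 5 mg at pH 7 . 4"
```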
arXiv Detail & Related papers (2024-02-27T12:43:09Z)
- An Evaluation of Large Language Models in Bioinformatics Research [52.100233156012756]
We study the performance of large language models (LLMs) on a wide spectrum of crucial bioinformatics tasks.
These tasks include the identification of potential coding regions, extraction of named entities for genes and proteins, detection of antimicrobial and anti-cancer peptides, molecular optimization, and resolution of educational bioinformatics problems.
Our findings indicate that, given appropriate prompts, LLMs like GPT variants can successfully handle most of these tasks.
arXiv Detail & Related papers (2024-02-21T11:27:31Z)
- Multi-level biomedical NER through multi-granularity embeddings and enhanced labeling [3.8599767910528917]
This paper proposes a hybrid approach that integrates the strengths of multiple models.
BERT provides contextualized word embeddings, a pre-trained multi-channel CNN captures character-level information, and a BiLSTM + CRF layer performs sequence labelling and models dependencies between the words in the text.
We evaluate our model on the benchmark i2b2/2010 dataset, achieving an F1-score of 90.11.
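A bare-bones version of the hybrid architecture sketched in this abstract, with BERT word embeddings, a multi-channel character CNN, and a BiLSTM producing per-token emissions, might look as follows. Layer sizes, the backbone checkpoint, and the character vocabulary are illustrative assumptions, and a CRF layer (e.g. from the pytorch-crf package) would normally sit on top for sequence-level decoding.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class HybridNERTagger(nn.Module):
    """Sketch of a BERT + char-CNN + BiLSTM tagger; emissions feed a CRF."""
    def __init__(self, num_tags, char_vocab=100, char_dim=30,
                 char_channels=(3, 4, 5), char_filters=32, lstm_hidden=256):
        super().__init__()
        self.bert = AutoModel.from_pretrained("bert-base-cased")  # assumed backbone
        self.char_emb = nn.Embedding(char_vocab, char_dim, padding_idx=0)
        # One CNN "channel" per kernel width over each token's character sequence.
        self.char_cnns = nn.ModuleList(
            nn.Conv1d(char_dim, char_filters, k, padding=k // 2) for k in char_channels
        )
        feat_dim = self.bert.config.hidden_size + char_filters * len(char_channels)
        self.bilstm = nn.LSTM(feat_dim, lstm_hidden, batch_first=True, bidirectional=True)
        self.emissions = nn.Linear(2 * lstm_hidden, num_tags)

    def forward(self, input_ids, attention_mask, char_ids):
        # char_ids: (batch, seq_len, max_word_len), aligned with the wordpiece tokens.
        word_vecs = self.bert(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        b, s, w = char_ids.shape
        chars = self.char_emb(char_ids.view(b * s, w)).transpose(1, 2)  # (b*s, dim, w)
        char_vecs = torch.cat(
            [cnn(chars).max(dim=2).values for cnn in self.char_cnns], dim=1
        ).view(b, s, -1)
        features, _ = self.bilstm(torch.cat([word_vecs, char_vecs], dim=-1))
        return self.emissions(features)  # per-token tag scores for a CRF decoder
```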
arXiv Detail & Related papers (2023-12-24T21:45:36Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
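Adapter-based knowledge injection follows a general recipe: small bottleneck layers are inserted into a frozen pre-trained model and only those layers are trained on the knowledge-graph-derived data. A minimal bottleneck adapter in that style is sketched below; the dimensions and the way it attaches to PubMedBERT or BioLinkBERT are assumptions rather than details taken from the paper.

```python
import torch
import torch.nn as nn

class BottleneckAdapter(nn.Module):
    """Lightweight adapter: down-project, non-linearity, up-project, residual.
    Only these few parameters are trained while the host PLM stays frozen."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))

# Usage sketch: wrap the output of each transformer layer of a frozen PLM
# (e.g. PubMedBERT) with an adapter trained on UMLS- or OntoChem-derived text.
adapter = BottleneckAdapter(hidden_size=768, bottleneck=64)
layer_output = torch.randn(2, 16, 768)   # (batch, seq_len, hidden)
print(adapter(layer_output).shape)       # torch.Size([2, 16, 768])
```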
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- Enhancing Biomedical Text Summarization and Question-Answering: On the Utility of Domain-Specific Pre-Training [10.267057557137665]
We identify a suitable model architecture and use it to show the benefit of general-domain pre-training followed by task-specific fine-tuning.
Our results indicate that a Large Language Model without domain-specific pre-training can have a significant edge in some domain-specific biomedical text generation tasks.
arXiv Detail & Related papers (2023-07-10T08:32:45Z)
- BioGPT: Generative Pre-trained Transformer for Biomedical Text Generation and Mining [140.61707108174247]
We propose BioGPT, a domain-specific generative Transformer language model pre-trained on large scale biomedical literature.
We get 44.98%, 38.42% and 40.76% F1 score on BC5CDR, KD-DTI and DDI end-to-end relation extraction tasks respectively, and 78.2% accuracy on PubMedQA.
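BioGPT is a decoder-only causal language model, so generation follows the standard recipe. The sketch below assumes the Hugging Face checkpoint identifier microsoft/biogpt and illustrative generation settings; both should be treated as assumptions rather than details from the abstract.

```python
# Minimal sketch: free-text generation with a BioGPT-style causal LM.
# The checkpoint name "microsoft/biogpt" is assumed for illustration.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("microsoft/biogpt")
model = AutoModelForCausalLM.from_pretrained("microsoft/biogpt")

inputs = tokenizer("COVID-19 is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```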
arXiv Detail & Related papers (2022-10-19T07:17:39Z)
- ELECTRAMed: a new pre-trained language representation model for biomedical NLP [0.0]
We propose a pre-trained domain-specific language model, called ELECTRAMed, suited for the biomedical field.
The novel approach inherits the learning framework of the general-domain ELECTRA architecture, as well as its computational advantages.
arXiv Detail & Related papers (2021-04-19T19:38:34Z)
- Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
- An Empirical Study of Multi-Task Learning on BERT for Biomedical Text Mining [17.10823632511911]
We study a multi-task learning model with multiple decoders on varieties of biomedical and clinical natural language processing tasks.
Our empirical results demonstrate that the MTL fine-tuned models outperform state-of-the-art transformer models.
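The setup described here, a shared encoder fine-tuned jointly with several task-specific decoders, can be reduced to the following skeleton; the head types and the task inventory are placeholders, not the paper's exact configuration.

```python
from typing import Dict

import torch
import torch.nn as nn
from transformers import AutoModel

class MultiTaskBiomedModel(nn.Module):
    """Shared encoder with one lightweight head ("decoder") per task; during
    multi-task fine-tuning, batches from every task update the shared encoder
    while only the matching head receives gradients for its task."""
    def __init__(self, tasks: Dict[str, int], encoder_name: str = "bert-base-cased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)  # assumed backbone
        hidden = self.encoder.config.hidden_size
        self.heads = nn.ModuleDict(
            {task: nn.Linear(hidden, n_labels) for task, n_labels in tasks.items()}
        )

    def forward(self, task: str, input_ids: torch.Tensor, attention_mask: torch.Tensor):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # For simplicity every task reads the [CLS] vector; a token-level head
        # would be used for sequence-labelling tasks such as NER.
        return self.heads[task](out.last_hidden_state[:, 0])

# Placeholder task inventory; names and label counts are illustrative only.
model = MultiTaskBiomedModel(tasks={"relation_extraction": 5, "nli": 3, "doc_classification": 2})
```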
arXiv Detail & Related papers (2020-05-06T13:25:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.