AMMU -- A Survey of Transformer-based Biomedical Pretrained Language
Models
- URL: http://arxiv.org/abs/2105.00827v1
- Date: Fri, 16 Apr 2021 18:09:51 GMT
- Title: AMMU -- A Survey of Transformer-based Biomedical Pretrained Language
Models
- Authors: Katikapalli Subramanyam Kalyan, Ajit Rajasekharan, Sivanesan Sangeetha
- Abstract summary: Transformer-based pretrained language models (PLMs) have started a new era in modern natural language processing (NLP).
Following the success of these models in the general domain, the biomedical research community has developed various in-domain PLMs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformer-based pretrained language models (PLMs) have started a new era in
modern natural language processing (NLP). These models combine the power of
transformers, transfer learning, and self-supervised learning (SSL). Following
the success of these models in the general domain, the biomedical research
community has developed various in-domain PLMs starting from BioBERT to the
latest BioMegatron and CoderBERT models. We believe there is a need for a
comprehensive survey of transformer-based biomedical pretrained language models
(BPLMs). In this survey, we start with a brief overview of foundational concepts
such as self-supervised learning, the embedding layer, and transformer encoder
layers. We then discuss the core concepts of transformer-based PLMs, including
pretraining methods, pretraining tasks, fine-tuning methods, and the embedding
types specific to the biomedical domain. We introduce a taxonomy for
transformer-based BPLMs and then discuss each of the models. We examine the main
challenges and present possible solutions, and we conclude by highlighting open
issues that will drive the research community to further improve
transformer-based BPLMs.
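To make the pretraining objective behind these models concrete, below is a minimal sketch of the masked language modeling (MLM) task using the Hugging Face transformers library. The checkpoint name is an illustrative assumption; any BERT-style biomedical checkpoint that still ships with its MLM head can be substituted.

```python
# Minimal sketch of masked language modeling (MLM), the self-supervised
# pretraining task used by BERT-style biomedical PLMs such as BioBERT.
# The checkpoint name below is an assumption for illustration only.
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

checkpoint = "dmis-lab/biobert-base-cased-v1.1"  # assumed BioBERT-style checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForMaskedLM.from_pretrained(checkpoint)

# Hide one biomedical term and let the model reconstruct it from context;
# during pretraining this reconstruction loss is the only supervision signal.
text = f"Aspirin is commonly used to reduce {tokenizer.mask_token} and inflammation."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos[0]].topk(5).indices
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))  # top-5 candidate fillers
```

Fine-tuning follows the same loading pattern but swaps the MLM head for a task-specific head, for example AutoModelForTokenClassification for biomedical NER.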
Related papers
- A Review on the Applications of Transformer-based language models for Nucleotide Sequence Analysis [0.8049701904919515]
This paper introduces the major developments of Transformer-based models in the recent past in the context of nucleotide sequences.
We believe this review will help the scientific community in understanding the various applications of Transformer-based language models to nucleotide sequences.
arXiv Detail & Related papers (2024-12-10T05:33:09Z)
- Is Modularity Transferable? A Case Study through the Lens of Knowledge Distillation [59.37775534633868]
We present an extremely straightforward approach to transferring pre-trained, task-specific PEFT modules between same-family PLMs.
We also propose a method that allows the transfer of modules between incompatible PLMs without any change in the inference complexity.
arXiv Detail & Related papers (2024-03-27T17:50:00Z)
- Advancing bioinformatics with large language models: components, applications and perspectives [12.728981464533918]
Large language models (LLMs) are a class of artificial intelligence models based on deep learning.
We will provide a comprehensive overview of the essential components of large language models (LLMs) in bioinformatics.
Key aspects covered include tokenization methods for diverse data types, the architecture of transformer models, and the core attention mechanism.
arXiv Detail & Related papers (2024-01-08T17:26:59Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- Introduction to Transformers: an NLP Perspective [59.0241868728732]
We introduce basic concepts of Transformers and present key techniques that form the recent advances of these models.
This includes a description of the standard Transformer architecture, a series of model refinements, and common applications.
arXiv Detail & Related papers (2023-11-29T13:51:04Z)
- UMLS-KGI-BERT: Data-Centric Knowledge Integration in Transformers for Biomedical Entity Recognition [4.865221751784403]
This work contributes a data-centric paradigm for enriching the language representations of biomedical transformer-encoder LMs by extracting text sequences from the UMLS.
Preliminary results from experiments in the extension of pre-trained LMs as well as training from scratch show that this framework improves downstream performance on multiple biomedical and clinical Named Entity Recognition (NER) tasks.
arXiv Detail & Related papers (2023-07-20T18:08:34Z)
- PASTA: Pretrained Action-State Transformer Agents [10.654719072766495]
Self-supervised learning has brought about a revolutionary paradigm shift in various computing domains.
Recent approaches involve pre-training transformer models on vast amounts of unlabeled data.
In reinforcement learning, researchers have recently adapted these approaches, developing models pre-trained on expert trajectories.
arXiv Detail & Related papers (2023-07-20T15:09:06Z)
- A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks [60.38369406877899]
The Transformer is a deep neural network that employs a self-attention mechanism to comprehend the contextual relationships within sequential data (a minimal sketch of this attention operation appears after this list).
Transformer models excel at handling long-range dependencies between input sequence elements and enable parallel processing.
Our survey encompasses the identification of the top five application domains for transformer-based models.
arXiv Detail & Related papers (2023-06-11T23:13:51Z)
- Do Transformers Encode a Foundational Ontology? Probing Abstract Classes in Natural Language [2.363388546004777]
We present a systematic Foundational Ontology probing methodology to investigate whether Transformer-based models encode abstract semantic information.
We show that Transformer-based models incidentally encode information related to Foundational Ontologies during the pre-training process.
arXiv Detail & Related papers (2022-01-25T12:11:46Z)
- Transformers for prompt-level EMA non-response prediction [62.41658786277712]
Ecological Momentary Assessments (EMAs) are an important psychological data source for measuring cognitive states, affect, behavior, and environmental factors.
Non-response, in which participants fail to respond to EMA prompts, is an endemic problem.
The ability to accurately predict non-response could be utilized to improve EMA delivery and develop compliance interventions.
arXiv Detail & Related papers (2021-11-01T18:38:47Z)
- Pre-trained Language Models in Biomedical Domain: A Systematic Survey [33.572502204216256]
Pre-trained language models (PLMs) have been the de facto paradigm for most natural language processing (NLP) tasks.
This paper summarizes the recent progress of pre-trained language models in the biomedical domain and their applications in biomedical downstream tasks.
arXiv Detail & Related papers (2021-10-11T05:30:30Z)
- AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing [0.0]
Transformer-based pretrained language models (T-PTLMs) have achieved great success in almost every NLP task.
Transformer-based PTLMs learn universal language representations from large volumes of text data using self-supervised learning.
These models provide good background knowledge for downstream tasks, which avoids training downstream models from scratch.
arXiv Detail & Related papers (2021-08-12T05:32:18Z)
- Parameter Efficient Multimodal Transformers for Video Representation Learning [108.8517364784009]
This work focuses on reducing the parameters of multimodal Transformers in the context of audio-visual video representation learning.
We show that our approach reduces the number of parameters by up to 80%, allowing us to train our model end-to-end from scratch.
To demonstrate our approach, we pretrain our model on 30-second clips from Kinetics-700 and transfer it to audio-visual classification tasks.
arXiv Detail & Related papers (2020-12-08T00:16:13Z)
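The self-attention operation referenced in several of the papers above, and in the survey's discussion of transformer encoder layers, can be summarized in a few lines. The sketch below is a generic, illustrative single-head implementation in NumPy, not the implementation of any particular paper; the shapes and random inputs are assumptions for demonstration.

```python
# A minimal NumPy sketch of scaled dot-product self-attention, the core
# operation inside transformer encoder layers. Shapes and values are
# illustrative assumptions, not taken from any specific model.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of token embeddings X."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # project tokens to queries/keys/values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise similarities, scaled by sqrt(d_k)
    weights = softmax(scores, axis=-1)        # each token attends to every token
    return weights @ V                        # context-aware token representations

# Toy example: 4 tokens with 8-dimensional embeddings and an 8-dimensional head.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)  # (4, 8)
```

Stacking this operation across multiple heads, together with feed-forward sublayers and residual connections, yields the transformer encoder layers that the surveyed BPLMs pretrain at scale.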