Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into
BERT
- URL: http://arxiv.org/abs/2109.04810v1
- Date: Fri, 10 Sep 2021 11:54:25 GMT
- Title: Mixture-of-Partitions: Infusing Large Biomedical Knowledge Graphs into
BERT
- Authors: Zaiqiao Meng, Fangyu Liu, Thomas Hikaru Clark, Ehsan Shareghi, Nigel
Collier
- Abstract summary: Mixture-of-Partitions (MoP) can handle a very large knowledge graph (KG) by partitioning it into smaller sub-graphs and infusing their specific knowledge into various BERT models using lightweight adapters.
We evaluate our MoP with three biomedical BERTs (SciBERT, BioBERT, PubmedBERT) on six downstream tasks (inc. NLI, QA, Classification), and the results show that our MoP consistently enhances the underlying BERTs in task performance.
- Score: 17.739843061394367
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Infusing factual knowledge into pre-trained models is fundamental for many
knowledge-intensive tasks. In this paper, we propose Mixture-of-Partitions
(MoP), an infusion approach that can handle a very large knowledge graph (KG)
by partitioning it into smaller sub-graphs and infusing their specific
knowledge into various BERT models using lightweight adapters. To leverage the
overall factual knowledge for a target task, these sub-graph adapters are
further fine-tuned along with the underlying BERT through a mixture layer. We
evaluate our MoP with three biomedical BERTs (SciBERT, BioBERT, PubmedBERT) on
six downstream tasks (inc. NLI, QA, Classification), and the results show that
our MoP consistently enhances the underlying BERTs in task performance, and
achieves new SOTA performances on five evaluated datasets.
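The abstract describes a concrete architecture: one lightweight bottleneck adapter per KG partition, combined through a mixture (gating) layer on top of a BERT encoder. Below is a minimal PyTorch sketch of that idea; the hidden size, bottleneck width, mean-pooled gating, and class names are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a mixture-of-adapters layer, assuming a BERT-style hidden
# size of 768 and a softmax gate over partition-specific bottleneck adapters.
import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""

    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(self.act(self.down(h)))


class MixtureOfAdapters(nn.Module):
    """Combine K partition-specific adapters with a learned softmax gate."""

    def __init__(self, num_partitions: int, hidden_size: int = 768):
        super().__init__()
        self.adapters = nn.ModuleList(
            [Adapter(hidden_size) for _ in range(num_partitions)]
        )
        self.gate = nn.Linear(hidden_size, num_partitions)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, hidden_size) hidden states from a BERT encoder.
        weights = torch.softmax(self.gate(h.mean(dim=1)), dim=-1)        # (batch, K)
        expert_out = torch.stack([a(h) for a in self.adapters], dim=1)   # (batch, K, seq, hid)
        return (weights[:, :, None, None] * expert_out).sum(dim=1)      # (batch, seq, hid)


# Usage: mix the outputs of, e.g., 20 sub-graph adapters over dummy BERT hidden states.
hidden = torch.randn(2, 16, 768)
mop_layer = MixtureOfAdapters(num_partitions=20)
print(mop_layer(hidden).shape)   # torch.Size([2, 16, 768])
```

Per the abstract, the sub-graph adapters first absorb the knowledge of their respective KG partitions and are then fine-tuned together with the underlying BERT through the mixture layer on the target task.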
Related papers
- Graph Relation Distillation for Efficient Biomedical Instance
Segmentation [80.51124447333493]
We propose a graph relation distillation approach for efficient biomedical instance segmentation.
We introduce two graph distillation schemes deployed at both the intra-image level and the inter-image level.
Experimental results on a number of biomedical datasets validate the effectiveness of our approach.
arXiv Detail & Related papers (2024-01-12T04:41:23Z)
- Diversifying Knowledge Enhancement of Biomedical Language Models using
Adapter Modules and Knowledge Graphs [54.223394825528665]
We develop an approach that uses lightweight adapter modules to inject structured biomedical knowledge into pre-trained language models.
We use two large KGs, the biomedical knowledge system UMLS and the novel biochemical OntoChem, with two prominent biomedical PLMs, PubMedBERT and BioLinkBERT.
We show that our methodology leads to performance improvements in several instances while keeping requirements in computing power low.
arXiv Detail & Related papers (2023-12-21T14:26:57Z)
- Adapted Multimodal BERT with Layer-wise Fusion for Sentiment Analysis [84.12658971655253]
We propose Adapted Multimodal BERT, a BERT-based architecture for multimodal tasks.
The adapter adjusts the pretrained language model for the task at hand, while the fusion layers perform task-specific, layer-wise fusion of audio-visual information with textual BERT representations.
In our ablations we see that this approach leads to efficient models that can outperform their fine-tuned counterparts and are robust to input noise.
arXiv Detail & Related papers (2022-12-01T17:31:42Z)
- On the Effectiveness of Compact Biomedical Transformers [12.432191400869002]
Language models pre-trained on biomedical corpora have recently shown promising results on downstream biomedical tasks.
Many existing pre-trained models are resource-intensive and computationally heavy owing to factors such as embedding size, hidden dimension, and number of layers.
We introduce six lightweight models, namely, BioDistilBERT, BioTinyBERT, BioMobileBERT, DistilBioBERT, TinyBioBERT, and CompactBioBERT.
We evaluate all of our models on three biomedical tasks and compare them with BioBERT-v1.1, aiming for efficient lightweight models that perform on par with their larger counterparts.
arXiv Detail & Related papers (2022-09-07T14:24:04Z)
- KnowDA: All-in-One Knowledge Mixture Model for Data Augmentation in
Few-Shot NLP [68.43279384561352]
Existing data augmentation algorithms leverage task-independent rules or fine-tune general-purpose pre-trained language models.
These methods carry little task-specific knowledge and tend to yield low-quality synthetic data that only helps weak baselines on simple tasks.
We propose the Knowledge Mixture Data Augmentation Model (KnowDA): an encoder-decoder LM pretrained on a mixture of diverse NLP tasks.
arXiv Detail & Related papers (2022-06-21T11:34:02Z)
- Which Student is Best? A Comprehensive Knowledge Distillation Exam for
Task-Specific BERT Models [3.303435360096988]
We perform a knowledge distillation benchmark from task-specific BERT-base teacher models to various student models.
Our experiments involve 12 datasets grouped into two tasks: text classification and sequence labeling in the Indonesian language.
Our experiments show that, despite the rising popularity of Transformer-based models, BiLSTM and CNN student models provide the best trade-off between performance and computational resources.
arXiv Detail & Related papers (2022-01-03T10:07:13Z)
- Scientific Language Models for Biomedical Knowledge Base Completion: An
Empirical Study [62.376800537374024]
We study scientific LMs for KG completion, exploring whether we can tap into their latent knowledge to enhance biomedical link prediction.
We integrate the LM-based models with KG embedding models, using a router method that learns to assign each input example to either type of model and provides a substantial boost in performance.
arXiv Detail & Related papers (2021-06-17T17:55:33Z)
- A Hybrid Approach to Measure Semantic Relatedness in Biomedical Concepts [0.0]
We generated concept vectors by encoding concept preferred terms using ELMo, BERT, and Sentence BERT models.
We trained all the BERT models using a Siamese network on the SNLI and STSb datasets to allow the models to learn more semantic information.
Injecting ontology knowledge into concept vectors further enhances their quality and contributes to better relatedness scores.
arXiv Detail & Related papers (2021-01-25T16:01:27Z)
- Investigation of BERT Model on Biomedical Relation Extraction Based on
Revised Fine-tuning Mechanism [2.8881198461098894]
We investigate a fine-tuning mechanism that utilizes all layers of the BERT model.
In addition, further analysis indicates that the key knowledge about the relations can be learned from the last layer of the BERT model.
arXiv Detail & Related papers (2020-11-01T01:47:16Z)
- Students Need More Attention: BERT-based Attention Model for Small Data
with Application to Automatic Patient Message Triage [65.7062363323781]
We propose a novel framework based on BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining).
We (i) introduce Label Embeddings for Self-Attention in each layer of BERT, which we call LESA-BERT, and (ii) distill LESA-BERT into smaller variants, aiming to reduce overfitting and model size when working on small datasets.
As an application, our framework is utilized to build a model for patient portal message triage that classifies the urgency of a message into three categories: non-urgent, medium and urgent.
arXiv Detail & Related papers (2020-06-22T03:39:00Z)
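As a point of reference for the triage application described in the entry above, here is a hedged sketch of a plain BioBERT sequence classifier over the three urgency labels. It is not LESA-BERT itself (no label embeddings in self-attention, no distillation); the Hugging Face checkpoint name and the inference snippet are assumptions for illustration only.

```python
# Illustrative baseline: a BioBERT encoder with an (untrained) 3-way
# classification head for message urgency. Assumes the Hugging Face
# `transformers` library and the dmis-lab/biobert-v1.1 checkpoint.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

LABELS = ["non-urgent", "medium", "urgent"]  # the three urgency classes named above

tokenizer = AutoTokenizer.from_pretrained("dmis-lab/biobert-v1.1")
model = AutoModelForSequenceClassification.from_pretrained(
    "dmis-lab/biobert-v1.1", num_labels=len(LABELS)
)

message = "I have had chest pain since this morning and feel short of breath."
inputs = tokenizer(message, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 3); head is randomly initialized until fine-tuned
print(LABELS[int(logits.argmax(dim=-1))])
```

The paper's framework would additionally inject label embeddings into each self-attention layer and distill the resulting LESA-BERT into smaller students, which this baseline omits.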