Injecting Knowledge into Biomedical Pre-trained Models via Polymorphism
and Synonymous Substitution
- URL: http://arxiv.org/abs/2305.15010v1
- Date: Wed, 24 May 2023 10:48:53 GMT
- Title: Injecting Knowledge into Biomedical Pre-trained Models via Polymorphism
and Synonymous Substitution
- Authors: Hongbo Zhang and Xiang Wan and Benyou Wang
- Abstract summary: Pre-trained language models (PLMs) have been considered capable of storing relational knowledge present in the training data.
Low-frequency relational knowledge might be underexpressed compared to high-frequency knowledge in PLMs.
We propose a simple-yet-effective approach to inject relational knowledge into PLMs.
- Score: 22.471123408160658
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained language models (PLMs) have been considered capable of
storing relational knowledge present in the training data. However, some
relational knowledge appears to be discarded in PLMs due to report bias:
low-frequency relational knowledge might be underexpressed compared to
high-frequency knowledge. This suggests that externally injected relational
knowledge may not be redundant with the knowledge already stored in PLMs, but
rather complementary to it. We therefore propose a simple yet effective
approach to inject relational knowledge into PLMs, inspired by three
observations (namely, polymorphism, synonymous substitution, and association).
In particular, we switch entities in the training corpus to related entities
(either hypernyms/hyponyms/synonyms, or arbitrarily-related concepts).
Experimental results show that the proposed approach not only better captures
relational knowledge, but also improves performance on various biomedical
downstream tasks. Our model is available at
https://github.com/StevenZHB/BioPLM_InjectingKnowledge.
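The substitution step described in the abstract (switching entity mentions in the training corpus to hypernyms, hyponyms, synonyms, or associated concepts) can be sketched as a simple augmentation pass. The relation map, entities, and substitution probability below are illustrative assumptions, not the authors' implementation; see the linked repository for the actual code.

```python
import random

# Illustrative relation map only: in the paper, related entities come from a
# biomedical knowledge source (hypernyms/hyponyms for polymorphism, synonyms
# for synonymous substitution, associated concepts for association).
RELATED_ENTITIES = {
    "aspirin": ["acetylsalicylic acid", "NSAID"],
    "myocardial infarction": ["heart attack", "ischemic heart disease"],
    "hypertension": ["high blood pressure"],
}

def substitute_entities(text: str, p: float = 0.3, seed: int = 0) -> str:
    """Randomly swap known entity mentions in a sentence for related concepts."""
    rng = random.Random(seed)
    for entity, alternatives in RELATED_ENTITIES.items():
        if entity in text and rng.random() < p:
            text = text.replace(entity, rng.choice(alternatives))
    return text

# Force substitution (p=1.0) to show the effect on a toy sentence.
print(substitute_entities("aspirin lowers the risk of myocardial infarction", p=1.0))
```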
Related papers
- Robust and Scalable Model Editing for Large Language Models [75.95623066605259]
We propose EREN (Edit models by REading Notes) to improve the scalability and robustness of LLM editing.
Unlike existing techniques, it can integrate knowledge from multiple edits, and correctly respond to syntactically similar but semantically unrelated inputs.
arXiv Detail & Related papers (2024-03-26T06:57:23Z)
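For the EREN entry above, a rough sketch of the reading-notes idea, under the assumption that edits are stored as retrievable notes; the overlap scorer, threshold, and toy note are placeholders, not the paper's method.

```python
# Toy illustration only: edits are stored as plain-text notes, a relevance
# score decides whether any note applies, and unrelated queries fall back to
# the unedited model. Lexical overlap stands in for a learned retriever/reader.

def overlap(a: str, b: str) -> float:
    """Crude lexical relevance score between two strings."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)

def answer(query: str, notes: list, base_model, threshold: float = 0.3) -> str:
    best = max(notes, key=lambda n: overlap(query, n), default=None)
    if best is not None and overlap(query, best) >= threshold:
        # Condition the answer on the retrieved note (here: simply echo it).
        return f"[edited] {best}"
    return base_model(query)  # semantically unrelated query: behaviour unchanged

notes = ["The CEO of Acme Corp is Jane Doe as of 2024."]
print(answer("Who is the CEO of Acme Corp?", notes, base_model=lambda q: "unknown"))
print(answer("What is the capital of France?", notes, base_model=lambda q: "Paris"))
```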
- Improving Language Models Meaning Understanding and Consistency by Learning Conceptual Roles from Dictionary [65.268245109828]
The non-human-like behaviour of contemporary pre-trained language models (PLMs) is a major factor undermining their trustworthiness.
A striking phenomenon is the generation of inconsistent predictions that contradict one another.
We propose a practical approach that alleviates this inconsistency by improving the PLM's awareness of conceptual roles learned from a dictionary.
arXiv Detail & Related papers (2023-10-24T06:15:15Z)
- Biomedical Entity Linking with Triple-aware Pre-Training [7.536753993136013]
We propose a framework to pre-train a powerful large language model (LLM) on a corpus synthesized from a knowledge graph (KG).
In our evaluations, we were unable to confirm the benefit of including synonym, description, or relational information.
arXiv Detail & Related papers (2023-08-28T09:06:28Z)
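For the Triple-aware Pre-Training entry above, a minimal sketch of synthesizing pre-training text from KG triples; the templates, triples, and synonym handling are assumptions for illustration, not the paper's pipeline.

```python
# Toy templates and triples for illustration; not the paper's actual pipeline.
TEMPLATES = {
    "treats": "{head} is used to treat {tail}.",
    "is_a": "{head} is a type of {tail}.",
}

def verbalize(triples, synonyms=None):
    """Turn (head, relation, tail) triples into plain-text training sentences."""
    synonyms = synonyms or {}
    for head, rel, tail in triples:
        sentence = TEMPLATES[rel].format(head=head, tail=tail)
        # Optional synonym information -- the kind of addition whose benefit
        # the paper's evaluation could not confirm.
        if head in synonyms:
            sentence += f" {head} is also known as {synonyms[head]}."
        yield sentence

triples = [("metformin", "treats", "type 2 diabetes"),
           ("metformin", "is_a", "biguanide")]
print(list(verbalize(triples, synonyms={"metformin": "dimethylbiguanide"})))
```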
- Decouple knowledge from parameters for plug-and-play language modeling [77.5601135412186]
We introduce PlugLM, a pre-training model with a differentiable plug-in memory (DPM).
The key intuition is to decouple the knowledge storage from model parameters with an editable and scalable key-value memory.
PlugLM obtains an average improvement of 3.95 F1 across four domains without any in-domain pre-training.
arXiv Detail & Related papers (2023-05-19T10:01:55Z)
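For the PlugLM entry above, a minimal PyTorch sketch of a key-value memory kept outside the backbone parameters, in the spirit of a differentiable plug-in memory; the dimensions, initialization, and lookup are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn.functional as F

class PluginMemory(torch.nn.Module):
    """Key-value memory held outside the transformer weights (illustrative)."""

    def __init__(self, num_slots: int = 1024, dim: int = 768):
        super().__init__()
        # Keys/values are separate from the backbone parameters, so the
        # knowledge store can be edited or swapped without retraining.
        self.keys = torch.nn.Parameter(torch.randn(num_slots, dim))
        self.values = torch.nn.Parameter(torch.randn(num_slots, dim))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, dim) -- attend over memory slots and mix the
        # retrieved values back into the hidden states.
        attn = F.softmax(hidden @ self.keys.T, dim=-1)  # (batch, seq, slots)
        return hidden + attn @ self.values              # (batch, seq, dim)

memory = PluginMemory(num_slots=16, dim=32)
print(memory(torch.randn(2, 5, 32)).shape)  # torch.Size([2, 5, 32])
```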
- Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from an external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z)
- Causality-aware Concept Extraction based on Knowledge-guided Prompting [17.4086571624748]
Concepts benefit natural language understanding but are far from complete in existing knowledge graphs (KGs).
Recently, pre-trained language models (PLMs) have been widely used in text-based concept extraction.
We propose equipping the PLM-based extractor with a knowledge-guided prompt as an intervention to alleviate concept bias.
arXiv Detail & Related papers (2023-05-03T03:36:20Z)
- Context Variance Evaluation of Pretrained Language Models for Prompt-based Biomedical Knowledge Probing [9.138354194112395]
We show that prompt-based probing methods can only probe a lower bound of knowledge.
We introduce context variance into the prompt generation and propose a new rank-change-based evaluation metric.
arXiv Detail & Related papers (2022-11-18T14:44:09Z)
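For the context-variance probing entry above, a sketch of how rank changes of the gold answer could be tracked across context-varied prompts; the helper names and toy scorer are assumptions, not the paper's metric definition.

```python
# Placeholder contexts, candidates, and scorer; a real setup would score the
# candidates with the probed PLM (e.g. masked-token log-probabilities).

def gold_rank(scores: dict, gold: str) -> int:
    """1-based rank of the gold answer among the scored candidates."""
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking.index(gold) + 1

def ranks_under_context_variance(prompt, contexts, candidates, gold, score_fn):
    """Gold-answer rank for each context-varied version of the prompt."""
    ranks = []
    for ctx in contexts:
        varied = f"{ctx} {prompt}".strip()
        scores = {c: score_fn(varied, c) for c in candidates}
        ranks.append(gold_rank(scores, gold))
    return ranks

# Toy scorer that is sensitive to the prepended context (word overlap).
toy_score = lambda prompt, cand: len(set(prompt.lower().split()) & set(cand.lower().split()))
print(ranks_under_context_variance(
    prompt="Metformin is used to treat [MASK].",
    contexts=["", "In endocrinology,", "Patients with diabetes often receive metformin."],
    candidates=["diabetes", "hypertension"],
    gold="diabetes",
    score_fn=toy_score,
))
```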
- Pre-training Language Models with Deterministic Factual Knowledge [42.812774794720895]
We propose to let PLMs learn the deterministic relationship between the remaining context and the masked content.
Two pre-training tasks are introduced to motivate PLMs to rely on the deterministic relationship when filling masks.
Experiments indicate that the continuously pre-trained PLMs achieve better robustness in capturing factual knowledge.
arXiv Detail & Related papers (2022-10-20T11:04:09Z)
- Meta Knowledge Condensation for Federated Learning [65.20774786251683]
Existing federated learning paradigms usually exchange distributed models extensively at a central solver to achieve a more powerful model.
This incurs a severe communication burden between the server and clients, especially when data distributions are heterogeneous.
Unlike existing paradigms, we introduce an alternative perspective that significantly decreases the communication cost in federated learning.
arXiv Detail & Related papers (2022-09-29T15:07:37Z)
- A Simple but Effective Pluggable Entity Lookup Table for Pre-trained Language Models [93.39977756450354]
We propose to build a simple but effective Pluggable Entity Lookup Table (PELT) on demand.
PELT can be compatibly plugged into pre-trained language models as inputs to infuse supplemental entity knowledge.
Experiments on knowledge-related tasks demonstrate that our method, PELT, can flexibly and effectively transfer entity knowledge from related corpora into PLMs.
arXiv Detail & Related papers (2022-02-27T16:30:22Z)
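For the PELT entry above, a rough PyTorch sketch of building an entity embedding from mention representations and plugging it into the input; the mean-pooling and pseudo-token placement are assumptions, not PELT's exact construction.

```python
import torch

def build_entity_embedding(mention_states):
    """Aggregate a PLM's hidden states for an entity's mentions (mean-pool)."""
    return torch.stack(mention_states).mean(dim=0)

def plug_entity(token_embeds, entity_embed):
    """Append the entity embedding as an extra pseudo-token to the input."""
    return torch.cat([token_embeds, entity_embed.unsqueeze(0)], dim=0)

dim = 32
mentions = [torch.randn(dim) for _ in range(5)]    # hidden states of 5 mentions
entity = build_entity_embedding(mentions)          # (dim,)
inputs = plug_entity(torch.randn(7, dim), entity)  # 7 token embeddings + 1 entity slot
print(inputs.shape)                                # torch.Size([8, 32])
```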