DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for
Natural Language Understanding
- URL: http://arxiv.org/abs/2112.01047v1
- Date: Thu, 2 Dec 2021 08:19:42 GMT
- Title: DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for
Natural Language Understanding
- Authors: Taolin Zhang, Chengyu Wang, Nan Hu, Minghui Qiu, Chengguang Tang,
Xiaofeng He, Jun Huang
- Abstract summary: Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models with relation triples injected from knowledge graphs to improve language understanding abilities.
Previous studies integrate models with knowledge encoders for representing knowledge retrieved from knowledge graphs.
We propose a novel KEPLM named DKPLM that Decomposes the Knowledge injection process of Pre-trained Language Models in the pre-training, fine-tuning and inference stages.
- Score: 19.478288026844893
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained
models with relation triples injected from knowledge graphs to improve
language understanding abilities. To guarantee effective knowledge injection,
previous studies integrate models with knowledge encoders for representing
knowledge retrieved from knowledge graphs. The operations for knowledge
retrieval and encoding bring significant computational burdens, restricting the
usage of such models in real-world applications that require high inference
speed. In this paper, we propose a novel KEPLM named DKPLM that Decomposes
the Knowledge injection process of Pre-trained Language Models in the
pre-training, fine-tuning and inference stages, which facilitates the application of KEPLMs
in real-world scenarios. Specifically, we first detect knowledge-aware
long-tail entities as the target for knowledge injection, enhancing the KEPLMs'
semantic understanding abilities and avoiding injecting redundant information.
The embeddings of long-tail entities are replaced by "pseudo token
representations" formed by relevant knowledge triples. We further design the
relational knowledge decoding task for pre-training to force the models to
truly understand the injected knowledge by relation triple reconstruction.
Experiments show that our model outperforms other KEPLMs significantly over
zero-shot knowledge probing tasks and multiple knowledge-aware language
understanding tasks. We further show that DKPLM has a higher inference speed
than other competing models due to the decomposing mechanism.
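To make the decomposed injection idea more concrete, below is a minimal PyTorch sketch of the two steps the abstract describes: flagging long-tail entities as injection targets and replacing their embeddings with "pseudo token representations" pooled from retrieved relation triples. The frequency table, threshold, toy knowledge graph, and the `PseudoTokenInjector` class are illustrative assumptions, not the paper's actual detection heuristic or encoder.

```python
import torch
import torch.nn as nn

# Hypothetical corpus frequencies and a toy knowledge graph, used only for
# illustration; DKPLM's real long-tail detection and triple retrieval differ.
ENTITY_FREQ = {"Paris": 120000, "Lutetia": 35}
KG_TRIPLES = {"Lutetia": [("Lutetia", "ancient_name_of", "Paris")]}
LONG_TAIL_THRESHOLD = 100  # assumed cutoff, not the paper's scoring rule


def is_long_tail(entity: str) -> bool:
    """Treat rarely seen entities as the targets for knowledge injection."""
    return ENTITY_FREQ.get(entity, 0) < LONG_TAIL_THRESHOLD


class PseudoTokenInjector(nn.Module):
    """Replace a long-tail entity's embedding with a 'pseudo token
    representation' pooled from its retrieved relation triples."""

    def __init__(self, embed: nn.Embedding, vocab: dict):
        super().__init__()
        self.embed = embed
        self.vocab = vocab

    def encode_text(self, text: str) -> torch.Tensor:
        # Mean-pooled embedding of a whitespace-tokenized phrase.
        ids = torch.tensor([self.vocab.get(tok, 0) for tok in text.split()])
        return self.embed(ids).mean(dim=0)

    def pseudo_token(self, entity: str) -> torch.Tensor:
        # Encode the (relation, tail) part of each triple and average them.
        reps = [self.encode_text(f"{r} {t}") for _, r, t in KG_TRIPLES[entity]]
        return torch.stack(reps).mean(dim=0)

    def forward(self, tokens: list) -> torch.Tensor:
        rows = []
        for tok in tokens:
            if is_long_tail(tok) and tok in KG_TRIPLES:
                rows.append(self.pseudo_token(tok))  # knowledge is injected here
            else:
                rows.append(self.encode_text(tok))   # frequent tokens stay as-is
        return torch.stack(rows)


if __name__ == "__main__":
    vocab = {w: i for i, w in enumerate(
        ["<unk>", "Lutetia", "was", "renamed", "Paris", "ancient_name_of"])}
    injector = PseudoTokenInjector(nn.Embedding(len(vocab), 16), vocab)
    print(injector(["Lutetia", "was", "renamed", "Paris"]).shape)  # torch.Size([4, 16])
```

In the full model the pseudo token representations come from the PLM's shared encoder rather than simple mean pooling, and the relational knowledge decoding task during pre-training pushes the model to reconstruct the injected triples; the sketch above only stands in for that machinery.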
Related papers
- KEHRL: Learning Knowledge-Enhanced Language Representations with Hierarchical Reinforcement Learning [32.086825891769585]
Knowledge-enhanced pre-trained language models (KEPLMs) leverage relation triples from knowledge graphs (KGs) to improve language understanding abilities.
Previous works treat knowledge enhancement as two independent operations, i.e., knowledge injection and knowledge integration.
This paper jointly addresses the problems of detecting positions for knowledge injection and integrating external knowledge into the model in order to avoid injecting inaccurate or irrelevant knowledge.
arXiv Detail & Related papers (2024-06-24T07:32:35Z) - Large Language Models are Limited in Out-of-Context Knowledge Reasoning [65.72847298578071]
Large Language Models (LLMs) possess extensive knowledge and strong capabilities in performing in-context reasoning.
This paper focuses on a significant aspect of out-of-context reasoning: Out-of-Context Knowledge Reasoning (OCKR), which combines multiple pieces of knowledge to infer new knowledge.
arXiv Detail & Related papers (2024-06-11T15:58:59Z) - TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models [31.209774088374374]
This paper introduces TRELM, a Robust and Efficient Pre-training framework for Knowledge-Enhanced Language Models.
We employ a robust approach to inject knowledge triples and a knowledge-augmented memory bank to capture valuable information.
We show that TRELM reduces pre-training time by at least 50% and outperforms other KEPLMs in knowledge probing tasks and multiple knowledge-aware language understanding tasks.
arXiv Detail & Related papers (2024-03-17T13:04:35Z) - Commonsense Knowledge Transfer for Pre-trained Language Models [83.01121484432801]
We introduce commonsense knowledge transfer, a framework to transfer the commonsense knowledge stored in a neural commonsense knowledge model to a general-purpose pre-trained language model.
It first exploits general texts to form queries for extracting commonsense knowledge from the neural commonsense knowledge model.
It then refines the language model with two self-supervised objectives: commonsense mask infilling and commonsense relation prediction.
arXiv Detail & Related papers (2023-06-04T15:44:51Z) - Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z) - UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language
Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z) - LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows decoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z) - A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models [185.08295787309544]
We aim to summarize the current progress of pre-trained language model-based knowledge-enhanced models (PLMKEs).
We present the challenges of PLMKEs based on the discussion regarding the three elements and attempt to provide NLP practitioners with potential directions for further research.
arXiv Detail & Related papers (2022-02-17T17:17:43Z) - Generated Knowledge Prompting for Commonsense Reasoning [53.88983683513114]
We propose generating knowledge statements directly from a language model with a generic prompt format.
This approach improves performance of both off-the-shelf and finetuned language models on four commonsense reasoning tasks.
Notably, we find that a model's predictions can improve when using its own generated knowledge.
arXiv Detail & Related papers (2021-10-15T21:58:03Z)