Knowledgeable Salient Span Mask for Enhancing Language Models as
Knowledge Base
- URL: http://arxiv.org/abs/2204.07994v2
- Date: Wed, 11 Oct 2023 05:12:10 GMT
- Title: Knowledgeable Salient Span Mask for Enhancing Language Models as
Knowledge Base
- Authors: Cunxiang Wang, Fuli Luo, Yanyang Li, Runxin Xu, Fei Huang and Yue
Zhang
- Abstract summary: We develop two solutions to help the model learn more knowledge from unstructured text in a fully self-supervised manner.
To the best of our knowledge, we are the first to explore fully self-supervised learning of knowledge in continual pre-training.
- Score: 51.55027623439027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained language models (PLMs) like BERT have made significant progress
in various downstream NLP tasks. However, by asking models to do cloze-style
tests, recent work finds that PLMs fall short in acquiring knowledge from
unstructured text. To understand the internal behaviour of PLMs in retrieving
knowledge, we first define knowledge-baring (K-B) tokens and knowledge-free
(K-F) tokens for unstructured text and ask professional annotators to label
some samples manually. Then, we find that PLMs are more likely to give wrong
predictions on K-B tokens and pay less attention to those tokens inside the
self-attention module. Based on these observations, we develop two solutions to
help the model learn more knowledge from unstructured text in a fully
self-supervised manner. Experiments on knowledge-intensive tasks show the
effectiveness of the proposed methods. To the best of our knowledge, we are the first
to explore fully self-supervised learning of knowledge in continual
pre-training.
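
The abstract does not spell out the masking recipe, but the core idea (masking knowledge-bearing tokens more aggressively than knowledge-free tokens during continual, fully self-supervised MLM pre-training) can be illustrated with a short sketch. The function name, the masking rates, and the assumption that K-B positions arrive as a boolean mask are illustrative choices, not the paper's exact method.

import torch

def build_masked_inputs(input_ids, kb_mask, special_mask, mask_token_id,
                        p_kb=0.30, p_kf=0.10):
    """Illustrative sketch: bias MLM masking toward knowledge-bearing (K-B) tokens.
    p_kb and p_kf are assumed masking rates, not values from the paper."""
    probs = torch.full(input_ids.shape, p_kf)        # base rate for knowledge-free tokens
    probs[kb_mask] = p_kb                            # higher rate for K-B tokens
    probs = probs.masked_fill(special_mask, 0.0)     # never mask [CLS]/[SEP]/[PAD]
    selected = torch.bernoulli(probs).bool()         # positions the model must predict
    labels = input_ids.clone()
    labels[~selected] = -100                         # ignored by the MLM loss
    masked_ids = input_ids.clone()
    masked_ids[selected] = mask_token_id             # replace selected tokens with [MASK]
    return masked_ids, labels

# Toy usage: one 6-token sequence whose middle two tokens are marked as K-B.
ids = torch.tensor([[101, 7592, 2003, 1037, 2414, 102]])
kb  = torch.tensor([[False, False, True, True, False, False]])
spc = torch.tensor([[True, False, False, False, False, True]])
masked_ids, labels = build_masked_inputs(ids, kb, spc, mask_token_id=103)

In practice the same biased sampling could be applied at the span level rather than token by token, which is closer in spirit to a salient span mask.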
Related papers
- Knowledge Graph-Enhanced Large Language Models via Path Selection [58.228392005755026]
Large Language Models (LLMs) have shown unprecedented performance in various real-world applications.
However, LLMs are known to generate factually inaccurate outputs, i.e., the hallucination problem.
We propose KELP, a principled framework with three stages, to handle this problem.
arXiv Detail & Related papers (2024-06-19T21:45:20Z)
- TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models [31.209774088374374]
This paper introduces TRELM, a Robust and Efficient Pre-training framework for Knowledge-Enhanced Language Models.
We employ a robust approach to inject knowledge triples and a knowledge-augmented memory bank to capture valuable information.
We show that TRELM reduces pre-training time by at least 50% and outperforms other KEPLMs in knowledge probing tasks and multiple knowledge-aware language understanding tasks.
arXiv Detail & Related papers (2024-03-17T13:04:35Z)
- Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from an external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z)
- UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z)
- A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG) and rule knowledge.
The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z)
- Knowledge Prompting in Pre-trained Language Model for Natural Language Understanding [24.315130086787374]
We propose KP-PLM, a knowledge-prompting-based PLM framework.
This framework can be flexibly combined with existing mainstream PLMs.
To further leverage the factual knowledge from these prompts, we propose two novel knowledge-aware self-supervised tasks.
arXiv Detail & Related papers (2022-10-16T13:36:57Z)
- DictBERT: Dictionary Description Knowledge Enhanced Language Model Pre-training via Contrastive Learning [18.838291575019504]
Pre-trained language models (PLMs) are shown to be lacking in knowledge when dealing with knowledge-driven tasks.
We propose DictBERT, a novel approach that enhances PLMs with dictionary knowledge.
We evaluate our approach on a variety of knowledge-driven and language understanding tasks, including NER, relation extraction, CommonsenseQA, OpenBookQA and GLUE.
arXiv Detail & Related papers (2022-08-01T06:43:19Z)
- TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge [83.55215993730326]
We propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages respectively; a generic sketch of this retrieve-then-inject pattern follows this list.
arXiv Detail & Related papers (2022-03-16T10:37:59Z)
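
Several of the papers above (TegTok most explicitly, and KELP in spirit) follow a retrieve-then-inject pattern: score external knowledge entries against the input, keep the top-ranked entries, and feed them to the model alongside the original text. The sketch below is a generic, hedged illustration of that pattern; the hashed bag-of-words encoder, function names, and prompt format are assumptions for illustration, not any of these papers' actual implementations.

import torch
import torch.nn.functional as F

def toy_encode(texts, dim=256):
    """Hashed bag-of-words embedding, a stand-in for a real dense bi-encoder."""
    vecs = torch.zeros(len(texts), dim)
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            vecs[i, hash(tok) % dim] += 1.0
    return F.normalize(vecs, dim=-1)

def retrieve_and_inject(query, knowledge, k=2):
    """Select the top-k knowledge entries by cosine similarity and prepend them
    to the input before it reaches the encoder/decoder."""
    q = toy_encode([query])                        # (1, dim)
    kb = toy_encode(knowledge)                     # (n, dim)
    scores = (q @ kb.T).squeeze(0)                 # cosine similarity (rows are unit-norm)
    top = scores.topk(min(k, len(knowledge))).indices.tolist()
    context = " ".join(knowledge[i] for i in top)
    return f"knowledge: {context} input: {query}"

print(retrieve_and_inject(
    "Who wrote the opera Turandot?",
    ["Turandot is an opera by Giacomo Puccini.",
     "BERT is a pre-trained language model.",
     "Puccini died before completing Turandot."]))

A production system would replace toy_encode with a trained bi-encoder and an approximate nearest-neighbour index, but the control flow stays the same.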