Knowledge Rumination for Pre-trained Language Models
- URL: http://arxiv.org/abs/2305.08732v3
- Date: Wed, 11 Oct 2023 10:51:12 GMT
- Title: Knowledge Rumination for Pre-trained Language Models
- Authors: Yunzhi Yao, Peng Wang, Shengyu Mao, Chuanqi Tan, Fei Huang, Huajun
Chen, Ningyu Zhang
- Abstract summary: We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus.
We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
- Score: 77.55888291165462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous studies have revealed that vanilla pre-trained language models
(PLMs) lack the capacity to handle knowledge-intensive NLP tasks alone; thus,
several works have attempted to integrate external knowledge into PLMs.
However, despite the promising outcomes, we empirically observe that PLMs may
have already encoded rich knowledge in their pre-trained parameters but fail to
fully utilize this knowledge when applied to knowledge-intensive tasks. In this
paper, we propose a new paradigm dubbed Knowledge Rumination to help the
pre-trained language model utilize related latent knowledge without retrieving
it from an external corpus. By simply adding a prompt such as "As far as I
know" to the PLM, we elicit related latent knowledge and inject it back into
the model for knowledge consolidation. We apply the proposed
knowledge rumination to various language models, including RoBERTa, DeBERTa,
and GPT-3. Experimental results on six commonsense reasoning tasks and GLUE
benchmarks demonstrate the effectiveness of our proposed approach, which proves
that the knowledge stored in PLMs can be better exploited to enhance
performance. Code is available at
https://github.com/zjunlp/knowledge-rumination.
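As a rough illustration of the two-stage idea above (first review latent knowledge with a rumination prompt, then consolidate it with the task input), here is a minimal sketch. It is not the authors' implementation: the GPT-2 model, the example question, and the plain-text concatenation in stage 2 are assumptions for readability, whereas the paper injects the reviewed knowledge back into the model itself and applies the idea to RoBERTa, DeBERTa, and GPT-3.

# Minimal sketch of Knowledge Rumination as two prompting passes.
# Assumptions: GPT-2 stands in for the PLM; text concatenation stands in for
# the paper's in-model knowledge consolidation.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

question = "Where would you expect to find a seashell?"  # hypothetical example

# Stage 1: rumination -- elicit the model's own latent knowledge with a prompt.
rumination = f"{question} As far as I know,"
ids = tokenizer(rumination, return_tensors="pt")
out = model.generate(**ids, max_new_tokens=30, do_sample=False,
                     pad_token_id=tokenizer.eos_token_id)
latent_knowledge = tokenizer.decode(out[0][ids["input_ids"].shape[1]:],
                                    skip_special_tokens=True).strip()

# Stage 2: consolidation -- feed the reviewed knowledge back with the task input.
task = f"Knowledge: {latent_knowledge}\nQuestion: {question}\nAnswer:"
ids = tokenizer(task, return_tensors="pt")
out = model.generate(**ids, max_new_tokens=10, do_sample=False,
                     pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True))

The "As far as I know" prompt comes from the abstract; everything else in the sketch (model choice, decoding settings, prompt layout) is illustrative only.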
Related papers
- IAO Prompting: Making Knowledge Flow Explicit in LLMs through Structured Reasoning Templates [7.839338724237275]
We introduce IAO (Input-Action-Output), a structured template-based method that explicitly models how Large Language Models access and apply their knowledge.
IAO decomposes problems into sequential steps, each clearly identifying the input knowledge being used, the action being performed, and the resulting output.
Our findings provide insights into both knowledge representation within LLMs and methods for more reliable knowledge application.
arXiv Detail & Related papers (2025-02-05T11:14:20Z)
- KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning [74.21524111840652]
This paper proposes KaLM, a Knowledge-aligned Language Modeling approach.
It fine-tunes autoregressive large language models to align with KG knowledge via the joint objective of explicit knowledge alignment and implicit knowledge alignment.
Notably, our method achieves a significant performance boost in evaluations of knowledge-driven tasks.
arXiv Detail & Related papers (2024-12-06T11:08:24Z)
- Self-Knowledge Guided Retrieval Augmentation for Large Language Models [59.771098292611846]
Large language models (LLMs) have shown superior performance without task-specific fine-tuning.
Retrieval-based methods can offer non-parametric world knowledge and improve the performance on tasks such as question answering.
Self-Knowledge guided Retrieval augmentation (SKR) is a simple yet effective method that lets LLMs refer to the questions they have previously encountered.
arXiv Detail & Related papers (2023-10-08T04:22:33Z)
- UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z)
- A Survey of Knowledge Enhanced Pre-trained Language Models [78.56931125512295]
We present a comprehensive review of Knowledge Enhanced Pre-trained Language Models (KE-PLMs).
For NLU, we divide the types of knowledge into four categories: linguistic knowledge, text knowledge, knowledge graph (KG) knowledge, and rule knowledge.
The KE-PLMs for NLG are categorized into KG-based and retrieval-based methods.
arXiv Detail & Related papers (2022-11-11T04:29:02Z)
- LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows decoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, significantly and robustly outperforms state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z)
- MLRIP: Pre-training a military language representation model with informative factual knowledge and professional knowledge base [11.016827497014821]
Current pre-training procedures usually inject external knowledge into models by using knowledge masking, knowledge fusion and knowledge replacement.
We propose MLRIP, which modifies the knowledge masking strategies proposed by ERNIE-Baidu and introduces a two-stage entity replacement strategy.
Extensive experiments with comprehensive analyses illustrate the superiority of MLRIP over BERT-based models in military knowledge-driven NLP tasks.
arXiv Detail & Related papers (2022-07-28T07:39:30Z)
- Knowledgeable Salient Span Mask for Enhancing Language Models as Knowledge Base [51.55027623439027]
We develop two solutions to help the model learn more knowledge from unstructured text in a fully self-supervised manner.
To the best of our knowledge, we are the first to explore fully self-supervised learning of knowledge in continual pre-training.
arXiv Detail & Related papers (2022-04-17T12:33:34Z)