Knowledge Infused Decoding
- URL: http://arxiv.org/abs/2204.03084v1
- Date: Wed, 6 Apr 2022 20:58:32 GMT
- Title: Knowledge Infused Decoding
- Authors: Ruibo Liu, Guoqing Zheng, Shashank Gupta, Radhika Gaonkar, Chongyang
Gao, Soroush Vosoughi, Milad Shokouhi, Ahmed Hassan Awadallah
- Abstract summary: Knowledge Infused Decoding (KID) is a novel decoding algorithm for generative language models (LMs).
KID dynamically infuses external knowledge into each step of the LM decoding.
Human evaluation confirms KID's ability to generate more relevant and factual language for the input context.
- Score: 46.09844215234235
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Pre-trained language models (LMs) have been shown to memorize a substantial
amount of knowledge from the pre-training corpora; however, they are still
limited in recalling factually correct knowledge given a certain context.
Hence, they tend to suffer from counterfactual or hallucinatory generation when
used in knowledge-intensive natural language generation (NLG) tasks. Recent
remedies to this problem focus on modifying either the pre-training or task
fine-tuning objectives to incorporate knowledge, which normally require
additional costly training or architecture modification of LMs for practical
applications. We present Knowledge Infused Decoding (KID) -- a novel decoding
algorithm for generative LMs, which dynamically infuses external knowledge into
each step of the LM decoding. Specifically, we maintain a local knowledge
memory based on the current context, interacting with a dynamically created
external knowledge trie, and continuously update the local memory as a
knowledge-aware constraint to guide decoding via reinforcement learning. On six
diverse knowledge-intensive NLG tasks, task-agnostic LMs (e.g., GPT-2 and BART)
armed with KID outperform many task-optimized state-of-the-art models, and show
particularly strong performance in few-shot scenarios over seven related
knowledge-infusion techniques. Human evaluation confirms KID's ability to
generate more relevant and factual language for the input context when compared
with multiple baselines. Finally, KID also alleviates exposure bias and
provides stable generation quality when generating longer sequences. Code for
KID is available at https://github.com/microsoft/KID.
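To make the decoding-time idea concrete, the sketch below shows a heavily simplified, illustrative version of knowledge-guided decoding: candidate tokens proposed by GPT-2 are re-ranked at each step according to whether they extend a path in a small token-level trie built from retrieved knowledge snippets. This is not the KID algorithm itself; per the abstract, KID maintains a local knowledge memory and guides decoding via reinforcement learning, whereas this toy replaces that with a fixed re-ranking bonus. The snippets, the top-k cutoff, the mixing weight alpha, and the matching horizon are all assumptions made for the example.

```python
# Illustrative sketch only, NOT the KID algorithm: re-rank top-k candidate
# tokens by whether they extend a path in a trie built from knowledge snippets.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def build_knowledge_trie(snippets):
    """Token-level trie over knowledge snippets (a stand-in for KID's
    dynamically created external knowledge trie)."""
    trie = {}
    for text in snippets:
        node = trie
        for tok in tokenizer.encode(" " + text):
            node = node.setdefault(tok, {})
    return trie

def knowledge_bonus(context_ids, candidate, trie, horizon=8):
    """Crude knowledge-aware score: 1.0 if the candidate token extends a trie
    path matching a recent suffix of the context, else 0.0."""
    for start in range(max(0, len(context_ids) - horizon), len(context_ids) + 1):
        node, ok = trie, True
        for tok in context_ids[start:]:
            if tok in node:
                node = node[tok]
            else:
                ok = False
                break
        if ok and candidate in node:
            return 1.0
    return 0.0

@torch.no_grad()
def guided_decode(prompt, trie, max_new_tokens=30, k=20, alpha=0.5):
    ids = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        logits = model(torch.tensor([ids])).logits[0, -1]
        top_p, top_i = torch.topk(torch.softmax(logits, dim=-1), k)
        # Mix the LM probability with the knowledge bonus and pick greedily.
        scores = [p.item() + alpha * knowledge_bonus(ids, i.item(), trie)
                  for p, i in zip(top_p, top_i)]
        ids.append(top_i[scores.index(max(scores))].item())
    return tokenizer.decode(ids)

trie = build_knowledge_trie(
    ["Marie Curie won the Nobel Prize in Physics in 1903."])
print(guided_decode("Marie Curie won", trie))
```

In the paper itself, the knowledge-aware constraint is not a fixed bonus: the local knowledge memory is updated continuously against the external knowledge trie, and reinforcement learning steers each decoding step, as described in the abstract above.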
Related papers
- Efficient Tuning of Large Language Models for Knowledge-Grounded Dialogue Generation [21.52726424882653]
We introduce KEDiT, an efficient method for fine-tuning large language models for knowledge-grounded dialogue generation.
KEDiT operates in two main phases: first, it employs an information bottleneck to compress retrieved knowledge into learnable parameters, retaining essential information while minimizing computational overhead.
Experimental results on the Wizard of Wikipedia and a newly constructed PubMed-Dialog dataset demonstrate that KEDiT excels in generating contextually relevant and informative responses.
arXiv Detail & Related papers (2025-04-10T13:54:36Z)
- Resolving Editing-Unlearning Conflicts: A Knowledge Codebook Framework for Large Language Model Updating [61.70705744491162]
Large Language Models (LLMs) excel in natural language processing by encoding extensive human knowledge.
Updating LLMs involves two key tasks simultaneously: unlearning to remove unwanted knowledge and editing to incorporate new information.
We propose LOKA, a conflict-free framework for LLM updating based on a knowledge codebook.
arXiv Detail & Related papers (2025-01-31T20:48:46Z)
- KaLM: Knowledge-aligned Autoregressive Language Modeling via Dual-view Knowledge Graph Contrastive Learning [74.21524111840652]
This paper proposes KaLM, a Knowledge-aligned Language Modeling approach.
It fine-tunes autoregressive large language models to align with KG knowledge via the joint objective of explicit knowledge alignment and implicit knowledge alignment.
Notably, our method achieves a significant performance boost in evaluations of knowledge-driven tasks.
arXiv Detail & Related papers (2024-12-06T11:08:24Z)
- KIF: Knowledge Identification and Fusion for Language Model Continual Learning [41.28933724210434]
We introduce a novel framework for language models, named Knowledge Identification and Fusion (KIF).
KIF segregates the model into 'skill units' based on parameter dependencies, allowing for more precise control.
It employs a novel group-wise knowledge identification technique to ascertain the importance distribution of skill units for a new task.
As a result, KIF achieves an optimal balance between retaining prior knowledge and excelling in new tasks.
arXiv Detail & Related papers (2024-08-09T17:44:45Z)
- KEHRL: Learning Knowledge-Enhanced Language Representations with Hierarchical Reinforcement Learning [32.086825891769585]
Knowledge-enhanced pre-trained language models (KEPLMs) leverage relation triples from knowledge graphs (KGs).
Previous works treat knowledge enhancement as two independent operations, i.e., knowledge injection and knowledge integration.
This paper jointly addresses the problems of detecting positions for knowledge injection and integrating external knowledge into the model in order to avoid injecting inaccurate or irrelevant knowledge.
arXiv Detail & Related papers (2024-06-24T07:32:35Z)
- TRELM: Towards Robust and Efficient Pre-training for Knowledge-Enhanced Language Models [31.209774088374374]
This paper introduces TRELM, a Robust and Efficient Pre-training framework for Knowledge-Enhanced Language Models.
We employ a robust approach to inject knowledge triples and use a knowledge-augmented memory bank to capture valuable information.
We show that TRELM reduces pre-training time by at least 50% and outperforms other KEPLMs in knowledge probing tasks and multiple knowledge-aware language understanding tasks.
arXiv Detail & Related papers (2024-03-17T13:04:35Z)
- InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration [61.554209059971576]
Large Language Models (LLMs) have shown remarkable open-generation capabilities across diverse domains.
Injecting new knowledge poses the risk of forgetting previously acquired knowledge.
We propose a novel Infuser-Guided Knowledge Integration framework.
arXiv Detail & Related papers (2024-02-18T03:36:26Z)
- A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication.
This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches.
We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
- KnowledGPT: Enhancing Large Language Models with Retrieval and Storage Access on Knowledge Bases [55.942342665806656]
KnowledGPT is a comprehensive framework to bridge large language models with various knowledge bases.
The retrieval process employs program-of-thought prompting, which generates search language for KBs in code format.
KnowledGPT offers the capability to store knowledge in a personalized KB, catering to individual user demands.
arXiv Detail & Related papers (2023-08-17T13:07:00Z)
- UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge.
With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z)
- KILM: Knowledge Injection into Encoder-Decoder Language Models [26.44077668498835]
Large pre-trained language models (PLMs) have been shown to retain implicit knowledge within their parameters.
We propose Knowledge Injection into Language Models (KILM), a novel approach that injects entity-related knowledge into encoder-decoder PLMs.
KILM enables models to retain more knowledge and hallucinate less, while preserving their original performance on general NLU and NLG tasks.
arXiv Detail & Related papers (2023-02-17T22:48:07Z)
- LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements.
We present LM-CORE -- a general framework to achieve this -- that allows decoupling of the language model training from the external knowledge source.
Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z)
- TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge [83.55215993730326]
We propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages, respectively; a generic sketch of this retrieve-then-inject pattern appears at the end of this list.
arXiv Detail & Related papers (2022-03-16T10:37:59Z)
- DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding [19.478288026844893]
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models with relation triples injected from knowledge graphs to improve language understanding abilities.
Previous studies integrate models with knowledge encoders for representing knowledge retrieved from knowledge graphs.
We propose a novel KEPLM named DKPLM that Decomposes the Knowledge injection process of Pre-trained Language Models in the pre-training, fine-tuning, and inference stages.
arXiv Detail & Related papers (2021-12-02T08:19:42Z)
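Several of the related approaches above (e.g., KnowledGPT, LM-CORE, TegTok) build on a retrieve-then-inject pattern: select knowledge entries relevant to the current input and feed them to the generator alongside the prompt. The snippet below is a minimal, generic sketch of that pattern only; TF-IDF similarity stands in for a learned dense retriever, and the knowledge snippets, model choice, and prompt format are assumptions made for illustration rather than a reproduction of any specific paper's method.

```python
# Illustrative sketch only: generic retrieve-then-inject generation.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Toy knowledge base; real systems retrieve from large external sources.
knowledge_base = [
    "Marie Curie was the first person to win Nobel Prizes in two sciences.",
    "The Eiffel Tower was completed in 1889 for the World's Fair in Paris.",
    "GPT-2 is a decoder-only transformer language model released in 2019.",
]

def retrieve(query, snippets, top_k=2):
    """Rank knowledge snippets by similarity to the query and keep the top-k
    (TF-IDF here is a stand-in for a dense retriever)."""
    vectorizer = TfidfVectorizer().fit(snippets + [query])
    scores = cosine_similarity(
        vectorizer.transform([query]), vectorizer.transform(snippets))[0]
    ranked = sorted(zip(scores, snippets), reverse=True)
    return [s for _, s in ranked[:top_k]]

def generate_with_knowledge(query, max_new_tokens=40):
    """Inject the retrieved snippets on the input side by prepending them."""
    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    context = "\n".join(retrieve(query, knowledge_base))
    prompt = f"Knowledge:\n{context}\n\nQuestion: {query}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens,
                            do_sample=False)
    return tokenizer.decode(output[0], skip_special_tokens=True)

print(generate_with_knowledge(
    "Who was the first person to win Nobel Prizes in two sciences?"))
```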