Related papers: Synthetic Knowledge Ingestion: Towards Knowledge Refinement and Injection for Enhancing Large Language Models

Synthetic Knowledge Ingestion: Towards Knowledge Refinement and Injection for Enhancing Large Language Models

URL: http://arxiv.org/abs/2410.09629v1
Date: Sat, 12 Oct 2024 19:38:09 GMT
Title: Synthetic Knowledge Ingestion: Towards Knowledge Refinement and Injection for Enhancing Large Language Models
Authors: Jiaxin Zhang, Wendi Cui, Yiran Huang, Kamalika Das, Sricharan Kumar,
Abstract summary: Large language models (LLMs) are proficient in capturing factual knowledge across various domains. In this work, we propose a novel synthetic knowledge ingestion method called Ski. We then integrate Ski and its variations with three knowledge injection techniques to inject and refine knowledge in language models.
Score: 1.753683416932648
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large language models (LLMs) are proficient in capturing factual knowledge across various domains. However, refining their capabilities on previously seen knowledge or integrating new knowledge from external sources remains a significant challenge. In this work, we propose a novel synthetic knowledge ingestion method called Ski, which leverages fine-grained synthesis, interleaved generation, and assemble augmentation strategies to construct high-quality data representations from raw knowledge sources. We then integrate Ski and its variations with three knowledge injection techniques: Retrieval Augmented Generation (RAG), Supervised Fine-tuning (SFT), and Continual Pre-training (CPT) to inject and refine knowledge in language models. Extensive empirical experiments are conducted on various question-answering tasks spanning finance, biomedicine, and open-generation domains to demonstrate that Ski significantly outperforms baseline methods by facilitating effective knowledge injection. We believe that our work is an important step towards enhancing the factual accuracy of LLM outputs by refining knowledge representation and injection capabilities.

Related papers

Probing the Knowledge Boundary: An Interactive Agentic Framework for Deep Knowledge Extraction [29.717986496967978]
We propose an interactive agentic framework to systematically extract and quantify the knowledge of Large Language Models.<n>Our method includes four adaptive exploration policies to probe knowledge at different granularities.<n>We observe a clear knowledge scaling law, where larger models consistently extract more knowledge.
arXiv Detail & Related papers (2026-02-01T01:43:44Z)
Unveiling Knowledge Utilization Mechanisms in LLM-based Retrieval-Augmented Generation [77.10390725623125]
retrieval-augmented generation (RAG) is widely employed to expand their knowledge scope.<n>Since RAG has shown promise in knowledge-intensive tasks like open-domain question answering, its broader application to complex tasks and intelligent assistants has further advanced its utility.<n>We present a systematic investigation of the intrinsic mechanisms by which RAGs integrate internal (parametric) and external (retrieved) knowledge.
arXiv Detail & Related papers (2025-05-17T13:13:13Z)
WisdomBot: Tuning Large Language Models with Artificial Intelligence Knowledge [17.74988145184004]
Large language models (LLMs) have emerged as powerful tools in natural language processing (NLP) This paper presents a novel LLM for education named WisdomBot, which combines the power of LLMs with educational theories. We introduce two key enhancements during inference, i.e., local knowledge base retrieval augmentation and search engine retrieval augmentation during inference.
arXiv Detail & Related papers (2025-01-22T13:36:46Z)
KEHRL: Learning Knowledge-Enhanced Language Representations with Hierarchical Reinforcement Learning [32.086825891769585]
Knowledge-enhanced pre-trained language models (KEPLMs) leverage relation triples from knowledge graphs (KGs) Previous works treat knowledge enhancement as two independent operations, i.e., knowledge injection and knowledge integration. This paper jointly addresses the problems of detecting positions for knowledge injection and integrating external knowledge into the model in order to avoid injecting inaccurate or irrelevant knowledge.
arXiv Detail & Related papers (2024-06-24T07:32:35Z)
Injecting New Knowledge into Large Language Models via Supervised Fine-Tuning [13.371405067535814]
This paper investigates the effectiveness ofSupervised Fine-Tuning (SFT) as a method for knowledge injection in Large Language Models (LLMs) We compare different dataset generation strategies -- token-based and fact-based scaling -- to create training data that helps the model learn new information. Our results show considerable performance improvements in Q&A tasks related to out-of-domain knowledge.
arXiv Detail & Related papers (2024-03-30T01:56:07Z)
InfuserKI: Enhancing Large Language Models with Knowledge Graphs via Infuser-Guided Knowledge Integration [61.554209059971576]
Large Language Models (LLMs) have shown remarkable open-generation capabilities across diverse domains. Injecting new knowledge poses the risk of forgetting previously acquired knowledge. We propose a novel Infuser-Guided Knowledge Integration framework.
arXiv Detail & Related papers (2024-02-18T03:36:26Z)
A Comprehensive Study of Knowledge Editing for Large Language Models [82.65729336401027]
Large Language Models (LLMs) have shown extraordinary capabilities in understanding and generating text that closely mirrors human communication. This paper defines the knowledge editing problem and provides a comprehensive review of cutting-edge approaches. We introduce a new benchmark, KnowEdit, for a comprehensive empirical evaluation of representative knowledge editing approaches.
arXiv Detail & Related papers (2024-01-02T16:54:58Z)
Beyond Factuality: A Comprehensive Evaluation of Large Language Models as Knowledge Generators [78.63553017938911]
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks. However, community concerns abound regarding the factuality and potential implications of using this uncensored knowledge. We introduce CONNER, designed to evaluate generated knowledge from six important perspectives.
arXiv Detail & Related papers (2023-10-11T08:22:37Z)
Knowledge Rumination for Pre-trained Language Models [77.55888291165462]
We propose a new paradigm dubbed Knowledge Rumination to help the pre-trained language model utilize related latent knowledge without retrieving it from the external corpus. We apply the proposed knowledge rumination to various language models, including RoBERTa, DeBERTa, and GPT-3.
arXiv Detail & Related papers (2023-05-15T15:47:09Z)
UNTER: A Unified Knowledge Interface for Enhancing Pre-trained Language Models [100.4659557650775]
We propose a UNified knowledge inTERface, UNTER, to provide a unified perspective to exploit both structured knowledge and unstructured knowledge. With both forms of knowledge injected, UNTER gains continuous improvements on a series of knowledge-driven NLP tasks.
arXiv Detail & Related papers (2023-05-02T17:33:28Z)
LM-CORE: Language Models with Contextually Relevant External Knowledge [13.451001884972033]
We argue that storing large amounts of knowledge in the model parameters is sub-optimal given the ever-growing amounts of knowledge and resource requirements. We present LM-CORE -- a general framework to achieve this -- that allows textitdecoupling of the language model training from the external knowledge source. Experimental results show that LM-CORE, having access to external knowledge, achieves significant and robust outperformance over state-of-the-art knowledge-enhanced language models on knowledge probing tasks.
arXiv Detail & Related papers (2022-08-12T18:59:37Z)
Ontology-enhanced Prompt-tuning for Few-shot Learning [41.51144427728086]
Few-shot Learning is aimed to make predictions based on a limited number of samples. Structured data such as knowledge graphs and ontology libraries has been leveraged to benefit the few-shot setting in various tasks.
arXiv Detail & Related papers (2022-01-27T05:41:36Z)
Kformer: Knowledge Injection in Transformer Feed-Forward Layers [107.71576133833148]
We propose a novel knowledge fusion model, namely Kformer, which incorporates external knowledge through the feed-forward layer in Transformer. We empirically find that simply injecting knowledge into FFN can facilitate the pre-trained language model's ability and facilitate current knowledge fusion methods.
arXiv Detail & Related papers (2022-01-15T03:00:27Z)
DKPLM: Decomposable Knowledge-enhanced Pre-trained Language Model for Natural Language Understanding [19.478288026844893]
Knowledge-Enhanced Pre-trained Language Models (KEPLMs) are pre-trained models with relation triples injecting from knowledge graphs to improve language understanding abilities. Previous studies integrate models with knowledge encoders for representing knowledge retrieved from knowledge graphs. We propose a novel KEPLM named DKPLM that Decomposes Knowledge injection process of the Pre-trained Language Models in pre-training, fine-tuning and inference stages.
arXiv Detail & Related papers (2021-12-02T08:19:42Z)

This list is automatically generated from the titles and abstracts of the papers in this site.