Chain-of-Knowledge: Grounding Large Language Models via Dynamic
Knowledge Adapting over Heterogeneous Sources
- URL: http://arxiv.org/abs/2305.13269v4
- Date: Wed, 21 Feb 2024 07:44:48 GMT
- Title: Chain-of-Knowledge: Grounding Large Language Models via Dynamic
Knowledge Adapting over Heterogeneous Sources
- Authors: Xingxuan Li, Ruochen Zhao, Yew Ken Chia, Bosheng Ding, Shafiq Joty,
Soujanya Poria, Lidong Bing
- Abstract summary: Chain-of-knowledge (CoK) is a framework that augments large language models.
CoK consists of three stages: reasoning preparation, dynamic knowledge adapting, and answer consolidation.
- Score: 87.26486246513063
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present chain-of-knowledge (CoK), a novel framework that augments large
language models (LLMs) by dynamically incorporating grounding information from
heterogeneous sources. It results in more factual rationales and reduced
hallucination in generation. Specifically, CoK consists of three stages:
reasoning preparation, dynamic knowledge adapting, and answer consolidation.
Given a knowledge-intensive question, CoK first prepares several preliminary
rationales and answers while identifying the relevant knowledge domains. If
there is no majority consensus among the sampled answers, CoK corrects the
rationales step by step by adapting knowledge from the identified domains.
These corrected rationales can plausibly serve as a better foundation for the
final answer consolidation. Unlike prior studies that primarily use
unstructured data, CoK also leverages structured knowledge sources such as
Wikidata and tables that provide more reliable factual information. To access
both unstructured and structured knowledge sources in the dynamic knowledge
adapting stage, we propose an adaptive query generator that allows the
generation of queries for various types of query languages, including SPARQL,
SQL, and natural sentences. Moreover, to minimize error propagation between
rationales, CoK corrects the rationales progressively using preceding corrected
rationales to generate and correct subsequent rationales. Extensive experiments
show that CoK consistently improves the performance of LLMs on
knowledge-intensive tasks across different domains.
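The three-stage pipeline described in the abstract can be sketched in code. This is a minimal illustration only: the function names, the early exit on majority consensus, the toy query templates, and the mock LLM calls are all assumptions for demonstration, not the authors' implementation.

```python
from collections import Counter

def reasoning_preparation(question, sample_fn, n_samples=5):
    """Stage 1: sample preliminary (rationale, answer) pairs and
    check whether a majority of the sampled answers agree."""
    samples = [sample_fn(question) for _ in range(n_samples)]
    counts = Counter(answer for _, answer in samples)
    top_answer, votes = counts.most_common(1)[0]
    return samples, top_answer, votes > n_samples // 2

def generate_query(step, domain):
    """Adaptive query generator: pick the query language that matches
    the knowledge source (SPARQL for Wikidata, SQL for tables,
    a natural-language query otherwise). Templates are placeholders."""
    if domain == "wikidata":
        return "sparql", "SELECT ?o WHERE { ?s ?p ?o }"
    if domain == "table":
        return "sql", "SELECT fact FROM facts WHERE topic = ?"
    return "natural", step

def dynamic_knowledge_adapting(rationale_steps, domain, correct_fn):
    """Stage 2: correct rationales progressively; each step's correction
    conditions on the already-corrected steps, which is how CoK limits
    error propagation between rationales."""
    corrected = []
    for step in rationale_steps:
        lang, query = generate_query(step, domain)
        corrected.append(correct_fn(step, list(corrected), lang, query))
    return corrected

def chain_of_knowledge(question, sample_fn, correct_fn, consolidate_fn,
                       domain="wikidata"):
    samples, top_answer, consensus = reasoning_preparation(question, sample_fn)
    if consensus:                      # answers already agree: no correction
        return top_answer
    steps, _ = samples[0]              # take one rationale chain to repair
    corrected = dynamic_knowledge_adapting(steps, domain, correct_fn)
    return consolidate_fn(question, corrected)   # Stage 3: consolidation

# Toy stand-ins for the LLM sampling, correction, and retrieval calls.
answers = iter(["A", "B", "A", "C", "B"])        # no majority answer
mock_sample = lambda q: (["step 1", "step 2"], next(answers))
mock_correct = lambda step, prev, lang, query: f"{step} [checked via {lang}]"
mock_consolidate = lambda q, steps: steps[-1]

print(chain_of_knowledge("Who discovered penicillin?",
                         mock_sample, mock_correct, mock_consolidate))
# -> step 2 [checked via sparql]
```

With no majority among the sampled answers, the sketch falls through to progressive correction; when the samples agree, it returns the consensus answer directly, mirroring the abstract's description.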
Related papers
- SPARKLE: Enhancing SPARQL Generation with Direct KG Integration in Decoding [0.46040036610482665]
We present a novel end-to-end natural language to SPARQL framework, SPARKLE.
SPARKLE leverages the structure of the knowledge base directly during decoding, effectively integrating knowledge into the query generation.
We show that SPARKLE achieves new state-of-the-art results on SimpleQuestions-Wiki and the highest F1 score on LCQuAD 1.0.
arXiv Detail & Related papers (2024-06-29T06:43:11Z)
- Leveraging Logical Rules in Knowledge Editing: A Cherry on the Top [12.982138813457812]
Multi-hop Question Answering (MQA) under knowledge editing (KE) is a key challenge for Large Language Models (LLMs).
We propose a novel framework named RULE-KE, i.e., RULE-based Knowledge Editing, which serves as a cherry on top, augmenting the performance of all existing MQA methods under KE.
Experimental evaluation using existing and newly curated datasets shows that RULE-KE improves the performance of parameter-based and memory-based solutions by up to 92% and 112.9%, respectively.
arXiv Detail & Related papers (2024-05-24T11:30:00Z)
- Enhancing Formal Theorem Proving: A Comprehensive Dataset for Training AI Models on Coq Code [0.0]
The Coq proof assistant stands out for its rigorous approach to verifying mathematical assertions and software correctness.
Despite advances in artificial intelligence and machine learning, the specialized nature of Coq syntax and semantics poses unique challenges for Large Language Models (LLMs).
This dataset, derived from a collection of over 10,000 Coq source files, encompasses a wide array of propositions, proofs, and definitions.
arXiv Detail & Related papers (2024-03-19T10:53:40Z)
- A Knowledge-Injected Curriculum Pretraining Framework for Question Answering [70.13026036388794]
We propose a general Knowledge-Injected Curriculum Pretraining framework (KICP) to achieve comprehensive KG learning and exploitation for Knowledge-based question answering tasks.
The KI module first injects knowledge into the LM by generating a KG-centered pretraining corpus, and generalizes the process into three key steps.
The KA module learns knowledge from the generated corpus with LM equipped with an adapter as well as keeps its original natural language understanding ability.
The CR module follows human reasoning patterns to construct three corpora with increasing difficulties of reasoning, and further trains the LM from easy to hard in a curriculum manner.
arXiv Detail & Related papers (2024-03-11T03:42:03Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Merging Generated and Retrieved Knowledge for Open-Domain QA [72.42262579925911]
COMBO is a Compatibility-Oriented knowledge Merging framework for Better Open-domain QA.
We show that COMBO outperforms competitive baselines on three out of four tested open-domain QA benchmarks.
arXiv Detail & Related papers (2023-10-22T19:37:06Z)
- Reasoning over Hierarchical Question Decomposition Tree for Explainable Question Answering [83.74210749046551]
We propose to leverage question decomposition for heterogeneous knowledge integration.
We propose a novel two-stage XQA framework, Reasoning over Hierarchical Question Decomposition Tree (RoHT).
Experiments on complex QA datasets KQA Pro and Musique show that our framework outperforms SOTA methods significantly.
arXiv Detail & Related papers (2023-05-24T11:45:59Z)
- DecAF: Joint Decoding of Answers and Logical Forms for Question Answering over Knowledge Bases [81.19499764899359]
We propose a novel framework DecAF that jointly generates both logical forms and direct answers.
DecAF achieves new state-of-the-art accuracy on WebQSP, FreebaseQA, and GrailQA benchmarks.
arXiv Detail & Related papers (2022-09-30T19:51:52Z)
- Unified Open-Domain Question Answering with Structured and Unstructured Knowledge [7.7429684536437104]
We study open-domain question answering (ODQA) with structured, unstructured and semi-structured knowledge sources.
Our approach homogenizes all sources by reducing them to text, and applies recent, powerful retriever-reader models.
As a result, our unified model produces state-of-the-art results on 3 popular ODQA benchmarks.
arXiv Detail & Related papers (2020-12-29T05:14:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.