Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for
Knowledge-intensive Question Answering
- URL: http://arxiv.org/abs/2308.13259v2
- Date: Sat, 28 Oct 2023 12:19:29 GMT
- Title: Knowledge-Driven CoT: Exploring Faithful Reasoning in LLMs for
Knowledge-intensive Question Answering
- Authors: Keheng Wang, Feiyu Duan, Sirui Wang, Peiguang Li, Yunsen Xian,
Chuantao Yin, Wenge Rong, Zhang Xiong
- Abstract summary: Large language models (LLMs) equipped with Chain-of-Thought (CoT) have shown impressive reasoning ability in various downstream tasks.
We propose a framework called Knowledge-Driven Chain-of-Thought (KD-CoT) to verify and modify reasoning traces in CoT via interaction with external knowledge.
- Score: 17.672572064705445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Equipped with Chain-of-Thought (CoT), large language models (LLMs) have shown
impressive reasoning ability in various downstream tasks. Even so, because they
suffer from hallucinations and cannot access external knowledge, LLMs often
produce incorrect or unfaithful intermediate reasoning steps, especially when
answering knowledge-intensive tasks such as KBQA. To alleviate
this issue, we propose a framework called Knowledge-Driven Chain-of-Thought
(KD-CoT) to verify and modify reasoning traces in CoT via interaction with
external knowledge, and thus overcome hallucinations and error propagation.
Concretely, we formulate the CoT rationale process of LLMs into a structured
multi-round QA format. In each round, the LLM interacts with a QA system that
retrieves external knowledge and produces faithful reasoning traces grounded in
the precise answers it retrieves. The structured CoT reasoning of LLMs is facilitated
by our developed KBQA CoT collection, which serves as in-context learning
demonstrations and can also be utilized as feedback augmentation to train a
robust retriever. Extensive experiments on the WebQSP and ComplexWebQuestions
datasets demonstrate the effectiveness of the proposed KD-CoT in task-solving
reasoning generation, outperforming vanilla CoT ICL by absolute success-rate
gains of 8.0% and 5.1%. Furthermore, our proposed feedback-augmented
retriever outperforms the state-of-the-art baselines for retrieving knowledge,
achieving significant improvements in Hit and Recall performance. Our code and
data are released at https://github.com/AdelWang/KD-CoT/tree/main.
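A minimal, illustrative sketch of the multi-round verify-and-answer loop described above follows. The helpers llm_generate, retrieve_passages, and answer_from_passages are hypothetical placeholders for an LLM client, a knowledge retriever, and a reader QA model; this is a sketch under those assumptions, not the authors' released implementation (see the repository linked above).
```python
# Illustrative sketch only: llm_generate, retrieve_passages, and
# answer_from_passages are hypothetical placeholders, not part of the
# authors' released KD-CoT code.
from typing import List


def llm_generate(prompt: str) -> str:
    """Placeholder for an LLM call that returns the next CoT step or final answer."""
    raise NotImplementedError("plug in an LLM client here")


def retrieve_passages(query: str, k: int = 5) -> List[str]:
    """Placeholder for a retriever that returns top-k passages from a knowledge base."""
    raise NotImplementedError("plug in a retriever here")


def answer_from_passages(question: str, passages: List[str]) -> str:
    """Placeholder for a reader that extracts a precise answer from the passages."""
    raise NotImplementedError("plug in a reader QA model here")


def kd_cot_style_loop(question: str, max_rounds: int = 5) -> str:
    """Structured multi-round QA: each reasoning step is phrased as a
    sub-question and verified against retrieved knowledge before it is
    appended to the reasoning trace."""
    trace = f"Question: {question}\n"
    for _ in range(max_rounds):
        step = llm_generate(trace + "Next sub-question, or 'Final answer: ...':")
        if step.startswith("Final answer:"):
            return step[len("Final answer:"):].strip()
        # Ground the sub-question in external knowledge instead of relying
        # on the model's parametric memory alone.
        passages = retrieve_passages(step)
        verified = answer_from_passages(step, passages)
        trace += f"Sub-question: {step}\nVerified answer: {verified}\n"
    # If no final answer emerged, ask the model to conclude from the trace.
    return llm_generate(trace + "Final answer:")
```
Each round turns one CoT step into an explicit sub-question whose answer is grounded in retrieved evidence before being appended to the trace, which limits the propagation of hallucinated intermediate steps.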
Related papers
- GIVE: Structured Reasoning with Knowledge Graph Inspired Veracity Extrapolation [108.2008975785364]
Graph Inspired Veracity Extrapolation (GIVE) is a novel reasoning framework that integrates parametric and non-parametric memories.
Our method facilitates a more logical and step-wise reasoning approach akin to experts' problem-solving, rather than gold answer retrieval.
arXiv Detail & Related papers (2024-10-11T03:05:06Z)
- Seek and Solve Reasoning for Table Question Answering [49.006950918895306]
This paper improves Table-based Question Answering (TQA) performance by leveraging Large Language Models' reasoning capabilities.
Inspired by how humans solve TQA tasks, we propose a Seek-and-Solve pipeline that instructs the LLM to first seek relevant information and then answer questions.
We present a compact single-stage TQA-solving prompt distilled from the pipeline.
arXiv Detail & Related papers (2024-09-09T02:41:00Z)
- CausalBench: A Comprehensive Benchmark for Causal Learning Capability of LLMs [27.362012903540492]
The ability to understand causality significantly impacts the competence of large language models (LLMs) in output explanation and counterfactual reasoning.
arXiv Detail & Related papers (2024-04-09T14:40:08Z)
- Evidence-Focused Fact Summarization for Knowledge-Augmented Zero-Shot Question Answering [14.389264346634507]
We propose EFSum, an Evidence-focused Fact Summarization framework for enhanced Question Answering (QA) performance.
Our experiments show that EFSum improves LLM's zero-shot QA performance.
arXiv Detail & Related papers (2024-03-05T13:43:58Z)
- Direct Evaluation of Chain-of-Thought in Multi-hop Reasoning with Knowledge Graphs [52.42505579545893]
Large language models (LLMs) demonstrate strong reasoning abilities when prompted to generate chain-of-thought explanations alongside answers.
We propose a novel discriminative and generative CoT evaluation paradigm to assess LLMs' knowledge of reasoning and the accuracy of the generated CoT.
arXiv Detail & Related papers (2024-02-17T05:22:56Z)
- Knowledge Verification to Nip Hallucination in the Bud [69.79051730580014]
We demonstrate the feasibility of mitigating hallucinations by verifying and minimizing the inconsistency between external knowledge present in the alignment data and the intrinsic knowledge embedded within foundation LLMs.
We propose a novel approach called Knowledge Consistent Alignment (KCA), which employs a well-aligned LLM to automatically formulate assessments based on external knowledge.
We demonstrate the superior efficacy of KCA in reducing hallucinations across six benchmarks, utilizing foundation LLMs of varying backbones and scales.
arXiv Detail & Related papers (2024-01-19T15:39:49Z)
- keqing: knowledge-based question answering is a nature chain-of-thought mentor of LLM [27.76205400533089]
Large language models (LLMs) have exhibited remarkable performance on various natural language processing (NLP) tasks, especially for question answering.
We present a novel framework to assist LLMs, such as ChatGPT, to retrieve question-related structured information on the knowledge graph.
The experimental results on KBQA datasets show that Keqing can achieve competitive performance and illustrate the logic of answering each question.
arXiv Detail & Related papers (2023-12-31T08:39:04Z)
- Merging Generated and Retrieved Knowledge for Open-Domain QA [72.42262579925911]
COMBO is a Compatibility-Oriented knowledge Merging for Better Open-domain QA framework.
We show that COMBO outperforms competitive baselines on three out of four tested open-domain QA benchmarks.
arXiv Detail & Related papers (2023-10-22T19:37:06Z)
- Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks [121.74957524305283]
This paper proposes a novel framework named Search-in-the-Chain (SearChain) for the interaction between Information Retrieval (IR) and Large Language Models (LLMs).
Experiments show that SearChain outperforms state-of-the-art baselines on complex knowledge-intensive tasks.
arXiv Detail & Related papers (2023-04-28T10:15:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.