Interleaving Retrieval with Chain-of-Thought Reasoning for
Knowledge-Intensive Multi-Step Questions
- URL: http://arxiv.org/abs/2212.10509v2
- Date: Fri, 23 Jun 2023 00:59:13 GMT
- Authors: Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal
- Abstract summary: We propose IRCoT, a new approach for multi-step question answering.
It interleaves retrieval with steps in a CoT, guiding the retrieval with CoT and in turn using retrieved results to improve CoT.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompting-based large language models (LLMs) are surprisingly powerful at
generating natural language reasoning steps or Chains-of-Thoughts (CoT) for
multi-step question answering (QA). They struggle, however, when the necessary
knowledge is either unavailable to the LLM or not up-to-date within its
parameters. While using the question to retrieve relevant text from an external
knowledge source helps LLMs, we observe that this one-step retrieve-and-read
approach is insufficient for multi-step QA. Here, what to retrieve depends on
what has already been derived, which in turn may depend on what was previously
retrieved. To address this, we propose IRCoT, a
new approach for multi-step QA that interleaves retrieval with steps
(sentences) in a CoT, guiding the retrieval with CoT and in turn using
retrieved results to improve CoT. Using IRCoT with GPT3 substantially improves
retrieval (up to 21 points) as well as downstream QA (up to 15 points) on four
datasets: HotpotQA, 2WikiMultihopQA, MuSiQue, and IIRC. We observe similar
substantial gains in out-of-distribution (OOD) settings as well as with much
smaller models such as Flan-T5-large without additional training. IRCoT reduces
model hallucination, resulting in factually more accurate CoT reasoning. Code,
data, and prompts are available at https://github.com/stonybrooknlp/ircot.
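
The loop the abstract describes alternates one CoT step with one retrieval step. Below is a minimal sketch of that interleaving, with hypothetical retriever and LLM stubs rather than the repository's actual API; only the "answer is" termination check follows the paper's description.

```python
# Minimal sketch of the IRCoT-style interleaved retrieve-and-reason loop.
# `retrieve` and `generate_next_cot_sentence` are hypothetical stand-ins
# for a retriever (e.g., BM25) and a prompted LLM; see the official repo
# for the actual implementation.

def retrieve(query: str, k: int = 4) -> list[str]:
    """Return the top-k paragraphs for `query` (stub)."""
    raise NotImplementedError

def generate_next_cot_sentence(question: str, paragraphs: list[str],
                               cot_so_far: list[str]) -> str:
    """Prompt an LLM with the question, retrieved paragraphs, and the
    CoT generated so far; return the next CoT sentence (stub)."""
    raise NotImplementedError

def ircot(question: str, max_steps: int = 8) -> tuple[list[str], list[str]]:
    paragraphs = retrieve(question)          # start with question-based retrieval
    cot: list[str] = []
    for _ in range(max_steps):
        sentence = generate_next_cot_sentence(question, paragraphs, cot)
        cot.append(sentence)
        if "answer is" in sentence.lower():  # termination condition from the paper
            break
        # Use the newest reasoning step as the next retrieval query, so that
        # what is retrieved depends on what has already been derived.
        for p in retrieve(sentence):
            if p not in paragraphs:
                paragraphs.append(p)
    return cot, paragraphs
```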
Related papers
- HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs (arXiv, 2024-06-10)
Large Language Models (LLMs) are adept at answering simple (single-hop) questions.
As the complexity of the questions increases, the performance of LLMs degrades.
Recent methods try to reduce this burden by integrating structured knowledge triples into the raw text.
We propose to use a knowledge graph (KG) that is context-aware and distilled to contain query-relevant information.
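
As a rough illustration of that distillation step only: keep the triples relevant to the query and linearize the survivors into the prompt. The overlap-based scorer and prompt format below are invented placeholders, not HOLMES itself, which builds hyper-relational, context-aware KGs.

```python
# Illustrative sketch: filter KG triples by relevance to the question and
# linearize the survivors into an LLM prompt. All details are placeholders.

def relevance(triple: tuple[str, str, str], question: str) -> float:
    """Toy relevance: fraction of the triple's tokens found in the question."""
    tokens = " ".join(triple).lower().split()
    q_tokens = set(question.lower().split())
    return sum(t in q_tokens for t in tokens) / max(len(tokens), 1)

def distill_kg(triples, question, threshold=0.3):
    return [t for t in triples if relevance(t, question) >= threshold]

def kg_prompt(triples, question):
    facts = "\n".join(f"({s}; {r}; {o})" for s, r, o in triples)
    return f"Facts:\n{facts}\n\nQuestion: {question}\nAnswer:"
```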
- SuRe: Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs (arXiv, 2024-04-17)
We propose a simple yet effective framework to enhance open-domain question answering (ODQA) with large language models (LLMs).
SuRe helps LLMs predict more accurate answers for a given question, which are well supported by the summarized retrievals (SuRe).
Experimental results on diverse ODQA benchmarks demonstrate the superiority of SuRe, with improvements of up to 4.6% in exact match (EM) and 4.0% in F1 score over standard prompting approaches.
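
A schematic reading of the candidate-summarize-select flow, with a hypothetical llm() stub; the actual prompts and selection criterion are defined in the paper.

```python
# Approximate sketch of a SuRe-style pipeline: generate answer candidates
# from the retrieved passages, summarize the evidence conditioned on each
# candidate, and keep the candidate whose summary is judged best-supported.
# `llm` is a hypothetical text-completion function, not the paper's API.

def llm(prompt: str) -> str:
    raise NotImplementedError

def sure_answer(question: str, passages: list[str], n_candidates: int = 2) -> str:
    context = "\n".join(passages)
    candidates = [
        llm(f"{context}\n\nQuestion: {question}\nGive one plausible answer:")
        for _ in range(n_candidates)
    ]
    summaries = {
        c: llm(f"{context}\n\nSummarize the evidence that the answer to "
               f"'{question}' is '{c}':")
        for c in candidates
    }
    scores = {
        c: float(llm(f"Rate 0-10 how well this summary supports the answer "
                     f"'{c}':\n{s}\nScore:"))
        for c, s in summaries.items()
    }
    return max(scores, key=scores.get)
```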
- MFORT-QA: Multi-hop Few-shot Open Rich Table Question Answering (arXiv, 2024-03-28)
In today's fast-paced industry, professionals face the daily challenge of summarizing large numbers of documents and extracting vital information from them.
Table Question Answering (QA) addresses this challenge by extracting the relevant information.
Recent advancements in Large Language Models (LLMs) have opened up new possibilities for extracting information from tabular data using prompts.
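
The summary only hints at the mechanism, but the generic pattern such prompt-based table QA builds on, linearizing a table into the LLM prompt, looks roughly like this; the markdown layout and prompt wording are illustrative, not MFORT-QA's.

```python
# Generic table-to-prompt linearization of the kind prompt-based table QA
# methods rely on; all formatting choices here are illustrative.

def table_to_markdown(header: list[str], rows: list[list[str]]) -> str:
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(map(str, row)) + " |" for row in rows]
    return "\n".join(lines)

def table_qa_prompt(header, rows, question):
    return (f"Answer using only the table below.\n\n"
            f"{table_to_markdown(header, rows)}\n\n"
            f"Question: {question}\nAnswer:")
```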
- Probabilistic Tree-of-thought Reasoning for Answering Knowledge-intensive Complex Questions (arXiv, 2023-11-23)
Large language models (LLMs) are capable of answering knowledge-intensive complex questions with chain-of-thought (CoT) reasoning.
Recent works turn to retrieving external knowledge to augment CoT reasoning.
We propose a novel approach: Probabilistic Tree-of-thought Reasoning (ProbTree).
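
In outline, the tree-structured reasoning might look as below; decomposition and confidence estimation are stubs, and the real method derives confidences from LLM probabilities over closed-book and retrieval-based answers.

```python
# Schematic ProbTree-style recursion: decompose a question into a tree of
# sub-questions, answer the leaves, and resolve each parent from its children,
# keeping whichever candidate answer carries the higher confidence.
# All helpers are hypothetical stubs.

def decompose(question: str) -> list[str]:
    """Return sub-questions, or [] if `question` is atomic (stub)."""
    raise NotImplementedError

def answer_with_confidence(question: str, context: str = "") -> tuple[str, float]:
    """Return (answer, confidence score) from an LLM (stub)."""
    raise NotImplementedError

def probtree(question: str) -> tuple[str, float]:
    subs = decompose(question)
    if not subs:                                  # leaf: answer directly
        return answer_with_confidence(question)
    child_answers = [probtree(q) for q in subs]
    context = "\n".join(f"{q}: {a}" for q, (a, _) in zip(subs, child_answers))
    composed = answer_with_confidence(question, context)  # child-informed
    direct = answer_with_confidence(question)              # closed-book
    return max(composed, direct, key=lambda pair: pair[1])
```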
- Graph Elicitation for Guiding Multi-Step Reasoning in Large Language Models (arXiv, 2023-11-16)
Chain-of-Thought prompting along with sub-question generation and answering has enhanced multi-step reasoning capabilities.
We propose GE-Reasoning, a method that directs Large Language Models to generate proper sub-questions and corresponding answers.
Our approach outperforms previous CoT prompting methods and their variants on multi-hop question answering benchmark datasets.
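
Read as pseudocode, the sub-question pipeline could be as simple as the following sketch; both helpers are hypothetical stubs, and GE-Reasoning elicits the sub-questions as a graph rather than the flat list used here.

```python
# Outline of a sub-question chain: elicit sub-questions, answer them in
# order, and let earlier answers inform later ones before composing the
# final answer. Helpers are stubs.

def elicit_subquestions(question: str) -> list[str]:
    """Prompt an LLM to decompose `question` into sub-questions (stub)."""
    raise NotImplementedError

def answer(question: str, notes: list[str]) -> str:
    """Prompt an LLM with `question` plus accumulated Q->A notes (stub)."""
    raise NotImplementedError

def ge_style_reasoning(question: str) -> str:
    notes: list[str] = []
    for sub in elicit_subquestions(question):
        notes.append(f"{sub} -> {answer(sub, notes)}")
    return answer(question, notes)
```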
- SEMQA: Semi-Extractive Multi-Source Question Answering (arXiv, 2023-11-08)
We introduce a new QA task for answering multi-answer questions by summarizing multiple diverse sources in a semi-extractive fashion.
We create the first dataset of this kind, QuoteSum, with human-written semi-extractive answers to natural and generated questions.
- Self-Prompting Large Language Models for Zero-Shot Open-Domain QA (arXiv, 2022-12-16)
Open-Domain Question Answering (ODQA) aims to answer questions without explicitly providing background documents.
This task becomes notably challenging in a zero-shot setting where no data is available to train tailored retrieval-reader models.
We propose a Self-Prompting framework to explicitly utilize the massive knowledge encoded in the parameters of Large Language Models.
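
A minimal sketch of the self-prompting recipe as summarized, with a hypothetical llm() stub: the model first synthesizes its own demonstrations from parametric knowledge, then answers with them in context.

```python
# Sketch of the self-prompting idea: have the LLM synthesize its own pseudo
# passages and QA pairs, then reuse them as in-context demonstrations when
# answering a real question. `llm` is a hypothetical stub.

def llm(prompt: str) -> str:
    raise NotImplementedError

def build_demos(topics: list[str]) -> list[str]:
    demos = []
    for topic in topics:
        passage = llm(f"Write a short factual passage about {topic}.")
        qa = llm(f"Passage: {passage}\nWrite one question about this passage "
                 f"and its short answer, as 'Q: ... A: ...'.")
        demos.append(f"Passage: {passage}\n{qa}")
    return demos

def self_prompting_answer(question: str, demos: list[str]) -> str:
    prompt = "\n\n".join(demos) + f"\n\nQ: {question}\nA:"
    return llm(prompt)
```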
- Calculating Question Similarity is Enough: A New Method for KBQA Tasks (arXiv, 2021-11-15)
This paper proposes a Corpus Generation - Retrieve Method (CGRM) built on a Pre-trained Language Model (PLM) and a Knowledge Graph (KG).
First, based on the mT5 model, we design two new pre-training tasks, knowledge-masked language modeling and paragraph-based question generation, yielding the kT5 model.
Second, after preprocessing the knowledge-graph triples with a series of rules, kT5 generates natural-language QA pairs from the processed triples.
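
Taking the title literally, inference then reduces to nearest-question lookup over the generated QA pairs. A toy, runnable version, with bag-of-words cosine standing in for whatever similarity model the paper actually uses:

```python
# Toy illustration of answering by question similarity alone: compare the
# incoming question against every generated question and return the stored
# answer of the nearest neighbour. The bag-of-words cosine is a placeholder.
from collections import Counter
from math import sqrt

def cosine(a: str, b: str) -> float:
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (sqrt(sum(v * v for v in va.values()))
            * sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def kbqa_by_similarity(question: str, qa_pairs: list[tuple[str, str]]) -> str:
    return max(qa_pairs, key=lambda qa: cosine(question, qa[0]))[1]

# Usage: qa_pairs would be the QA pairs kT5 generated from KG triples, e.g.
# kbqa_by_similarity("Who wrote Hamlet?",
#                    [("Who is the author of Hamlet?", "Shakespeare")])
```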
- Open Question Answering over Tables and Text (arXiv, 2020-10-20)
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question.
Most open QA systems have considered only retrieving information from unstructured text.
We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.