Furthest Reasoning with Plan Assessment: Stable Reasoning Path with
Retrieval-Augmented Large Language Models
- URL: http://arxiv.org/abs/2309.12767v1
- Date: Fri, 22 Sep 2023 10:15:13 GMT
- Title: Furthest Reasoning with Plan Assessment: Stable Reasoning Path with
Retrieval-Augmented Large Language Models
- Authors: Yin Zhu, Zhiling Luo, Gong Cheng
- Abstract summary: Multi-Hop Question Answering (MHQA) stands as a widely
discussed category of question answering. Existing methods employ Large
Language Models (LLMs) to generate reasoning paths and plans. We propose a
novel pipeline for MHQA called Furthest-Reasoning-with-Plan-Assessment
(FuRePA).
- Score: 10.04323204974924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs), acting as powerful reasoners and
generators, exhibit extraordinary performance across various natural language
tasks, such as question answering (QA). Among these tasks, Multi-Hop Question
Answering (MHQA) stands as a widely discussed category, necessitating seamless
integration between LLMs and the retrieval of external knowledge. Existing
methods employ the LLM to generate reasoning paths and plans and use an
Information Retriever (IR) to iteratively retrieve related knowledge, but
these approaches have inherent flaws. On one hand, the IR is hindered by the
low quality of the queries generated by the LLM; on the other hand, the LLM is
easily misled by the irrelevant knowledge returned by the IR. These
inaccuracies, accumulated through the iterative interaction between IR and
LLM, severely degrade end-to-end effectiveness. To overcome these barriers,
in this paper we propose a novel pipeline for MHQA called
Furthest-Reasoning-with-Plan-Assessment (FuRePA), comprising an improved
framework (Furthest Reasoning) and an attached module (Plan Assessor). 1)
Furthest Reasoning operates by masking the previous reasoning path and
generated queries from the LLM, encouraging the LLM to generate its chain of
thought from scratch in each iteration. This approach enables the LLM to break
free of the shackles built by previous misleading thoughts and queries (if
any). 2) The Plan Assessor is a trained evaluator that selects an appropriate
plan from a group of candidate plans proposed by the LLM. Our method is
evaluated on three highly recognized public multi-hop question answering
datasets and outperforms the state of the art on most metrics (achieving a
10%-12% improvement in answer accuracy).
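
Read operationally, the abstract describes a loop: at each hop the LLM reasons
from scratch (its earlier chain of thought and queries are masked), proposes a
group of candidate plans, the trained Plan Assessor selects one, and the IR
retrieves with the selected query. Below is a minimal Python sketch of that
loop; the function names, prompt, stub scoring, and the "ANSWER:" termination
convention are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch of the FuRePA loop described in the abstract.
# `llm_propose_plans`, `plan_assessor`, and `retrieve` are hypothetical
# stand-ins for the paper's LLM prompting, trained Plan Assessor, and IR.

def llm_propose_plans(question: str, knowledge: list[str], n: int = 3) -> list[str]:
    """Hypothetical: prompt the LLM to reason from scratch and emit n candidate
    plans. Furthest Reasoning masks previous reasoning paths and queries, so
    the prompt contains only the question and the retrieved knowledge."""
    prompt = f"Question: {question}\nKnowledge: {knowledge}\nThink step by step."
    return [f"plan {i}: next query for {question!r}" for i in range(n)]  # stub

def plan_assessor(question: str, knowledge: list[str], plans: list[str]) -> str:
    """Hypothetical trained evaluator: score candidates, return the best plan."""
    scores = [float(len(p)) for p in plans]  # stub; the paper trains a model
    return plans[max(range(len(plans)), key=scores.__getitem__)]

def retrieve(query: str, k: int = 3) -> list[str]:
    """Hypothetical IR call returning top-k passages for the query."""
    return [f"passage about {query!r}"][:k]  # stub

def furepa(question: str, max_hops: int = 4) -> str:
    knowledge: list[str] = []
    for _ in range(max_hops):
        # Reason from scratch each hop: no prior chain of thought is passed in.
        plans = llm_propose_plans(question, knowledge)
        best = plan_assessor(question, knowledge, plans)
        if best.startswith("ANSWER:"):  # stub termination convention
            return best
        knowledge.extend(retrieve(best))
    return "ANSWER: (fallback after max hops)"

print(furepa("Who directed the film in which X starred?"))
```

The key contrast with standard iterative RAG is that `llm_propose_plans` never
sees prior plans or reasoning, only the accumulated retrieved knowledge.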
Related papers
- Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning [53.6472920229013] (2024-06-20)
Large Language Models (LLMs) have demonstrated impressive capabilities in many natural language tasks.
LLMs are prone to producing errors, hallucinations, and inconsistent statements when performing multi-step reasoning.
We introduce Q*, a framework for guiding the decoding process of LLMs with deliberative planning.
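
The summary above describes search-guided decoding; as a hedged illustration
(not the paper's algorithm), the sketch below runs best-first search over
partial reasoning paths, ranking them by path cost plus a value estimate.
`expand`, `q_value`, and `is_terminal` are hypothetical stand-ins for the
LLM's step proposals, a trained value model, and a stopping test.

```python
# A hedged sketch of A*-style deliberative decoding in the spirit of Q*:
# best-first search over partial reasoning paths.
import heapq

def expand(state: str) -> list[str]:
    """Hypothetical: ask the LLM for candidate next reasoning steps."""
    return [state + f" -> step{i}" for i in range(2)]  # stub

def q_value(state: str) -> float:
    """Hypothetical value of a partial path; Q* would use a trained model."""
    return -0.5 * state.count("step")  # stub

def is_terminal(state: str) -> bool:
    return state.count("->") >= 3  # stub: "long enough" counts as solved

def qstar_search(question: str) -> str:
    frontier = [(-q_value(question), question)]  # priority = g(s) - Q(s)
    while frontier:
        _, state = heapq.heappop(frontier)
        if is_terminal(state):
            return state
        for nxt in expand(state):
            g = nxt.count("->")  # accumulated path cost (stub)
            heapq.heappush(frontier, (g - q_value(nxt), nxt))
    return question

print(qstar_search("Q"))
```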
- From Words to Actions: Unveiling the Theoretical Underpinnings of LLM-Driven Autonomous Systems [59.40480894948944] (2024-05-30)
Large language model (LLM) empowered agents are able to solve decision-making problems in the physical world.
Under this model, the LLM Planner navigates a partially observable Markov decision process (POMDP) by iteratively generating language-based subgoals via prompting.
We prove that the pretrained LLM Planner effectively performs Bayesian aggregated imitation learning (BAIL) through in-context learning.
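
As a reading aid for the setup described above (the paper's contribution is
theoretical, not code), here is a toy planner-actor loop over a partially
observable environment. `llm_planner` and `actor` are hypothetical stubs; the
paper's claim concerns what in-context learning makes the planner equivalent to.

```python
# Toy planner-actor loop: an LLM "Planner" sees only a partial observation
# history and emits language subgoals for a low-level "Actor" to execute.
import random

def llm_planner(history: list[str]) -> str:
    """Hypothetical LLM call. Per the paper's analysis, prompting with the
    history makes the pretrained planner act like Bayesian aggregated
    imitation learning (BAIL) over latent tasks."""
    return f"subgoal-{len(history)}"  # stub

def actor(subgoal: str) -> str:
    """Hypothetical low-level policy: executes a subgoal, returns an observation."""
    return f"outcome of {subgoal} ({random.randint(0, 9)})"  # stub

def rollout(steps: int = 4) -> list[str]:
    history = ["initial observation"]
    for _ in range(steps):
        goal = llm_planner(history)   # plan from partial observations only
        history.append(actor(goal))   # true environment state stays hidden
    return history

print(rollout())
```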
- Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration [70.09561665520043] (2024-05-23)
We propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.
We provide theoretical analysis by extending advantage-weighted regression in reinforcement learning to multi-agent systems.
Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate and significantly decreases the interaction steps of agents.
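
A minimal sketch of the refinement loop as summarized: a critic's advantage
estimate gates execution and is fed back to the LLM for plan revision, which
is what saves interaction steps. `advantage` and `llm_refine` below are
hypothetical stand-ins, not ReAd's actual models.

```python
# Hedged sketch of advantage-gated plan refinement in the spirit of ReAd.

def advantage(plan: list[str]) -> float:
    """Hypothetical critic (cf. advantage-weighted regression): estimates the
    plan's improvement over the current policy baseline."""
    return sum(a != "noop" for a in plan) - 2.0  # stub

def llm_refine(plan: list[str], score: float) -> list[str]:
    """Hypothetical: prompt the LLM with the plan and its advantage as feedback."""
    return plan + ["pick"]  # stub: the LLM swaps in a more useful action

def refine_until_positive(plan: list[str], max_rounds: int = 5) -> list[str]:
    for _ in range(max_rounds):
        score = advantage(plan)
        if score > 0:  # only execute plans the critic endorses
            return plan
        plan = llm_refine(plan, score)
    return plan

print(refine_until_positive(["noop", "noop"]))
```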
- Reasoning on Efficient Knowledge Paths: Knowledge Graph Guides Large Language Model for Domain Question Answering [18.94220625114711] (2024-04-16)
Large language models (LLMs) perform surprisingly well and outperform human experts on many tasks.
This paper integrates and optimizes a pipeline for selecting reasoning paths from a KG with an LLM.
We also propose a simple and effective subgraph retrieval method based on chain of thought (CoT) and PageRank.
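
The CoT-plus-PageRank step can be illustrated with personalized PageRank over
a toy graph using networkx: entities mentioned in a chain of thought seed the
personalization vector, and the top-ranked nodes form the retrieved subgraph.
The graph, seeds, and cutoff below are illustrative assumptions, not the
paper's setup.

```python
# Toy personalized-PageRank subgraph retrieval seeded by CoT entities.
import networkx as nx

kg = nx.DiGraph()
kg.add_edges_from([
    ("Paris", "France"), ("France", "Europe"),
    ("Paris", "Seine"), ("Berlin", "Germany"), ("Germany", "Europe"),
])

cot_entities = {"Paris"}  # entities a chain of thought mentioned (stub)
personalization = {n: (1.0 if n in cot_entities else 0.0) for n in kg}
scores = nx.pagerank(kg, alpha=0.85, personalization=personalization)

top = sorted(scores, key=scores.get, reverse=True)[:3]
subgraph = kg.subgraph(top)  # candidate reasoning subgraph to show the LLM
print(top, list(subgraph.edges))
```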
- SatLM: Satisfiability-Aided Language Models Using Declarative Prompting [68.40726892904286] (2023-05-16)
We propose a new satisfiability-aided language modeling (SatLM) approach for improving the reasoning capabilities of large language models (LLMs).
We use an LLM to generate a declarative task specification rather than an imperative program and leverage an off-the-shelf automated theorem prover to derive the final answer.
We evaluate SatLM on 8 different datasets and show that it consistently outperforms program-aided LMs in the imperative paradigm.
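
The division of labor is concrete enough to show with a toy example using the
Z3 solver: the LLM's job is only to emit a declarative specification, and the
solver performs the deduction. The word problem and constraints below stand in
for LLM output; they are not from the paper.

```python
# Toy SatLM-style pipeline: a declarative spec (stand-in for LLM output)
# is handed to an off-the-shelf solver, which derives the answer.
from z3 import Int, Solver, sat

# Pretend the LLM translated "Alice is 3 years older than Bob; their ages
# sum to 21; how old is Bob?" into constraints rather than a program:
alice, bob = Int("alice"), Int("bob")
s = Solver()
s.add(alice == bob + 3, alice + bob == 21, bob >= 0)

if s.check() == sat:  # the solver, not the LLM, does the reasoning
    print("Bob is", s.model()[bob])  # -> Bob is 9
```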
- Search-in-the-Chain: Interactively Enhancing Large Language Models with Search for Knowledge-intensive Tasks [121.74957524305283] (2023-04-28)
This paper proposes a novel framework named Search-in-the-Chain (SearChain) for the interaction between Information Retrieval (IR) and a Large Language Model (LLM).
Experiments show that SearChain outperforms state-of-the-art baselines on complex knowledge-intensive tasks.
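
The two-line summary leaves the interaction unspecified; one plausible reading
is a chain of query nodes proposed by the LLM and verified node-by-node by the
retriever. Everything below (the node format, `ir_verify`, the toy corpus) is
a hypothetical stand-in, not SearChain's actual protocol.

```python
# Hypothetical sketch of an IR-verified chain of queries.

def llm_chain_of_query(question: str) -> list[tuple[str, str]]:
    """Hypothetical: LLM decomposes the question into (query, answer) nodes."""
    return [("who directed X", "unknown"), ("when was the director born", "unknown")]

def ir_verify(query: str, answer: str) -> str:
    """Hypothetical retriever: corrects a node's answer when documents disagree."""
    corpus = {"who directed X": "Jane Doe"}
    return corpus.get(query, answer)

def searchain(question: str) -> list[tuple[str, str]]:
    chain = llm_chain_of_query(question)
    # Verification pass; in the full framework the corrected chain would be
    # fed back to the LLM for another round of reasoning.
    return [(q, ir_verify(q, a)) for q, a in chain]

print(searchain("When was the director of X born?"))
```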
- Rethinking with Retrieval: Faithful Large Language Model Inference [91.66406351103484] (2022-12-31)
We propose a novel post-processing approach, rethinking with retrieval (RR).
RR retrieves relevant external knowledge based on the reasoning steps obtained from the chain-of-thought prompting.
We evaluate the effectiveness of RR through extensive experiments with GPT-3 on three complex reasoning tasks.
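
A small sketch of the post-processing idea as summarized: each step of an
already-generated chain of thought becomes a retrieval query, and the answer
is kept only when its steps are supported by the evidence. The corpus,
token-overlap scoring, and threshold below are crude illustrative stand-ins,
not the paper's faithfulness procedure.

```python
# Toy "rethinking with retrieval": score each CoT step against retrieved text.

def retrieve(step: str) -> list[str]:
    """Hypothetical retriever over an external corpus."""
    corpus = ["the seine flows through paris", "paris is the capital of france"]
    return [d for d in corpus if any(w in d for w in step.lower().split())]

def support(step: str) -> float:
    """Crude faithfulness proxy: best token overlap with any retrieved doc."""
    words = set(step.lower().split())
    return max((len(words & set(d.split())) / len(words) for d in retrieve(step)),
               default=0.0)

cot_steps = ["Paris is the capital of France", "The Seine flows through Paris"]
scores = [support(s) for s in cot_steps]
faithful = all(s > 0.5 for s in scores)
print(scores, "keep answer" if faithful else "revise answer with evidence")
```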