End-to-End Beam Retrieval for Multi-Hop Question Answering
- URL: http://arxiv.org/abs/2308.08973v2
- Date: Mon, 1 Apr 2024 08:30:38 GMT
- Title: End-to-End Beam Retrieval for Multi-Hop Question Answering
- Authors: Jiahao Zhang, Haiyang Zhang, Dongmei Zhang, Yong Liu, Shen Huang
- Abstract summary: Multi-hop question answering involves finding multiple relevant passages and step-by-step reasoning to answer complex questions.
Previous retrievers were customized for two-hop questions, and most of them were trained separately across different hops.
We introduce Beam Retrieval, an end-to-end beam retrieval framework for multi-hop QA.
- Score: 37.13580394608824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-hop question answering (QA) involves finding multiple relevant passages and step-by-step reasoning to answer complex questions, indicating a retrieve-and-read paradigm. However, previous retrievers were customized for two-hop questions, and most of them were trained separately across different hops, resulting in a lack of supervision over the entire multi-hop retrieval process and leading to poor performance in complicated scenarios beyond two hops. In this work, we introduce Beam Retrieval, an end-to-end beam retrieval framework for multi-hop QA. This approach models the multi-hop retrieval process in an end-to-end manner by jointly optimizing an encoder and two classification heads across all hops. Moreover, Beam Retrieval maintains multiple partial hypotheses of relevant passages at each step, expanding the search space and reducing the risk of missing relevant passages. To establish a complete QA system, we incorporate a supervised reader or a large language model (LLM). Experimental results demonstrate that Beam Retrieval achieves a nearly 50% improvement compared with baselines on challenging MuSiQue-Ans, and it also surpasses all previous retrievers on HotpotQA and achieves 99.9% precision on 2WikiMultiHopQA. Providing high-quality context, Beam Retrieval helps our supervised reader achieve new state-of-the-art performance and substantially improves the few-shot QA performance of LLMs.
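As a rough illustration of the retrieval procedure described in the abstract, the sketch below keeps the top-B partial passage chains ("hypotheses") at each hop, extends every chain with each unused passage, and prunes back to the beam width. The `score_fn` interface and the toy word-overlap scorer are assumptions made for this sketch; in the paper the hop scores come from a jointly trained encoder with classification heads, not from this stand-in.

```python
# Minimal sketch of beam search over passage chains, in the spirit of Beam Retrieval.
# score_fn is a hypothetical stand-in for the paper's encoder + classification heads.
from typing import Callable, List, Sequence, Tuple

Chain = Tuple[int, ...]  # indices of passages selected so far


def beam_retrieve(
    question: str,
    passages: Sequence[str],
    score_fn: Callable[[str, Chain, int], float],  # assumed interface, not the paper's API
    num_hops: int = 2,
    beam_size: int = 4,
) -> List[Tuple[Chain, float]]:
    """Keep the `beam_size` best partial passage chains after every hop."""
    beam: List[Tuple[Chain, float]] = [((), 0.0)]  # (chain, accumulated score)
    for _ in range(num_hops):
        candidates: List[Tuple[Chain, float]] = []
        for chain, chain_score in beam:
            for idx in range(len(passages)):
                if idx in chain:
                    continue  # each passage appears at most once per chain
                candidates.append((chain + (idx,), chain_score + score_fn(question, chain, idx)))
        # Expand the search space, then prune back to the beam width.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beam = candidates[:beam_size]
    return beam


if __name__ == "__main__":
    # Toy scorer: count question words shared with the candidate passage (illustration only).
    docs = ["beam search expands hypotheses", "unrelated text", "multi-hop question answering"]

    def toy_score(q: str, chain: Chain, idx: int) -> float:
        return float(len(set(q.lower().split()) & set(docs[idx].lower().split())))

    print(beam_retrieve("multi-hop question answering with beam search", docs, toy_score))
```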
Related papers
- Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach [6.549143816134531]
We propose a novel iterative RAG method called ReSP, equipped with a dual-function summarizer.
Experimental results on the multi-hop question-answering HotpotQA and 2WikiMultihopQA demonstrate that our method significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2024-07-18T02:19:00Z)
- Performance Prediction for Multi-hop Questions [7.388002745070808]
We propose multHP, a novel pre-retrieval method for predicting the performance of open-domain multi-hop questions.
Our evaluation shows that the proposed model is a strong predictor of performance, outperforming traditional single-hop QPP models.
arXiv Detail & Related papers (2023-08-12T01:34:41Z)
- Rethinking Label Smoothing on Multi-hop Question Answering [87.68071401870283]
Multi-Hop Question Answering (MHQA) is a significant area in question answering.
In this work, we analyze the primary factors limiting the performance of multi-hop reasoning.
We propose a novel label smoothing technique, F1 Smoothing, which incorporates uncertainty into the learning process.
arXiv Detail & Related papers (2022-12-19T14:48:08Z)
- Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering [71.49131159045811]
Multi-hop reasoning requires aggregating multiple documents to answer a complex question.
Existing methods usually decompose the multi-hop question into simpler single-hop questions.
We propose an interpretable stepwise reasoning framework to incorporate both single-hop supporting sentence identification and single-hop question generation.
arXiv Detail & Related papers (2022-08-22T13:24:25Z)
- From Easy to Hard: Two-stage Selector and Reader for Multi-hop Question Answering [12.072618400000763]
Multi-hop question answering (QA) is a challenging task requiring QA systems to perform complex reasoning over multiple documents.
We propose a novel framework, From Easy to Hard (FE2H), to remove distracting information and obtain better contextual representations.
FE2H divides both the document selector and reader into two stages in an easy-to-hard manner.
arXiv Detail & Related papers (2022-05-24T02:33:58Z)
- Modeling Multi-hop Question Answering as Single Sequence Prediction [88.72621430714985]
We propose a simple generative approach (PathFid) that extends the task beyond just answer generation.
PathFid explicitly models the reasoning process to resolve the answer for multi-hop questions.
Our experiments demonstrate that PathFid leads to strong performance gains on two multi-hop QA datasets.
arXiv Detail & Related papers (2022-05-18T21:57:59Z)
- Decomposing Complex Questions Makes Multi-Hop QA Easier and More Interpretable [25.676852169835833]
Multi-hop QA requires a machine to answer complex questions by finding multiple clues and reasoning over them.
We propose Relation Extractor-Reader and Comparator (RERC), a three-stage framework based on complex question decomposition.
On the 2WikiMultiHopQA dataset, our RERC model achieves state-of-the-art performance, with a winning joint F1 score of 53.58 on the leaderboard.
arXiv Detail & Related papers (2021-10-26T08:10:35Z)
- Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval [117.07047313964773]
We propose a simple and efficient multi-hop dense retrieval approach for answering complex open-domain questions.
Our method does not require access to any corpus-specific information, such as inter-document hyperlinks or human-annotated entity markers.
Our system also yields a much better efficiency-accuracy trade-off, matching the best published accuracy on HotpotQA while being 10 times faster at inference time.
arXiv Detail & Related papers (2020-09-27T06:12:29Z)
- Answering Any-hop Open-domain Questions with Iterative Document Reranking [62.76025579681472]
We propose a unified QA framework to answer any-hop open-domain questions.
Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.