Related papers: ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision

URL: http://arxiv.org/abs/2505.21250v1
Date: Tue, 27 May 2025 14:28:24 GMT
Title: ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision
Authors: Dosung Lee, Wonjun Oh, Boyoung Kim, Minyoung Kim, Joonsuk Park, Paul Hongsuck Seo
Abstract summary: Multi-hop question answering involves reasoning across multiple documents to answer complex questions. Dense retrievers typically outperform sparse methods like BM25 by leveraging semantic embeddings. ReSCORE is a novel method for training dense retrievers for MHQA without labeled documents.
Score: 23.80886911344813
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Multi-hop question answering (MHQA) involves reasoning across multiple documents to answer complex questions. Dense retrievers typically outperform sparse methods like BM25 by leveraging semantic embeddings; however, they require labeled query-document pairs for fine-tuning. This poses a significant challenge in MHQA due to the high variability of queries (reformulated questions) throughout the reasoning steps. To overcome this limitation, we introduce Retriever Supervision with Consistency and Relevance (ReSCORE), a novel method for training dense retrievers for MHQA without labeled documents. ReSCORE leverages large language models to capture each document's relevance to the question and consistency with the correct answer, and uses them to train a retriever within an iterative question-answering framework. Experiments on three MHQA benchmarks demonstrate the effectiveness of ReSCORE, with significant improvements in retrieval and, in turn, state-of-the-art MHQA performance. Our implementation is available at: https://leeds1219.github.io/ReSCORE.
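
As a rough illustration of the idea in the abstract (not the authors' implementation), the sketch below assumes an LLM supplies two log-probabilities per candidate document: one for relevance to the question and one for consistency with the answer. It combines them into a soft pseudo-label distribution and measures how far a retriever's score distribution is from it. The function names, the sample numbers, and the choice of KL divergence as the training signal are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pseudo_label_distribution(relevance_logprobs, consistency_logprobs):
    """Hypothetical helper: add the two log-probabilities per document
    (i.e. multiply the probabilities) and normalize over the candidates."""
    combined = [r + c for r, c in zip(relevance_logprobs, consistency_logprobs)]
    return softmax(combined)


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) over the same candidate documents."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )


# Made-up log-probabilities for three candidate documents (assumption).
relevance = [-2.0, -5.0, -3.0]
consistency = [-1.0, -4.0, -5.0]
# Made-up retriever similarity scores for the same documents (assumption).
retriever_scores = [0.2, 0.9, 0.1]

target = pseudo_label_distribution(relevance, consistency)
predicted = softmax(retriever_scores)
loss = kl_divergence(target, predicted)
print(target, predicted, loss)
```

In a real training loop the loss would be computed with a differentiable framework so that gradients reach the retriever; plain Python is used here only to keep the arithmetic visible.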

Related papers

Question Decomposition for Retrieval-Augmented Generation [2.6409776648054764]
We propose a RAG pipeline that incorporates question decomposition into sub-questions. We show that question decomposition effectively assembles complementary documents, while reranking reduces noise. Although reranking itself is standard, we show that pairing an off-the-shelf cross-encoder reranker with LLM-driven question decomposition bridges the retrieval gap on multi-hop questions.
arXiv Detail & Related papers (2025-07-01T01:01:54Z)
Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search [65.53881294642451]
We propose the Deliberate Thinking based Dense Retriever (DEBATER), which enhances recent dense retrievers by enabling them to learn more effective document representations through a step-by-step thinking process. Experimental results show that DEBATER significantly outperforms existing methods across several retrieval benchmarks.
arXiv Detail & Related papers (2025-02-18T15:56:34Z)
EfficientRAG: Efficient Retriever for Multi-Hop Question Answering [52.64500643247252]
We introduce EfficientRAG, an efficient retriever for multi-hop question answering. Experimental results demonstrate that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question-answering datasets.
arXiv Detail & Related papers (2024-08-08T06:57:49Z)
Conversational Query Reformulation with the Guidance of Retrieved Documents [4.438698005789677]
We introduce GuideCQR, a framework that refines queries by leveraging key information from the initially retrieved documents. Our proposed method achieves state-of-the-art performance across multiple datasets, outperforming previous CQR methods.
arXiv Detail & Related papers (2024-07-17T07:39:16Z)
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval [54.54576644403115]
We introduce BRIGHT, the first text retrieval benchmark that requires intensive reasoning to retrieve relevant documents. Our dataset consists of 1,384 real-world queries spanning diverse domains, such as economics, psychology, mathematics, and coding. We show that incorporating explicit reasoning about the query improves retrieval performance by up to 12.2 points.
arXiv Detail & Related papers (2024-07-16T17:58:27Z)
Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversation [23.74712435991676]
RetPO is designed to optimize a language model for reformulating search queries in line with the preferences of the target retrieval systems. We construct a large-scale dataset called Retrievers' Feedback on over 410K query rewrites across 12K conversations. Our resulting model demonstrates superiority on two benchmarks, surpassing the previous state-of-the-art performance of rewrite-then-retrieve approaches.
arXiv Detail & Related papers (2024-02-19T04:41:31Z)
AugTriever: Unsupervised Dense Retrieval and Domain Adaptation by Scalable Data Augmentation [44.93777271276723]
We propose two approaches that enable annotation-free and scalable training by creating pseudo query-document pairs. The query extraction method involves selecting salient spans from the original document to generate pseudo queries. The transferred query generation method utilizes generation models trained for other NLP tasks, such as summarization, to produce pseudo queries.
arXiv Detail & Related papers (2022-12-17T10:43:25Z)
UniKGQA: Unified Retrieval and Reasoning for Solving Multi-hop Question Answering Over Knowledge Graph [89.98762327725112]
Multi-hop Question Answering over Knowledge Graph (KGQA) aims to find the answer entities that are multiple hops away from the topic entities mentioned in a natural language question. We propose UniKGQA, a novel approach for the multi-hop KGQA task that unifies retrieval and reasoning in both model architecture and parameter learning.
arXiv Detail & Related papers (2022-12-02T04:08:09Z)
CONQRR: Conversational Query Rewriting for Retrieval with Reinforcement Learning [16.470428531658232]
We develop a query rewriting model CONQRR that rewrites a conversational question in context into a standalone question. We show that CONQRR achieves state-of-the-art results on a recent open-domain CQA dataset.
arXiv Detail & Related papers (2021-12-16T01:40:30Z)
Weakly Supervised Pre-Training for Multi-Hop Retriever [23.79574380039197]
We propose a new method for weakly supervised multi-hop retriever pre-training without human efforts. Our method includes 1) a pre-training task for generating vector representations of complex questions, 2) a scalable data generation method that produces the nested structure of question and sub-question as weak supervision for pre-training, and 3) a pre-training model structure based on dense encoders.
arXiv Detail & Related papers (2021-06-18T08:06:02Z)
Open Question Answering over Tables and Text [55.8412170633547]
In open question answering (QA), the answer to a question is produced by retrieving and then analyzing documents that might contain answers to the question. Most open QA systems have considered only retrieving information from unstructured text. We present a new large-scale dataset Open Table-and-Text Question Answering (OTT-QA) to evaluate performance on this task.
arXiv Detail & Related papers (2020-10-20T16:48:14Z)
Answering Any-hop Open-domain Questions with Iterative Document Reranking [62.76025579681472]
We propose a unified QA framework to answer any-hop open-domain questions. Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.