Related papers: TreeHop: Generate and Filter Next Query Embeddings Efficiently for Multi-hop Question Answering

URL: http://arxiv.org/abs/2504.20114v2
Date: Wed, 30 Apr 2025 13:15:49 GMT
Title: TreeHop: Generate and Filter Next Query Embeddings Efficiently for Multi-hop Question Answering
Authors: Zhonghao Li, Kunpeng Zhang, Jinghuai Ou, Shuliang Liu, Xuming Hu
Abstract summary: TreeHop is an embedding-level framework for multi-hop question answering. TreeHop dynamically updates query embeddings by fusing semantic information from prior queries. TreeHop is a faster and more cost-effective solution for deployment in a range of knowledge-intensive applications.
Score: 27.37434534716611
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Retrieval-augmented generation (RAG) systems face significant challenges in multi-hop question answering (MHQA), where complex queries require synthesizing information across multiple document chunks. Existing approaches typically rely on iterative LLM-based query rewriting and routing, resulting in high computational costs due to repeated LLM invocations and multi-stage processes. To address these limitations, we propose TreeHop, an embedding-level framework without the need for LLMs in query refinement. TreeHop dynamically updates query embeddings by fusing semantic information from prior queries and retrieved documents, enabling iterative retrieval through embedding-space operations alone. This method replaces the traditional "Retrieve-Rewrite-Vectorize-Retrieve" cycle with a streamlined "Retrieve-Embed-Retrieve" loop, significantly reducing computational overhead. Moreover, a rule-based stop criterion is introduced to further prune redundant retrievals, balancing efficiency and recall rate. Experimental results show that TreeHop rivals advanced RAG methods across three open-domain MHQA datasets, achieving comparable performance with only 5% to 0.4% of the model parameter size and reducing the query latency by approximately 99% compared to concurrent approaches. This makes TreeHop a faster and more cost-effective solution for deployment in a range of knowledge-intensive applications. For reproducibility purposes, code and data are available here: https://github.com/allen-li1231/TreeHop-RAG.
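The "Retrieve-Embed-Retrieve" loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the update function `toy_update` (the part of the retrieved chunk not already covered by the query), the two pruning rules, and all names and toy vectors below are assumptions chosen for illustration. TreeHop itself uses a trained update model and its own stop criterion, for which the linked repository is the reference.

```python
import math
from typing import Callable, List, Sequence

Vector = List[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query: Vector, corpus: Sequence[Vector], top_k: int) -> List[int]:
    """Return indices of the top_k corpus vectors most similar to the query."""
    ranked = sorted(range(len(corpus)),
                    key=lambda i: cosine(query, corpus[i]), reverse=True)
    return ranked[:top_k]


def toy_update(query: Vector, chunk: Vector) -> Vector:
    """Hypothetical stand-in for a learned update: keep the part of the
    chunk that is orthogonal to the query (its 'new' information)."""
    qq = sum(q * q for q in query)
    scale = sum(c * q for c, q in zip(chunk, query)) / qq if qq else 0.0
    return [c - scale * q for c, q in zip(chunk, query)]


def multi_hop_retrieve(query: Vector, corpus: Sequence[Vector],
                       update: Callable[[Vector, Vector], Vector] = toy_update,
                       hops: int = 2, top_k: int = 1,
                       eps: float = 1e-9) -> List[int]:
    """Retrieve, compute the next query embedding, retrieve again.
    No text is generated or re-encoded between hops."""
    frontier = [query]
    seen: List[int] = []
    for _ in range(hops):
        next_frontier: List[Vector] = []
        for q in frontier:
            for idx in retrieve(q, corpus, top_k):
                if idx in seen:
                    continue  # illustrative rule 1: chunk already retrieved
                seen.append(idx)
                new_q = update(q, corpus[idx])
                if math.sqrt(sum(x * x for x in new_q)) < eps:
                    continue  # illustrative rule 2: chunk adds nothing new
                next_frontier.append(new_q)
        if not next_frontier:
            break  # every branch was pruned
        frontier = next_frontier
    return seen


# Toy 3-dimensional "embeddings": chunk 0 bridges the query topic to a
# second topic, and chunk 1 is only reachable through that second topic.
corpus = [[1.0, 1.0, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]]
query = [1.0, 0.0, 0.0]
print(retrieve(query, corpus, 1))                           # [0]
print(multi_hop_retrieve(query, corpus, hops=2, top_k=1))   # [0, 1]
```

In this toy setup a single retrieval step returns only chunk 0, because chunk 1 has zero similarity to the original query; the second hop reaches it because the updated query points at the information chunk 0 introduced. With `top_k` greater than 1 the frontier branches at every hop, which is where pruning rules of this kind keep the number of retrievals bounded.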

Related papers

The benefits of query-based KGQA systems for complex and temporal questions in LLM era [55.20230501807337]
Large language models excel in question-answering (QA) yet still struggle with multi-hop reasoning and temporal questions. Query-based knowledge graph QA (KGQA) offers a modular alternative by generating executable queries instead of direct answers. We explore a multi-stage query-based framework for WikiData QA, proposing a multi-stage approach that enhances performance on challenging multi-hop and temporal benchmarks.
arXiv Detail & Related papers (2025-07-16T06:41:03Z)
Effective Instruction Parsing Plugin for Complex Logical Query Answering on Knowledge Graphs [51.33342412699939]
Knowledge Graph Query Embedding (KGQE) aims to embed First-Order Logic (FOL) queries in a low-dimensional KG space for complex reasoning over incomplete KGs. Recent studies integrate various external information (such as entity types and relation context) to better capture the logical semantics of FOL queries. We propose an effective Query Instruction Parsing Plugin (QIPP) that captures latent query patterns from code-like query instructions.
arXiv Detail & Related papers (2024-10-27T03:18:52Z)
Enhancing Long Context Performance in LLMs Through Inner Loop Query Mechanism [2.919891871101241]
Transformers have a quadratic scaling of computational complexity with input size. Retrieval-augmented generation (RAG) can better handle longer contexts by using a retrieval system. We introduce a novel approach, Inner Loop Memory Augmented Tree Retrieval (ILM-TR).
arXiv Detail & Related papers (2024-10-11T19:49:05Z)
EfficientRAG: Efficient Retriever for Multi-Hop Question Answering [52.64500643247252]
We introduce EfficientRAG, an efficient retriever for multi-hop question answering. Experimental results demonstrate that EfficientRAG surpasses existing RAG methods on three open-domain multi-hop question-answering datasets.
arXiv Detail & Related papers (2024-08-08T06:57:49Z)
A Surprisingly Simple yet Effective Multi-Query Rewriting Method for Conversational Passage Retrieval [14.389703823471574]
We propose the use of a neural query rewriter to generate multiple queries and show how to integrate those queries in the passage retrieval pipeline efficiently. The main strength of our approach lies in its simplicity: it leverages how the beam search algorithm works and can produce multiple query rewrites at no additional cost.
arXiv Detail & Related papers (2024-06-27T07:43:03Z)
Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers [66.55612528039894]
AdaQR is a framework for training query rewriting models with limited rewrite annotations from seed datasets and no passage labels at all. A novel approach is proposed to assess the retriever's preference for these candidates by the probability of answers conditioned on the conversational query.
arXiv Detail & Related papers (2024-06-16T16:09:05Z)
Tree of Reviews: A Tree-based Dynamic Iterative Retrieval Framework for Multi-hop Question Answering [0.18849131083278733]
We propose a dynamic retrieval framework called Tree of Reviews (ToR) for multi-hop question answering. ToR achieves state-of-the-art performance in both retrieval and response generation.
arXiv Detail & Related papers (2024-04-22T09:25:05Z)
LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency [65.01402723259098]
We propose a novel method of query rewrite named LLM-R2, adopting a large language model (LLM) to propose possible rewrite rules for a database rewrite system. Experimental results have shown that our method can significantly improve the query execution efficiency and outperform the baseline methods.
arXiv Detail & Related papers (2024-04-19T13:17:07Z)
Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversational Search [25.16282868262589]
RetPO is designed to optimize a language model (LM) for reformulating search queries in line with the preferences of the target retrieval systems. We construct a large-scale dataset called Retrievers' Feedback on over 410K query rewrites across 12K conversations. The resulting model achieves state-of-the-art performance on two recent conversational search benchmarks.
arXiv Detail & Related papers (2024-02-19T04:41:31Z)
Allies: Prompting Large Language Model with Beam Search [107.38790111856761]
In this work, we propose a novel method called ALLIES. Given an input query, ALLIES leverages LLMs to iteratively generate new queries related to the original query. By iteratively refining and expanding the scope of the original query, ALLIES captures and utilizes hidden knowledge that may not be directly obtainable through retrieval.
arXiv Detail & Related papers (2023-05-24T06:16:44Z)
Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) act as powerful, black-box readers in the retrieve-then-read pipeline. This work introduces a new framework for retrieval-augmented LLMs, Rewrite-Retrieve-Read, in place of the previous retrieve-then-read pipeline.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
Answering Any-hop Open-domain Questions with Iterative Document Reranking [62.76025579681472]
We propose a unified QA framework to answer any-hop open-domain questions. Our method consistently achieves performance comparable to or better than the state-of-the-art on both single-hop and multi-hop open-domain QA datasets.
arXiv Detail & Related papers (2020-09-16T04:31:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.