ThinkQE: Query Expansion via an Evolving Thinking Process
- URL: http://arxiv.org/abs/2506.09260v1
- Date: Tue, 10 Jun 2025 21:41:01 GMT
- Title: ThinkQE: Query Expansion via an Evolving Thinking Process
- Authors: Yibin Lei, Tao Shen, Andrew Yates
- Abstract summary: ThinkQE is a test-time query expansion framework that encourages deeper and more comprehensive semantic exploration. We show ThinkQE consistently outperforms prior approaches, including training-intensive dense retrievers and rerankers.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective query expansion for web search benefits from promoting both exploration and result diversity to capture multiple interpretations and facets of a query. While recent LLM-based methods have improved retrieval performance and demonstrate strong domain generalization without additional training, they often generate narrowly focused expansions that overlook these desiderata. We propose ThinkQE, a test-time query expansion framework addressing this limitation through two key components: a thinking-based expansion process that encourages deeper and comprehensive semantic exploration, and a corpus-interaction strategy that iteratively refines expansions using retrieval feedback from the corpus. Experiments on diverse web search benchmarks (DL19, DL20, and BRIGHT) show ThinkQE consistently outperforms prior approaches, including training-intensive dense retrievers and rerankers.
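The two components described in the abstract, thinking-based expansion plus iterative corpus interaction, can be sketched as a simple test-time loop. The sketch below is one reading of the abstract, not the authors' implementation; `llm_think_expand` and `bm25_retrieve` are hypothetical stand-ins for a prompted LLM call and a real sparse retriever.

```python
# Minimal sketch of a ThinkQE-style expansion loop (illustrative only).

def llm_think_expand(query, feedback_docs):
    # Stand-in: a real system would prompt an LLM to "think" about the query
    # and the retrieval feedback, then emit expansion terms.
    terms = set(query.split())
    for doc in feedback_docs:
        terms.update(doc.split()[:2])  # borrow salient terms from feedback
    return " ".join(sorted(terms))

def bm25_retrieve(query, corpus, k=2):
    # Stand-in: score documents by simple term overlap instead of real BM25.
    q_terms = set(query.split())
    scored = sorted(corpus,
                    key=lambda d: len(q_terms & set(d.split())),
                    reverse=True)
    return scored[:k]

def thinkqe(query, corpus, rounds=2):
    expanded = query
    for _ in range(rounds):
        feedback = bm25_retrieve(expanded, corpus)    # corpus interaction
        expanded = llm_think_expand(query, feedback)  # thinking-based expansion
    return expanded

corpus = ["jaguar speed animal", "jaguar car engine", "python snake habitat"]
print(thinkqe("jaguar speed", corpus))
```

The key design point the abstract emphasizes is that expansion is refined at test time against the corpus itself, so each round's retrieval results steer the next round's expansion.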
Related papers
- Aligned Query Expansion: Efficient Query Expansion for Information Retrieval through LLM Alignment [4.21943400140261]
Aligned Query Expansion (AQE) is a novel approach to enhance query expansion for passage retrieval in open-domain question answering. We show that AQE outperforms baseline models for query expansion in both in-domain and out-of-domain settings.
arXiv Detail & Related papers (2025-07-15T07:11:29Z)
- Exp4Fuse: A Rank Fusion Framework for Enhanced Sparse Retrieval using Large Language Model-based Query Expansion [0.0]
Large Language Models (LLMs) have shown potential in generating hypothetical documents for query expansion. We introduce a novel fusion ranking framework, Exp4Fuse, which enhances the performance of sparse retrievers.
arXiv Detail & Related papers (2025-06-05T08:44:34Z)
- ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation [82.28147821286709]
We propose ClueAnchor, a novel framework for enhancing Retrieval-Augmented Generation (RAG). ClueAnchor extracts key clues from retrieved content and generates multiple reasoning paths based on different knowledge configurations. Experiments show that ClueAnchor significantly outperforms prior RAG baselines in reasoning completeness and robustness.
arXiv Detail & Related papers (2025-05-30T09:18:08Z) - Iterative Self-Incentivization Empowers Large Language Models as Agentic Searchers [74.17516978246152]
Large language models (LLMs) have been widely integrated into information retrieval to advance traditional techniques.<n>We propose EXSEARCH, an agentic search framework, where the LLM learns to retrieve useful information as the reasoning unfolds.<n>Experiments on four knowledge-intensive benchmarks show that EXSEARCH substantially outperforms baselines.
arXiv Detail & Related papers (2025-05-26T15:27:55Z) - LevelRAG: Enhancing Retrieval-Augmented Generation with Multi-hop Logic Planning over Rewriting Augmented Searchers [24.01783076521377]
Retrieval-Augmented Generation (RAG) is a crucial method for mitigating hallucinations in Large Language Models (LLMs)<n>Existing RAG methods typically employ query rewriting to clarify the user intent and manage multi-hop logic, while using hybrid retrieval to expand search scope.<n>We introduce a high-level searcher that decomposes complex queries into atomic queries, independent of any retriever-specific optimizations.<n>To harness the strengths of sparse retrievers for precise keyword retrieval, we have developed a new sparse searcher that employs Lucene syntax to enhance retrieval accuracy.
arXiv Detail & Related papers (2025-02-25T12:09:16Z) - Learning More Effective Representations for Dense Retrieval through Deliberate Thinking Before Search [65.53881294642451]
Deliberate Thinking based Dense Retriever (DEBATER)<n>DEBATER enhances recent dense retrievers by enabling them to learn more effective document representations through a step-by-step thinking process.<n> Experimental results show that DEBATER significantly outperforms existing methods across several retrieval benchmarks.
arXiv Detail & Related papers (2025-02-18T15:56:34Z) - QA-Expand: Multi-Question Answer Generation for Enhanced Query Expansion in Information Retrieval [12.095687580827065]
We introduce QA-Expand, a novel and effective framework for query expansion.<n>It first generates multiple relevant questions from the initial query and subsequently produces corresponding pseudo-answers as surrogate documents.<n>Extensive experiments on benchmarks such as BEIR and TREC demonstrate that QA-Expand enhances retrieval performance by up to 13% over state-of-the-art methods.
arXiv Detail & Related papers (2025-02-12T16:39:06Z) - Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach [56.610806615527885]
A key challenge in text-video retrieval (TVR) is the information asymmetry between video and text.<n>This paper introduces a data-centric framework to bridge this gap by enriching textual representations to better match the richness of video content.<n>We propose a query selection mechanism that identifies the most relevant and diverse queries, reducing computational cost while improving accuracy.
arXiv Detail & Related papers (2024-08-14T01:24:09Z) - IDEAL: Leveraging Infinite and Dynamic Characterizations of Large Language Models for Query-focused Summarization [59.06663981902496]
Query-focused summarization (QFS) aims to produce summaries that answer particular questions of interest, enabling greater user control and personalization.<n>We investigate two indispensable characteristics that the LLMs-based QFS models should be harnessed, Lengthy Document Summarization and Efficiently Fine-grained Query-LLM Alignment.<n>These innovations pave the way for broader application and accessibility in the field of QFS technology.
arXiv Detail & Related papers (2024-07-15T07:14:56Z) - Corpus-Steered Query Expansion with Large Language Models [35.64662397095323]
We introduce Corpus-Steered Query Expansion (CSQE) to promote the incorporation of knowledge embedded within the corpus.
CSQE utilizes the relevance assessing capability of LLMs to systematically identify pivotal sentences in the initially-retrieved documents.
Extensive experiments reveal that CSQE exhibits strong performance without necessitating any training, especially with queries for which LLMs lack knowledge.
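The mechanism this summary describes, selecting pivotal sentences from the initially retrieved documents and folding them into the query, can be sketched roughly as follows. This is an illustrative reading of the summary, not the authors' code; the term-overlap score is a hypothetical stand-in for the LLM's relevance assessment.

```python
# Rough sketch of a CSQE-style, corpus-steered expansion (illustrative only).

def pivotal_sentences(query, docs, top_n=2):
    q_terms = set(query.lower().split())
    sentences = [s.strip() for d in docs for s in d.split(".") if s.strip()]
    # Stand-in for the LLM relevance assessment: term overlap with the query.
    scored = sorted(sentences,
                    key=lambda s: len(q_terms & set(s.lower().split())),
                    reverse=True)
    return scored[:top_n]

def csqe_expand(query, retrieved_docs):
    # Append the pivotal corpus sentences to steer a second retrieval pass.
    return query + " " + " ".join(pivotal_sentences(query, retrieved_docs))

docs = ["The jaguar is a large cat. It lives in the Americas.",
        "Jaguar makes luxury cars. The brand is British."]
print(csqe_expand("jaguar cat", docs))
```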
arXiv Detail & Related papers (2024-02-28T03:58:58Z)
- Can Query Expansion Improve Generalization of Strong Cross-Encoder Rankers? [72.42500059688396]
We show that it is possible to improve the generalization of a strong neural ranker, by prompt engineering and aggregating the ranking results of each expanded query via fusion.
Experiments on BEIR and TREC Deep Learning show that the nDCG@10 scores of both MonoT5 and RankT5 following these steps are improved.
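Aggregating the ranking results of several expanded queries is commonly done with reciprocal rank fusion (RRF); whether this particular paper uses RRF is an assumption, so the sketch below is illustrative of the fusion step only.

```python
# Reciprocal rank fusion (RRF) over rankings from several expanded queries.
# Using RRF here is an assumption for illustration; the paper may fuse
# rankings differently.

def rrf_fuse(rankings, k=60):
    """Each ranking is a list of doc ids, best first; returns fused order."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # Standard RRF contribution: 1 / (k + rank), with rank starting at 1.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Toy rankings from the original query and two hypothetical expansions.
fused = rrf_fuse([["d1", "d2", "d3"], ["d2", "d1", "d4"], ["d1", "d4", "d2"]])
print(fused)
```

Documents that appear near the top of many expanded-query rankings accumulate the most score, which is what makes fusion robust to any single poor expansion.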
arXiv Detail & Related papers (2023-11-15T18:11:41Z)
- Modeling Uncertainty and Using Post-fusion as Fallback Improves Retrieval Augmented Generation with LLMs [80.74263278847063]
The integration of retrieved passages and large language models (LLMs) has significantly contributed to improving open-domain question answering.
This paper investigates different methods of combining retrieved passages with LLMs to enhance answer generation.
arXiv Detail & Related papers (2023-08-24T05:26:54Z)
- A Linguistically Driven Framework for Query Expansion via Grammatical Constituent Highlighting and Role-Based Concept Weighting [0.0]
Concepts-of-Interest are recognized as the core concepts that represent the gist of the search goal.
The remaining query constituents which serve to specify the search goal and complete the query structure are classified as descriptive, relational or structural.
arXiv Detail & Related papers (2020-04-25T01:43:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.