Can Synthetic Query Rewrites Capture User Intent Better than Humans in Retrieval-Augmented Generation?
- URL: http://arxiv.org/abs/2509.22325v1
- Date: Fri, 26 Sep 2025 13:23:01 GMT
- Title: Can Synthetic Query Rewrites Capture User Intent Better than Humans in Retrieval-Augmented Generation?
- Authors: JiaYing Zheng, HaiNan Zhang, Liang Pang, YongXin Tong, ZhiMing Zheng
- Abstract summary: Multi-turn RAG systems often face queries with colloquial omissions and ambiguous references. Due to limitations in annotators' expressive ability and depth of understanding, manually rewritten queries often diverge from those needed in real-world RAG systems. We propose SynRewrite, a synthetic data-driven query rewriting model to generate high-quality synthetic rewrites more aligned with user intent.
- Score: 32.75334667566984
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-turn RAG systems often face queries with colloquial omissions and ambiguous references, posing significant challenges for effective retrieval and generation. Traditional query rewriting relies on human annotators to clarify queries, but due to limitations in annotators' expressive ability and depth of understanding, manually rewritten queries often diverge from those needed in real-world RAG systems, resulting in a gap between user intent and system response. We observe that high-quality synthetic queries can better bridge this gap, achieving superior performance in both retrieval and generation compared to human rewrites. This raises an interesting question: Can rewriting models trained on synthetic queries better capture user intent than human annotators? In this paper, we propose SynRewrite, a synthetic data-driven query rewriting model to generate high-quality synthetic rewrites more aligned with user intent. To construct training data, we prompt GPT-4o with dialogue history, current queries, positive documents, and answers to synthesize high-quality rewrites. A Flan-T5 model is then finetuned on this dataset to map dialogue history and queries to synthetic rewrites. Finally, we further enhance the rewriter using the generator's feedback through the DPO algorithm to boost end-task performance. Experiments on TopiOCQA and QRECC datasets show that SynRewrite consistently outperforms human rewrites in both retrieval and generation tasks. Our results demonstrate that synthetic rewrites can serve as a scalable and effective alternative to human annotations.
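The three stages described in the abstract (teacher prompting, supervised fine-tuning, preference optimization) can be sketched at the data level as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the prompt wording, the `rewrite:` prefix, the `[SEP]` separator, and every function and class name are hypothetical, and `score_fn` stands in for whatever generator feedback signal the paper actually uses. No model calls are made; the sketch only shows which inputs each stage sees.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Turn:
    """One question/answer exchange from the dialogue history."""
    question: str
    answer: str


def build_synthesis_prompt(history: List[Turn], query: str,
                           positive_doc: str, answer: str) -> str:
    """Assemble the four inputs the abstract lists into one teacher prompt.

    The instruction text is an assumption; the abstract does not give it.
    """
    lines = ["Rewrite the current query so it is self-contained.",
             "Dialogue history:"]
    for turn in history:
        lines.append(f"Q: {turn.question}")
        lines.append(f"A: {turn.answer}")
    lines.append(f"Current query: {query}")
    lines.append(f"Positive document: {positive_doc}")
    lines.append(f"Answer: {answer}")
    lines.append("Rewrite:")
    return "\n".join(lines)


def build_rewriter_input(history: List[Turn], query: str) -> str:
    """Student input: dialogue history and query only.

    Unlike the teacher prompt, the positive document and the answer are
    not available at inference time, so they are left out here.
    """
    context = " ".join(f"{t.question} {t.answer}" for t in history)
    return f"rewrite: {context} [SEP] {query}"


def make_preference_pair(candidates: List[str],
                         score_fn: Callable[[str], float]) -> Tuple[str, str]:
    """Return (chosen, rejected) rewrites ranked by a feedback score.

    The highest-scoring candidate becomes "chosen" and the lowest-scoring
    one "rejected", the pair format that DPO-style training consumes.
    """
    if len(candidates) < 2:
        raise ValueError("need at least two candidate rewrites")
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[0], ranked[-1]
```

In this sketch, a supervised training example would pair the output of `build_rewriter_input` with the rewrite returned by the teacher for the matching `build_synthesis_prompt`, and `make_preference_pair` would be applied to several rewrites sampled from the fine-tuned student.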
Related papers
- R-Bot: An LLM-based Query Rewrite System [20.909806427953264]
We propose R-Bot, an LLM-based query rewrite system with a systematic approach. We first design a multi-source rewrite evidence preparation pipeline to generate query rewrite evidences. We then propose a hybrid-semantics retrieval method that combines structural and semantic analysis. We conduct comprehensive experiments on real-world datasets and widely used benchmarks, and demonstrate the superior performance of our system.
arXiv Detail & Related papers (2024-12-02T16:13:04Z) - Understanding Synthetic Context Extension via Retrieval Heads [51.8869530817334]
We investigate fine-tuning on synthetic data for three long-context tasks that require retrieval and reasoning. We find that models trained on synthetic data fall short of the real data, but surprisingly, the mismatch can be interpreted. Our results shed light on how to interpret synthetic data fine-tuning performance and how to approach creating better data for learning real-world capabilities over long contexts.
arXiv Detail & Related papers (2024-10-29T17:55:00Z) - Multi-Document Grounded Multi-Turn Synthetic Dialog Generation [22.7158929225259]
We introduce a technique for multi-document grounded multi-turn synthetic dialog generation that incorporates three main ideas.
We control the overall dialog flow using taxonomy-driven user queries that are generated with Chain-of-Thought prompting.
We support the generation of multi-document grounded dialogs by mimicking real-world use of retrievers to update the grounding documents after every user-turn in the dialog.
arXiv Detail & Related papers (2024-09-17T19:02:39Z) - MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models [22.50450558103786]
In a real-world RAG system, the current query often involves spoken ellipses and ambiguous references from dialogue contexts. We propose a novel query rewriting method MaFeRw, which improves RAG performance by integrating multi-aspect feedback from both the retrieval process and generated results. Experimental results on two conversational RAG datasets demonstrate that MaFeRw achieves superior generation metrics and more stable training compared to baselines.
arXiv Detail & Related papers (2024-08-30T07:57:30Z) - Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers [66.55612528039894]
AdaQR is a framework for training query rewriting models with limited rewrite annotations from seed datasets and no passage labels at all.
A novel approach is proposed to assess the retriever's preference for these candidates by the probability of answers conditioned on the conversational query.
arXiv Detail & Related papers (2024-06-16T16:09:05Z) - RaFe: Ranking Feedback Improves Query Rewriting for RAG [83.24385658573198]
We propose a framework for training query rewriting models free of annotations.
By leveraging a publicly available reranker, RaFe provides feedback well aligned with the rewriting objectives.
arXiv Detail & Related papers (2024-05-23T11:00:19Z) - SynthesizRR: Generating Diverse Datasets with Retrieval Augmentation [55.2480439325792]
We study the synthesis of six datasets, covering topic classification, sentiment analysis, tone detection, and humor.
We find that SynthesizRR greatly improves lexical and semantic diversity, similarity to human-written text, and distillation performance.
arXiv Detail & Related papers (2024-05-16T12:22:41Z) - LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency [65.01402723259098]
We propose a novel method of query rewrite named LLM-R2, adopting a large language model (LLM) to propose possible rewrite rules for a database rewrite system.
Experimental results have shown that our method can significantly improve the query execution efficiency and outperform the baseline methods.
arXiv Detail & Related papers (2024-04-19T13:17:07Z) - Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversation [23.74712435991676]
RetPO is designed to optimize a language model for reformulating search queries in line with the preferences of the target retrieval systems. We construct a large-scale dataset called Retrievers' Feedback on over 410K query rewrites across 12K conversations. Our resulting model demonstrates superiority on two benchmarks, surpassing the previous state-of-the-art performance of rewrite-then-retrieve approaches.
arXiv Detail & Related papers (2024-02-19T04:41:31Z) - Enhancing Conversational Search: Large Language Model-Aided Informative Query Rewriting [42.35788605017555]
We propose utilizing large language models (LLMs) as query rewriters.
We define four essential properties for well-formed rewrites and incorporate all of them into the instruction.
We introduce the role of rewrite editors for LLMs when initial query rewrites are available, forming a "rewrite-then-edit" process.
arXiv Detail & Related papers (2023-10-15T03:04:17Z) - Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) act as powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, in place of the previous retrieve-then-read pipeline for retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z) - Noise-Robust Dense Retrieval via Contrastive Alignment Post Training [89.29256833403167]
Contrastive Alignment POst Training (CAPOT) is a highly efficient finetuning method that improves model robustness without requiring index regeneration.
CAPOT enables robust retrieval by freezing the document encoder while the query encoder learns to align noisy queries with their unaltered root.
We evaluate CAPOT on noisy variants of MSMARCO, Natural Questions, and Trivia QA passage retrieval, finding CAPOT has a similar impact as data augmentation with none of its overhead.
arXiv Detail & Related papers (2023-04-06T22:16:53Z)