Learning Contextual Retrieval for Robust Conversational Search
- URL: http://arxiv.org/abs/2509.19700v1
- Date: Wed, 24 Sep 2025 02:17:37 GMT
- Title: Learning Contextual Retrieval for Robust Conversational Search
- Authors: Seunghan Yang, Juntae Lee, Jihwan Bang, Kyuhong Shim, Minsoo Kim, Simyung Chang
- Abstract summary: ContextualRetriever is a novel LLM-based retriever that directly incorporates conversational context into the retrieval process. Our approach introduces: (1) a context-aware embedding mechanism that highlights the current query within the dialogue history; (2) intent-guided supervision based on high-quality rewritten queries; and (3) a training strategy that preserves the generative capabilities of the base LLM.
- Score: 34.74877456870482
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective conversational search demands a deep understanding of user intent across multiple dialogue turns. Users frequently use abbreviations and shift topics in the middle of conversations, posing challenges for conventional retrievers. While query rewriting techniques improve clarity, they often incur significant computational cost due to additional autoregressive steps. Moreover, although LLM-based retrievers demonstrate strong performance, they are not explicitly optimized to track user intent in multi-turn settings, often failing under topic drift or contextual ambiguity. To address these limitations, we propose ContextualRetriever, a novel LLM-based retriever that directly incorporates conversational context into the retrieval process. Our approach introduces: (1) a context-aware embedding mechanism that highlights the current query within the dialogue history; (2) intent-guided supervision based on high-quality rewritten queries; and (3) a training strategy that preserves the generative capabilities of the base LLM. Extensive evaluations across multiple conversational search benchmarks demonstrate that ContextualRetriever significantly outperforms existing methods while incurring no additional inference overhead.
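The abstract's first component, highlighting the current query within the dialogue history, can be illustrated with a minimal sketch. The paper's actual architecture is not given here, so this uses a toy hash-based encoder in place of an LLM; the `[CUR]`/`[/CUR]` marker tokens, the 3x up-weighting of current-query tokens, and mean pooling are all illustrative assumptions.

```python
# Hypothetical sketch of a context-aware session embedding: the current turn
# is wrapped in marker tokens inside the flattened dialogue history so the
# encoder can distinguish it from prior turns. A deterministic hash-based
# token embedding stands in for a real LLM; only the marking/pooling scheme
# is the point of the example.
import hashlib
import math

DIM = 32  # one dimension per byte of a SHA-256 digest

def token_embedding(token: str) -> list[float]:
    """Deterministic toy embedding derived from a token's SHA-256 digest."""
    h = hashlib.sha256(token.encode()).digest()
    vec = [(b - 127.5) / 127.5 for b in h]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def embed_session(history: list[str], current_query: str) -> list[float]:
    """Flatten the dialogue, mark the current query, and mean-pool.

    Current-query tokens get a higher weight so the session embedding is
    dominated by the turn being answered, in the spirit of the abstract's
    "highlighting" mechanism (the weight 3.0 is an arbitrary choice).
    """
    text = " ".join(history) + " [CUR] " + current_query + " [/CUR]"
    weighted_tokens = []
    in_current = False
    for tok in text.split():
        if tok == "[CUR]":
            in_current = True
            continue
        if tok == "[/CUR]":
            in_current = False
            continue
        weighted_tokens.append((3.0 if in_current else 1.0, tok))
    total = sum(w for w, _ in weighted_tokens)
    pooled = [0.0] * DIM
    for w, tok in weighted_tokens:
        emb = token_embedding(tok)
        for i in range(DIM):
            pooled[i] += w * emb[i] / total
    return pooled
```

Because the current query is weighted separately from the history, two sessions with the same overall token multiset but different current turns produce different embeddings, which is the behavior a conventional flat concatenation would lose.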
Related papers
- Agentic Conversational Search with Contextualized Reasoning via Reinforcement Learning [66.52010873968383]
We introduce a conversational agent that interleaves search and reasoning across turns, enabling exploratory and adaptive behaviors learned through reinforcement learning (RL) training.
The experimental results across four widely used conversational benchmarks demonstrate the effectiveness of our methods.
arXiv Detail & Related papers (2026-01-19T14:55:54Z)
- Reasoning-enhanced Query Understanding through Decomposition and Interpretation [130.19204432111277]
ReDI is a Reasoning-enhanced approach for query understanding through Decomposition and Interpretation.
We compiled a large-scale dataset of real-world complex queries from a major search engine.
Experiments on BRIGHT and BEIR demonstrate that ReDI consistently surpasses strong baselines in both sparse and dense retrieval paradigms.
arXiv Detail & Related papers (2025-09-08T10:58:42Z)
- Contextualizing Search Queries: In-Context Learning for Conversational Rewriting with LLMs [0.0]
This paper introduces Prompt-Guided In-Context Learning, a novel approach for few-shot conversational query rewriting.
Our method employs carefully designed prompts, incorporating task descriptions, input/output format specifications, and a small set of illustrative examples.
Experiments on benchmark datasets, TREC and Taskmaster-1, demonstrate that our approach significantly outperforms strong baselines.
arXiv Detail & Related papers (2025-02-20T20:02:42Z)
- IRLab@iKAT24: Learned Sparse Retrieval with Multi-aspect LLM Query Generation for Conversational Search [6.974395116689502]
iKAT 2024 focuses on advancing conversational assistants that can adapt their interactions and responses based on personalized user knowledge.
The track incorporates a Personal Textual Knowledge Base (PTKB) alongside Conversational AI tasks, such as passage ranking and response generation.
arXiv Detail & Related papers (2024-11-22T05:18:35Z)
- ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval [37.24069808198862]
Conversational search requires accurate interpretation of user intent from complex multi-turn contexts.
This paper presents ChatRetriever, which inherits the strong generalization capability of large language models to robustly represent conversational sessions for dense retrieval.
arXiv Detail & Related papers (2024-04-21T07:03:55Z)
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLM) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
An LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
- ZeQR: Zero-shot Query Reformulation for Conversational Search [11.644235288057123]
We introduce a novel Zero-shot Query Reformulation (or Query Rewriting) framework that reformulates queries based on previous dialogue contexts without requiring supervision from conversational search data.
Specifically, our framework utilizes language models designed for machine reading comprehension tasks to explicitly resolve two common ambiguities: coreference and omission, in raw queries.
It also provides greater explainability and effectively enhances query intent understanding because ambiguities are explicitly and proactively resolved.
arXiv Detail & Related papers (2023-07-18T16:05:25Z)
- Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) act as powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, in place of the previous retrieve-then-read pipeline for retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
- Cue-CoT: Chain-of-thought Prompting for Responding to In-depth Dialogue Questions with LLMs [59.74002011562726]
We propose a novel linguistic cue-based chain-of-thought approach (Cue-CoT) to provide more personalized and engaging responses.
We build a benchmark with in-depth dialogue questions, consisting of 6 datasets in both Chinese and English.
Empirical results demonstrate that our proposed Cue-CoT method outperforms standard prompting methods in terms of both helpfulness and acceptability on all datasets.
arXiv Detail & Related papers (2023-05-19T16:27:43Z)
- Synergistic Interplay between Search and Large Language Models for Information Retrieval [141.18083677333848]
InteR allows retrieval models (RMs) to expand the knowledge in queries using LLM-generated knowledge collections.
InteR achieves overall superior zero-shot retrieval performance compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-12T11:58:15Z)
- Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-to-sequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z)
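The frequency-based term-importance expansion described in the last entry above can be sketched as follows. This is not the paper's exact method: the stopword list, raw-frequency weighting, and top-k cutoff are all illustrative assumptions.

```python
# Hypothetical sketch of frequency-based query expansion from conversational
# context: non-stopword terms that recur across earlier turns are treated as
# important and appended to the current query before retrieval.
from collections import Counter

# Minimal illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "it", "of", "to", "and", "what", "how"}

def expand_query(context_turns: list[str], query: str, top_k: int = 3) -> str:
    """Append up to top_k frequent context terms not already in the query."""
    counts = Counter(
        tok
        for turn in context_turns
        for tok in turn.lower().split()
        if tok not in STOPWORDS
    )
    query_toks = set(query.lower().split())
    expansion = [t for t, _ in counts.most_common() if t not in query_toks][:top_k]
    return query + " " + " ".join(expansion) if expansion else query
```

The expanded query keeps the user's original wording and simply adds context terms, so it can feed a standard ad-hoc retriever unchanged, which is what makes term-importance expansion attractive as a first stage before a heavier neural rewriter.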
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.