RAC: Retrieval-Augmented Clarification for Faithful Conversational Search
- URL: http://arxiv.org/abs/2601.11722v1
- Date: Fri, 16 Jan 2026 19:16:38 GMT
- Title: RAC: Retrieval-Augmented Clarification for Faithful Conversational Search
- Authors: Ahmed Rayane Kebir, Vincent Guigue, Lynda Said Lhadj, Laure Soulier,
- Abstract summary: We introduce RAC (Retrieval-Augmented Clarification), a framework for generating corpus-faithful clarification questions. After comparing several indexing strategies for retrieval, we fine-tune a large language model to make optimal use of research context. We then apply contrastive preference optimization to favor questions supported by retrieved passages over ungrounded alternatives.
- Score: 7.0486278653981245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Clarification questions help conversational search systems resolve ambiguous or underspecified user queries. While prior work has focused on fluency and alignment with user intent, especially through facet extraction, much less attention has been paid to grounding clarifications in the underlying corpus. Without such grounding, systems risk asking questions that cannot be answered from the available documents. We introduce RAC (Retrieval-Augmented Clarification), a framework for generating corpus-faithful clarification questions. After comparing several indexing strategies for retrieval, we fine-tune a large language model to make optimal use of research context and to encourage the generation of evidence-based questions. We then apply contrastive preference optimization to favor questions supported by retrieved passages over ungrounded alternatives. Evaluated on four benchmarks, RAC demonstrates significant improvements over baselines. In addition to LLM-as-Judge assessments, we introduce novel metrics derived from NLI and data-to-text to assess how well questions are anchored in the context, and we demonstrate that our approach consistently enhances faithfulness.
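The contrastive preference optimization step described in the abstract pairs a passage-supported question with an ungrounded alternative. As a minimal sketch (not the paper's implementation), a DPO-style pairwise loss captures the idea; the beta value and log-probabilities below are hypothetical:

```python
import math

def preference_loss(logp_grounded, logp_ungrounded,
                    ref_logp_grounded, ref_logp_ungrounded, beta=0.1):
    """DPO-style pairwise loss: push the policy to prefer the
    corpus-grounded clarification question over the ungrounded one.
    Log-probs are of each full question under the policy / reference model."""
    margin = beta * ((logp_grounded - ref_logp_grounded)
                     - (logp_ungrounded - ref_logp_ungrounded))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log sigmoid(margin)

# A larger margin in favor of the grounded question yields a lower loss.
strong = preference_loss(-5.0, -20.0, -10.0, -10.0)
weak = preference_loss(-9.0, -11.0, -10.0, -10.0)
```

In practice the log-probabilities would come from the fine-tuned LLM and a frozen reference copy; the key property is that the loss rewards widening the gap between grounded and ungrounded candidates.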
Related papers
- When and What to Ask: AskBench and Rubric-Guided RLVR for LLM Clarification [8.391356566325054]
Large language models (LLMs) often respond even when prompts omit critical details or include misleading information. We study how to evaluate and improve LLMs' ability to decide when and what to ask for clarification without sacrificing task performance. We introduce AskBench, an interactive benchmark that converts standard QA pairs into multi-turn interactions with explicit checkpoints.
arXiv Detail & Related papers (2026-02-04T02:21:01Z)
- ReAG: Reasoning-Augmented Generation for Knowledge-based Visual Question Answering [54.72902502486611]
ReAG is a Reasoning-Augmented Multimodal RAG approach that combines coarse- and fine-grained retrieval with a critic model that filters irrelevant passages. ReAG significantly outperforms prior methods, improving answer accuracy and providing interpretable reasoning grounded in retrieved evidence.
arXiv Detail & Related papers (2025-11-27T19:01:02Z)
- Towards Context-aware Reasoning-enhanced Generative Searching in E-commerce [61.03081096959132]
We propose a context-aware reasoning-enhanced generative search framework for better understanding the complicated context. Our approach achieves superior performance compared with strong baselines, validating its effectiveness for search-based recommendation.
arXiv Detail & Related papers (2025-10-19T16:46:11Z)
- Learning to Detect Relevant Contexts and Knowledge for Response Selection in Retrieval-based Dialogue Systems [32.895603852919194]
We propose a multi-turn Response Selection Model that can Detect the relevant parts of the Context and Knowledge collection. Our model first uses the recent context as a query to pre-select relevant parts of the context and knowledge collection at the word-level and utterance-level semantics.
arXiv Detail & Related papers (2025-09-26T18:53:29Z)
- Distilling a Small Utility-Based Passage Selector to Enhance Retrieval-Augmented Generation [110.610512800947]
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating retrieved information. In RAG, the emphasis has shifted to utility, which considers the usefulness of passages for generating accurate answers. Our approach focuses on utility-based selection rather than ranking, enabling dynamic passage selection tailored to specific queries without the need for fixed thresholds. Our experiments demonstrate that utility-based selection provides a flexible and cost-effective solution for RAG, significantly reducing computational costs while improving answer quality.
arXiv Detail & Related papers (2025-07-25T09:32:29Z)
- ClueAnchor: Clue-Anchored Knowledge Reasoning Exploration and Optimization for Retrieval-Augmented Generation [82.54090885503287]
Retrieval-Augmented Generation augments Large Language Models with external knowledge to improve factuality. Existing RAG systems fail to extract and integrate the key clues needed to support faithful and interpretable reasoning. We propose ClueAnchor, a novel framework for enhancing RAG via clue-anchored reasoning exploration and optimization.
arXiv Detail & Related papers (2025-05-30T09:18:08Z)
- Improving RAG Retrieval via Propositional Content Extraction: a Speech Act Theory Approach [0.0]
This paper investigates whether extracting the underlying propositional content from user utterances can improve retrieval quality in Retrieval-Augmented Generation systems. We propose a practical method for automatically transforming queries into their propositional equivalents before embedding.
arXiv Detail & Related papers (2025-03-07T20:15:40Z)
- Adaptive Contrastive Decoding in Retrieval-Augmented Generation for Handling Noisy Contexts [24.5315425886482]
We propose adaptive contrastive decoding (ACD) to leverage contextual influence effectively.
ACD demonstrates improvements in open-domain question answering tasks compared to baselines.
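Contrastive decoding, which ACD builds on, compares the model's token distribution with and without the retrieved context. The sketch below shows a common additive form of this contrast; the weighting scheme and alpha value are illustrative assumptions, not ACD's exact adaptive formulation:

```python
import math

def contrastive_logits(logits_with_ctx, logits_no_ctx, alpha=0.5):
    """Amplify what the context contributes: boost tokens whose logit
    rises when the retrieved context is present, damp the rest.
    The additive form and alpha are illustrative choices."""
    return [(1 + alpha) * lw - alpha * ln
            for lw, ln in zip(logits_with_ctx, logits_no_ctx)]

def softmax(logits):
    m = max(logits)  # subtract max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Token 0 is supported by the context (its logit rises with context);
# contrasting against the context-free distribution sharpens that preference.
with_ctx = [2.0, 1.0, 0.5]
no_ctx = [0.5, 1.0, 0.5]
adjusted = softmax(contrastive_logits(with_ctx, no_ctx))
plain = softmax(with_ctx)
```

The "adaptive" part of ACD concerns how strongly to apply this contrast per input, which the fixed alpha here does not capture.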
arXiv Detail & Related papers (2024-08-02T08:03:38Z)
- Dense X Retrieval: What Retrieval Granularity Should We Use? [56.90827473115201]
An often-overlooked design choice is the retrieval unit in which the corpus is indexed, e.g. document, passage, or sentence.
We introduce a novel retrieval unit, proposition, for dense retrieval.
Experiments reveal that indexing a corpus by fine-grained units such as propositions significantly outperforms passage-level units in retrieval tasks.
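The intuition behind proposition-level indexing can be seen in a toy bag-of-words example (the corpus, propositions, and query below are invented for illustration; real systems use dense embeddings rather than word counts): splitting a passage into atomic propositions lets a focused query match the one proposition it is about, instead of being diluted by the rest of the passage.

```python
from collections import Counter
import math

def bow(text):
    """Bag-of-words representation of a text."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# One passage, hand-split into atomic propositions.
passage = ("the eiffel tower opened in 1889 . "
           "it is located in paris france")
propositions = ["the eiffel tower opened in 1889",
                "the eiffel tower is located in paris france"]

query = bow("when did the eiffel tower open")
passage_score = cosine(query, bow(passage))
best_prop_score = max(cosine(query, bow(p)) for p in propositions)
```

Here the best-matching proposition scores higher than the whole passage, because the passage's second fact only adds terms the query never mentions.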
arXiv Detail & Related papers (2023-12-11T18:57:35Z)
- Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-to-sequence model.
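The frequency-based expansion idea from the first CQR method can be sketched as follows; the stopword list, conversation, and scoring are invented stand-ins for the paper's term-importance estimation:

```python
from collections import Counter

# Minimal illustrative stopword list (a real system would use a fuller one).
STOPWORDS = {"the", "a", "is", "what", "about", "it", "of", "in", "and", "how"}

def expand_query(current_query, conversation_context, k=2):
    """Expand the current turn with the k most frequent non-stopword
    terms from earlier turns that the current query does not already
    contain (a frequency-based stand-in for term-importance estimation)."""
    query_tokens = set(current_query.lower().split())
    counts = Counter(t for turn in conversation_context
                     for t in turn.lower().split()
                     if t not in STOPWORDS and t not in query_tokens)
    expansion = [t for t, _ in counts.most_common(k)]
    return current_query + " " + " ".join(expansion)

context = ["what is the eiffel tower",
           "how tall is the eiffel tower"]
expanded = expand_query("when did it open", context)
```

The expanded query now carries the conversation's topic terms ("eiffel", "tower"), so a standard ad-hoc retriever can resolve the pronoun-laden turn without any dialogue awareness.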
arXiv Detail & Related papers (2020-05-05T14:30:20Z)
- Query Focused Multi-Document Summarization with Distant Supervision [88.39032981994535]
Existing work relies heavily on retrieval-style methods for estimating the relevance between queries and text segments.
We propose a coarse-to-fine modeling framework which introduces separate modules for estimating whether segments are relevant to the query.
We demonstrate that our framework outperforms strong comparison systems on standard QFS benchmarks.
arXiv Detail & Related papers (2020-04-06T22:35:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.