Efficient Conversational Search via Topical Locality in Dense Retrieval
- URL: http://arxiv.org/abs/2504.21507v1
- Date: Wed, 30 Apr 2025 10:56:34 GMT
- Title: Efficient Conversational Search via Topical Locality in Dense Retrieval
- Authors: Cristina Ioana Muntean, Franco Maria Nardini, Raffaele Perego, Guido Rocchietti, Cosimo Rulli,
- Abstract summary: We exploit the topical locality inherent in conversational queries to improve response time. By leveraging query embedding similarities, we dynamically restrict the search space to semantically relevant document clusters. Our results show that the proposed system effectively handles complex, multiturn queries with high precision and efficiency.
- Score: 9.38751103209178
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pre-trained language models have been widely exploited to learn dense representations of documents and queries for information retrieval. While previous efforts have primarily focused on improving effectiveness and user satisfaction, response time remains a critical bottleneck of conversational search systems. To address this, we exploit the topical locality inherent in conversational queries, i.e., the tendency of queries within a conversation to focus on related topics. By leveraging query embedding similarities, we dynamically restrict the search space to semantically relevant document clusters, reducing computational complexity without compromising retrieval quality. We evaluate our approach on the TREC CAsT 2019 and 2020 datasets using multiple embedding models and vector indexes, achieving improvements in processing speed of up to 10.4X with little loss in performance (4.4X without any loss). Our results show that the proposed system effectively handles complex, multiturn queries with high precision and efficiency, offering a practical solution for real-time conversational search.
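The idea in the abstract can be illustrated with a toy clustered index. This is a minimal sketch and not the authors' implementation: the class name `LocalityAwareSearcher`, the `reuse_threshold` parameter, and the rule "reuse the previous turn's clusters when consecutive query embeddings are similar enough" are all assumptions made for illustration, and the vectors are made-up two-dimensional data.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0


class LocalityAwareSearcher:
    """Hypothetical clustered index that reuses the previous turn's clusters."""

    def __init__(self, centroids, clusters, nprobe=1, reuse_threshold=0.8):
        self.centroids = centroids      # one centroid vector per cluster
        self.clusters = clusters        # per cluster: list of (doc_id, vector)
        self.nprobe = nprobe            # number of clusters to search
        self.reuse_threshold = reuse_threshold  # assumed similarity cut-off
        self.prev_query = None
        self.prev_cluster_ids = None
        self.centroid_scans = 0         # counts full cluster selections

    def _select_clusters(self, query):
        # Rank all centroids by similarity to the query and keep the best ones.
        self.centroid_scans += 1
        ranked = sorted(range(len(self.centroids)),
                        key=lambda i: cosine(query, self.centroids[i]),
                        reverse=True)
        return ranked[:self.nprobe]

    def search(self, query, k=2):
        # Assumed rule: if this turn is close to the previous one, stay in
        # the same clusters instead of selecting clusters again.
        reuse = (self.prev_query is not None
                 and cosine(query, self.prev_query) >= self.reuse_threshold)
        cluster_ids = (self.prev_cluster_ids if reuse
                       else self._select_clusters(query))
        candidates = [doc for cid in cluster_ids for doc in self.clusters[cid]]
        candidates.sort(key=lambda d: cosine(query, d[1]), reverse=True)
        self.prev_query, self.prev_cluster_ids = query, cluster_ids
        return [doc_id for doc_id, _ in candidates[:k]]


# Toy data: two clusters with two documents each.
centroids = [[1.0, 0.0], [0.0, 1.0]]
clusters = [
    [("d0", [0.9, 0.1]), ("d1", [1.0, 0.2])],
    [("d2", [0.1, 0.9]), ("d3", [0.2, 1.0])],
]
searcher = LocalityAwareSearcher(centroids, clusters)
first = searcher.search([1.0, 0.05])      # first turn: selects clusters
follow_up = searcher.search([0.95, 0.1])  # similar turn: reuses clusters
topic_shift = searcher.search([0.0, 1.0]) # dissimilar turn: selects again
```

In this sketch the follow-up turn skips cluster selection entirely, which is where a saving could come from under the stated assumption; a real system would more likely sit on top of an inverted-file or graph vector index and would need to tune the threshold against retrieval quality.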
Related papers
- FastLane: Efficient Routed Systems for Late-Interaction Retrieval [58.060096779432094]
FastLane is a novel retrieval framework that dynamically routes queries to their most informative representations. By bridging late-interaction models with Approximate Nearest Neighbor Search (ANNS), FastLane enables scalable, low-latency retrieval.
arXiv Detail & Related papers (2026-01-10T02:22:01Z)
- Over-Searching in Search-Augmented Large Language Models [22.821710825732563]
Search-augmented large language models (LLMs) excel at knowledge-intensive tasks by integrating external retrieval. Over-searching leads to computational inefficiency and hallucinations by incorporating irrelevant context. Our findings show that: (i) search generally improves answer accuracy on answerable queries but harms abstention on unanswerable ones; (ii) over-searching is more pronounced in complex reasoning models and deep research systems; and (iii) the composition of retrieved evidence is crucial, as the presence of negative evidence improves abstention.
arXiv Detail & Related papers (2026-01-09T03:24:46Z)
- Query Decomposition for RAG: Balancing Exploration-Exploitation [83.79639293409802]
RAG systems address complex user requests by decomposing them into subqueries, retrieving potentially relevant documents for each, and then aggregating them to generate an answer. We formulate query decomposition and document retrieval in an exploitation-exploration setting, where retrieving one document at a time builds a belief about the utility of a given subquery. Our main finding is that estimating document relevance using rank information and human judgments yields a 35% gain in document-level precision, a 15% increase in alpha-nDCG, and better performance on the downstream task of long-form generation.
arXiv Detail & Related papers (2025-10-21T13:37:11Z)
- DIVER: A Multi-Stage Approach for Reasoning-intensive Information Retrieval [36.38599923075882]
DIVER is a retrieval pipeline designed for reasoning-intensive information retrieval. It consists of four components: the document preprocessing stage, the query expansion stage, the retrieval stage, and the reranking stage. On the BRIGHT benchmark, DIVER achieves state-of-the-art nDCG@10 scores of 45.8 overall and 28.9 on original queries, consistently outperforming competitive reasoning-aware models.
arXiv Detail & Related papers (2025-08-11T13:57:49Z)
- Benchmarking Deep Search over Heterogeneous Enterprise Data [73.55304268238474]
We present a new benchmark for evaluating a form of retrieval-augmented generation (RAG). RAG requires source-aware, multi-hop reasoning over diverse, sparse, but related sources. We build it using a synthetic data pipeline that simulates business across product planning, development, and support stages.
arXiv Detail & Related papers (2025-06-29T08:34:59Z)
- Dense Passage Retrieval in Conversational Search [0.0]
We present a new method called dense retrieval, which uses a dual-encoder to create contextual embeddings that can be indexed and clustered efficiently at run-time. We propose an end-to-end conversational search system called GPT2QR+DPR, which incorporates various query reformulation strategies to improve retrieval accuracy. Our work contributes to the growing body of research on neural-based retrieval methods in conversational search, and highlights the potential of dense retrieval in improving retrieval accuracy in conversational search systems.
arXiv Detail & Related papers (2025-03-21T19:39:31Z)
- Efficient Long Context Language Model Retrieval with Compression [57.09163579304332]
Long Context Language Models (LCLMs) have emerged as a new paradigm to perform Information Retrieval (IR). We propose a new compression approach tailored for LCLM retrieval, which is trained to maximize the retrieval performance while minimizing the length of the compressed passages. We show that CoLoR improves the retrieval performance by 6% while compressing the in-context size by a factor of 1.91.
arXiv Detail & Related papers (2024-12-24T07:30:55Z)
- Aligning Query Representation with Rewritten Query and Relevance Judgments in Conversational Search [32.35446999027349]
We leverage both rewritten queries and relevance judgments in the conversational search data to train a better query representation model.
The proposed model -- Query Representation Alignment Conversational Retriever, QRACDR, is tested on eight datasets.
arXiv Detail & Related papers (2024-07-29T17:14:36Z)
- Thread: A Logic-Based Data Organization Paradigm for How-To Question Answering with Retrieval Augmented Generation [49.36436704082436]
How-to questions are integral to decision-making processes and require dynamic, step-by-step answers.
We propose Thread, a novel data organization paradigm aimed at enabling current systems to handle how-to questions more effectively.
arXiv Detail & Related papers (2024-06-19T09:14:41Z)
- ChatRetriever: Adapting Large Language Models for Generalized and Robust Conversational Dense Retrieval [37.24069808198862]
Conversational search requires accurate interpretation of user intent from complex multi-turn contexts.
This paper presents ChatRetriever, which inherits the strong generalization capability of large language models to robustly represent conversational sessions for dense retrieval.
arXiv Detail & Related papers (2024-04-21T07:03:55Z)
- ConvSDG: Session Data Generation for Conversational Search [29.211860955861244]
We propose a framework to explore the feasibility of boosting conversational search by using large language models (LLMs) for session data generation.
Within this framework, we design dialogue/session-level and query-level data generation with unsupervised and semi-supervised learning.
The generated data are used to fine-tune the conversational dense retriever.
arXiv Detail & Related papers (2024-03-17T20:34:40Z)
- LIST: Learning to Index Spatio-Textual Data for Embedding based Spatial Keyword Queries [53.843367588870585]
Top-k KNN spatial keyword queries (TkQs) return a list of objects based on a ranking function that considers both spatial and textual relevance.
There are two key challenges in building an effective and efficient index, i.e., the absence of high-quality labels and the unbalanced results.
We develop a novel pseudolabel generation technique to address the two challenges.
arXiv Detail & Related papers (2024-03-12T05:32:33Z)
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLMs) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
An LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
- Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking [56.80065604034095]
We introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant.
To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario.
arXiv Detail & Related papers (2022-10-19T16:19:37Z)
- Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.