AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment
- URL: http://arxiv.org/abs/2407.01965v1
- Date: Tue, 2 Jul 2024 05:50:16 GMT
- Title: AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment
- Authors: Yilong Lai, Jialong Wu, Congzhi Zhang, Haowen Sun, Deyu Zhou,
- Abstract summary: We present a novel framework AdaCQR for conversational search reformulation.
By aligning reformulation models with both term-based and semantic-based retrieval systems, AdaCQR enhances the generalizability of information-seeking queries.
Experimental evaluations on the TopiOCQA and QReCC datasets demonstrate that AdaCQR significantly outperforms existing methods.
- Score: 16.62505706601199
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational Query Reformulation (CQR) has significantly advanced in addressing the challenges of conversational search, particularly those stemming from the latent user intent and the need for historical context. Recent works aimed to boost the performance of CRQ through alignment. However, they are designed for one specific retrieval system, which potentially results in poor generalization. To overcome this limitation, we present a novel framework AdaCQR. By aligning reformulation models with both term-based and semantic-based retrieval systems, AdaCQR enhances the generalizability of information-seeking queries across diverse retrieval environments through a dual-phase training strategy. We also developed two effective approaches for acquiring superior labels and diverse input candidates, boosting the efficiency and robustness of the framework. Experimental evaluations on the TopiOCQA and QReCC datasets demonstrate that AdaCQR significantly outperforms existing methods, offering both quantitative and qualitative improvements in conversational query reformulation.
Related papers
- CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation [68.81271028921647]
We introduce CORAL, a benchmark designed to assess RAG systems in realistic multi-turn conversational settings.
CORAL includes diverse information-seeking conversations automatically derived from Wikipedia.
It supports three core tasks of conversational RAG: passage retrieval, response generation, and citation labeling.
arXiv Detail & Related papers (2024-10-30T15:06:32Z) - GenCRF: Generative Clustering and Reformulation Framework for Enhanced Intent-Driven Information Retrieval [20.807374287510623]
We propose GenCRF: a Generative Clustering and Reformulation Framework to capture diverse intentions adaptively.
We show that GenCRF achieves state-of-the-art performance, surpassing previous query reformulation SOTAs by up to 12% on nDCG@10.
arXiv Detail & Related papers (2024-09-17T05:59:32Z) - KaPQA: Knowledge-Augmented Product Question-Answering [59.096607961704656]
We introduce two product question-answering (QA) datasets focused on Adobe Acrobat and Photoshop products.
We also propose a novel knowledge-driven RAG-QA framework to enhance the performance of the models in the product QA task.
arXiv Detail & Related papers (2024-07-22T22:14:56Z) - Think-then-Act: A Dual-Angle Evaluated Retrieval-Augmented Generation [3.2134014920850364]
Large language models (LLMs) often face challenges such as temporal misalignment and generating hallucinatory content.
We propose a dual-angle evaluated retrieval-augmented generation framework textitThink-then-Act'
arXiv Detail & Related papers (2024-06-18T20:51:34Z) - Unified Active Retrieval for Retrieval Augmented Generation [69.63003043712696]
In Retrieval-Augmented Generation (RAG), retrieval is not always helpful and applying it to every instruction is sub-optimal.
Existing active retrieval methods face two challenges: 1.
They usually rely on a single criterion, which struggles with handling various types of instructions.
They depend on specialized and highly differentiated procedures, and thus combining them makes the RAG system more complicated.
arXiv Detail & Related papers (2024-06-18T12:09:02Z) - Generative Query Reformulation Using Ensemble Prompting, Document Fusion, and Relevance Feedback [8.661419320202787]
GenQREnsemble and GenQRFusion leverage paraphrases of a zero-shot instruction to generate multiple sets of keywords to improve retrieval performance.
We demonstrate that an ensemble of query reformulations can improve retrieval effectiveness by up to 18% on nDCG@10 in pre-retrieval settings and 9% on post-retrieval settings.
arXiv Detail & Related papers (2024-05-27T21:03:26Z) - Query Performance Prediction: From Ad-hoc to Conversational Search [55.37199498369387]
Query performance prediction (QPP) is a core task in information retrieval.
Research has shown the effectiveness and usefulness of QPP for ad-hoc search.
Despite its potential, QPP for conversational search has been little studied.
arXiv Detail & Related papers (2023-05-18T12:37:01Z) - Better Retrieval May Not Lead to Better Question Answering [59.1892787017522]
A popular approach to improve the system's performance is to improve the quality of the retrieved context from the IR stage.
We show that for StrategyQA, a challenging open-domain QA dataset that requires multi-hop reasoning, this common approach is surprisingly ineffective.
arXiv Detail & Related papers (2022-05-07T16:59:38Z) - Conversational Query Rewriting with Self-supervised Learning [36.392717968127016]
Conversational Query Rewriting (CQR) aims to simplify the multi-turn dialogue modeling into a single-turn problem by explicitly rewriting the conversational query into a self-contained utterance.
Existing approaches rely on massive supervised training data, which is labor-intensive to annotate.
We propose to construct a large-scale CQR dataset automatically via self-supervised learning, which does not need human annotation.
arXiv Detail & Related papers (2021-02-09T08:57:53Z) - Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term
Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-tosequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.