Mixed-initiative Query Rewriting in Conversational Passage Retrieval
- URL: http://arxiv.org/abs/2307.08803v3
- Date: Thu, 17 Oct 2024 19:09:13 GMT
- Title: Mixed-initiative Query Rewriting in Conversational Passage Retrieval
- Authors: Dayu Yang, Yue Zhang, Hui Fang
- Abstract summary: We report our methods and experiments for the TREC Conversational Assistance Track (CAsT) 2022.
We propose a mixed-initiative query rewriting module, which achieves query rewriting based on the mixed-initiative interaction between the users and the system.
Experiments on both TREC CAsT 2021 and TREC CAsT 2022 datasets show the effectiveness of our mixed-initiative-based query rewriting (or query reformulation) method.
- Score: 11.644235288057123
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we report our methods and experiments for the TREC Conversational Assistance Track (CAsT) 2022. In this work, we aim to reproduce multi-stage retrieval pipelines and explore one of the potential benefits of involving mixed-initiative interaction in conversational passage retrieval scenarios: reformulating raw queries. Before the first ranking stage of a multi-stage retrieval pipeline, we propose a mixed-initiative query rewriting module, which performs query rewriting through mixed-initiative interaction between the user and the system, as a replacement for the neural rewriting method. Specifically, we design an algorithm to generate appropriate questions related to the ambiguities in raw queries, and another algorithm to reformulate raw queries by parsing users' feedback and incorporating it into the raw query. For the first ranking stage of our multi-stage pipelines, we adopt a sparse ranking function, BM25, and a dense retrieval method, TCT-ColBERT. For the second ranking stage, we adopt a pointwise reranker, MonoT5, and a pairwise reranker, DuoT5. Experiments on both TREC CAsT 2021 and TREC CAsT 2022 datasets show the effectiveness of our mixed-initiative-based query rewriting (or query reformulation) method in improving retrieval performance compared with two popular reformulators: a neural reformulator, CANARD-T5, and a rule-based reformulator, the historical query reformulator (HQE).
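To make the pipeline shape concrete, below is a minimal Python sketch of the flow described above: mixed-initiative query rewriting before first-stage retrieval (BM25 or TCT-ColBERT), followed by pointwise MonoT5-style and pairwise DuoT5-style reranking. Every helper here is a hypothetical placeholder for illustration, not the authors' released code or the actual CAsT components.

```python
# Minimal sketch of the multi-stage pipeline described in the abstract.
# All helpers are hypothetical stand-ins, not the authors' implementation.

def find_ambiguous_spans(query):
    # Placeholder: treat pronouns as the ambiguous spans needing clarification.
    return [w for w in query.split() if w.lower() in {"it", "they", "this", "that"}]

def generate_clarifying_question(span):
    # Placeholder for the paper's question-generation algorithm.
    return f"What does '{span}' refer to?"

def mixed_initiative_rewrite(raw_query, ask_user):
    """Rewrite the raw query by asking the user about each ambiguity
    and substituting the feedback back into the query."""
    rewritten = raw_query
    for span in find_ambiguous_spans(raw_query):
        answer = ask_user(generate_clarifying_question(span))
        rewritten = rewritten.replace(span, answer, 1)
    return rewritten

def retrieve(query, collection, k=1000):
    # Placeholder first stage; stands in for BM25 or TCT-ColBERT retrieval.
    overlap = lambda p: len(set(query.lower().split()) & set(p.lower().split()))
    return sorted(collection, key=overlap, reverse=True)[:k]

def rerank(query, passages, monot5_score, duot5_rerank, cutoff=50):
    # Second stage: pointwise scoring (MonoT5-style), then pairwise refinement (DuoT5-style).
    pointwise = sorted(passages, key=lambda p: monot5_score(query, p), reverse=True)[:cutoff]
    return duot5_rerank(query, pointwise)

def pipeline(raw_query, collection, ask_user, monot5_score, duot5_rerank):
    query = mixed_initiative_rewrite(raw_query, ask_user)
    return rerank(query, retrieve(query, collection), monot5_score, duot5_rerank)

if __name__ == "__main__":
    docs = ["Throat cancer is treated with surgery.", "Bananas are yellow."]
    top = pipeline(
        "Is it treatable?",
        docs,
        ask_user=lambda q: "throat cancer",                               # simulated user feedback
        monot5_score=lambda q, p: len(set(q.split()) & set(p.split())),   # stand-in scorer
        duot5_rerank=lambda q, ps: ps,                                    # stand-in pairwise step
    )
    print(top)
```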
Related papers
- MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models [34.39053202801489]
In a real-world RAG system, the current query often involves spoken ellipses and ambiguous references from dialogue contexts.
We propose a novel query rewriting method MaFeRw, which improves RAG performance by integrating multi-aspect feedback from both the retrieval process and generated results.
Experimental results on two conversational RAG datasets demonstrate that MaFeRw achieves superior generation metrics and more stable training compared to baselines.
arXiv Detail & Related papers (2024-08-30T07:57:30Z) - AdaCQR: Enhancing Query Reformulation for Conversational Search via Sparse and Dense Retrieval Alignment [16.62505706601199]
We present a novel framework AdaCQR for conversational search reformulation.
By aligning reformulation models with both term-based and semantic-based retrieval systems, AdaCQR enhances the generalizability of information-seeking queries.
Experimental results on the TopiOCQA and QReCC datasets demonstrate that AdaCQR outperforms the existing methods in a more efficient framework.
arXiv Detail & Related papers (2024-07-02T05:50:16Z) - A Surprisingly Simple yet Effective Multi-Query Rewriting Method for Conversational Passage Retrieval [14.389703823471574]
We propose the use of a neural query rewriter to generate multiple queries and show how to integrate those queries into the passage retrieval pipeline efficiently.
The main strength of our approach lies in its simplicity: it leverages how the beam search algorithm works and can produce multiple query rewrites at no additional cost (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2024-06-27T07:43:03Z) - Generative Query Reformulation Using Ensemble Prompting, Document Fusion, and Relevance Feedback [8.661419320202787]
GenQREnsemble and GenQRFusion leverage paraphrases of a zero-shot instruction to generate multiple sets of keywords to improve retrieval performance.
We demonstrate that an ensemble of query reformulations can improve retrieval effectiveness by up to 18% on nDCG@10 in pre-retrieval settings and up to 9% in post-retrieval settings.
arXiv Detail & Related papers (2024-05-27T21:03:26Z) - Selecting Query-bag as Pseudo Relevance Feedback for Information-seeking Conversations [76.70349332096693]
Information-seeking dialogue systems are widely used in e-commerce.
We propose a Query-bag based Pseudo Relevance Feedback framework (QB-PRF)
It constructs a query-bag with related queries to serve as pseudo signals to guide information-seeking conversations.
arXiv Detail & Related papers (2024-03-22T08:10:32Z) - SSP: Self-Supervised Post-training for Conversational Search [63.28684982954115]
We propose SSP (Self-Supervised Post-training), a new post-training paradigm with three self-supervised tasks to efficiently initialize the conversational search model.
To verify the effectiveness of the proposed method, we apply the conversational encoder post-trained with SSP to the conversational search task using two benchmark datasets: CAsT-19 and CAsT-20.
arXiv Detail & Related papers (2023-07-02T13:36:36Z) - Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking [56.80065604034095]
We introduce a kNN approach that re-ranks documents based on their similarity with the query and the documents the user considers relevant (a minimal sketch of this idea appears after this list).
To evaluate our different integration strategies, we transform four existing information retrieval datasets into the relevance feedback scenario.
arXiv Detail & Related papers (2022-10-19T16:19:37Z) - Leveraging Query Resolution and Reading Comprehension for Conversational Passage Retrieval [6.490148466525755]
This paper describes the participation of the UvA.ILPS group in the TREC CAsT 2020 track.
Our pipeline consists of (i) an initial retrieval module that uses BM25, and (ii) a re-ranking module that combines the score of a BERT ranking model with the score of a machine comprehension model adjusted for passage retrieval.
arXiv Detail & Related papers (2021-02-17T14:41:57Z) - A Comparison of Question Rewriting Methods for Conversational Passage Retrieval [6.490148466525755]
Conversational passage retrieval relies on question rewriting to modify the original question so that it no longer depends on the conversation history.
Several methods for question rewriting have recently been proposed, but they were compared under different retrieval pipelines.
We bridge this gap by thoroughly evaluating those question rewriting methods on the TREC CAsT 2019 and 2020 datasets under the same retrieval pipeline.
arXiv Detail & Related papers (2021-01-19T00:17:52Z) - Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z) - Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-to-sequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z) - Conversational Question Reformulation via Sequence-to-Sequence Architectures and Pretrained Language Models [56.268862325167575]
This paper presents an empirical study of conversational question reformulation (CQR) with sequence-to-sequence architectures and pretrained language models (PLMs).
We leverage PLMs to address the strong token-to-token independence assumption made in the common objective, maximum likelihood estimation, for the CQR task.
We evaluate fine-tuned PLMs on the recently-introduced CANARD dataset as an in-domain task and validate the models using data from the TREC 2019 CAsT Track as an out-domain task.
arXiv Detail & Related papers (2020-04-04T11:07:54Z)
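For the multi-query rewriting entry above, the following is a minimal sketch of how a single beam-search pass can yield several query rewrites at once. The checkpoint name and input formatting are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch: multiple query rewrites from one beam-search decoding pass.
# Checkpoint and history formatting are assumptions for illustration only.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("castorini/t5-base-canard")
model = AutoModelForSeq2SeqLM.from_pretrained("castorini/t5-base-canard")

# Conversation history followed by the current (ambiguous) turn.
context = "What is throat cancer? ||| Is it treatable?"
inputs = tokenizer(context, return_tensors="pt")

# One beam-search call; keeping every beam yields several rewrites at no extra decoding cost.
outputs = model.generate(**inputs, num_beams=5, num_return_sequences=5, max_new_tokens=64)
for rewrite in tokenizer.batch_decode(outputs, skip_special_tokens=True):
    print(rewrite)
```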
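For the few-shot relevance-feedback re-ranking entry above, here is a minimal sketch of kNN-style re-ranking that scores candidates by similarity to both the query and user-marked relevant documents. The text encoder producing the embeddings is assumed and not specified here.

```python
# Minimal sketch: relevance-feedback re-ranking over precomputed embeddings.
# Vectors are assumed to be unit-normalized outputs of some text encoder (not specified).
import numpy as np

def feedback_rerank(query_vec, candidate_vecs, relevant_vecs, alpha=0.5):
    """Return candidate indices ranked by a mix of query similarity and
    average similarity to the documents the user marked relevant."""
    query_scores = candidate_vecs @ query_vec                    # (n,)
    feedback_scores = (candidate_vecs @ relevant_vecs.T).mean(axis=1)  # (n,)
    scores = alpha * query_scores + (1 - alpha) * feedback_scores
    return np.argsort(-scores)                                   # best candidates first
```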