Aligning Query Representation with Rewritten Query and Relevance Judgments in Conversational Search
- URL: http://arxiv.org/abs/2407.20189v1
- Date: Mon, 29 Jul 2024 17:14:36 GMT
- Title: Aligning Query Representation with Rewritten Query and Relevance Judgments in Conversational Search
- Authors: Fengran Mo, Chen Qu, Kelong Mao, Yihong Wu, Zhan Su, Kaiyu Huang, Jian-Yun Nie,
- Abstract summary: We leverage both rewritten queries and relevance judgments in the conversational search data to train a better query representation model.
The proposed model -- Query Representation Alignment Conversational Retriever, QRACDR, is tested on eight datasets.
- Score: 32.35446999027349
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational search supports multi-turn user-system interactions to solve complex information needs. Different from the traditional single-turn ad-hoc search, conversational search encounters a more challenging problem of context-dependent query understanding with the lengthy and long-tail conversational history context. While conversational query rewriting methods leverage explicit rewritten queries to train a rewriting model to transform the context-dependent query into a stand-stone search query, this is usually done without considering the quality of search results. Conversational dense retrieval methods use fine-tuning to improve a pre-trained ad-hoc query encoder, but they are limited by the conversational search data available for training. In this paper, we leverage both rewritten queries and relevance judgments in the conversational search data to train a better query representation model. The key idea is to align the query representation with those of rewritten queries and relevant documents. The proposed model -- Query Representation Alignment Conversational Dense Retriever, QRACDR, is tested on eight datasets, including various settings in conversational search and ad-hoc search. The results demonstrate the strong performance of QRACDR compared with state-of-the-art methods, and confirm the effectiveness of representation alignment.
Related papers
- Conversational Query Reformulation with the Guidance of Retrieved Documents [4.438698005789677]
We introduce GuideCQR, a framework that refines queries by leveraging key infFormation from the initially retrieved documents.
We show that GuideCQR can get additional performance gains in conversational search using various types of queries, even for queries written by humans.
arXiv Detail & Related papers (2024-07-17T07:39:16Z) - Query-oriented Data Augmentation for Session Search [71.84678750612754]
We propose query-oriented data augmentation to enrich search logs and empower the modeling.
We generate supplemental training pairs by altering the most important part of a search context.
We develop several strategies to alter the current query, resulting in new training data with varying degrees of difficulty.
arXiv Detail & Related papers (2024-07-04T08:08:33Z) - Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu)
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z) - Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLM) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
A LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
arXiv Detail & Related papers (2024-02-20T14:31:17Z) - SSP: Self-Supervised Post-training for Conversational Search [63.28684982954115]
We propose fullmodel (model) which is a new post-training paradigm with three self-supervised tasks to efficiently initialize the conversational search model.
To verify the effectiveness of our proposed method, we apply the conversational encoder post-trained by model on the conversational search task using two benchmark datasets: CAsT-19 and CAsT-20.
arXiv Detail & Related papers (2023-07-02T13:36:36Z) - Learning to Relate to Previous Turns in Conversational Search [26.931718474500652]
An effective way to improve retrieval effectiveness is to expand the current query with historical queries.
We propose a new method to select relevant historical queries that are useful for the current query.
arXiv Detail & Related papers (2023-06-05T03:00:10Z) - ConvGQR: Generative Query Reformulation for Conversational Search [37.54018632257896]
ConvGQR is a new framework to reformulate conversational queries based on generative pre-trained language models.
We propose a knowledge infusion mechanism to optimize both query reformulation and retrieval.
arXiv Detail & Related papers (2023-05-25T01:45:06Z) - Decoding a Neural Retriever's Latent Space for Query Suggestion [28.410064376447718]
We show that it is possible to decode a meaningful query from its latent representation and, when moving in the right direction in latent space, to decode a query that retrieves the relevant paragraph.
We employ the query decoder to generate a large synthetic dataset of query reformulations for MSMarco.
On this data, we train a pseudo-relevance feedback (PRF) T5 model for the application of query suggestion.
arXiv Detail & Related papers (2022-10-21T16:19:31Z) - Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z) - Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term
Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system.
We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting.
For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals.
For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-tosequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.