Learning to Relate to Previous Turns in Conversational Search
- URL: http://arxiv.org/abs/2306.02553v1
- Date: Mon, 5 Jun 2023 03:00:10 GMT
- Title: Learning to Relate to Previous Turns in Conversational Search
- Authors: Fengran Mo, Jian-Yun Nie, Kaiyu Huang, Kelong Mao, Yutao Zhu, Peng Li,
Yang Liu
- Abstract summary: An effective way to improve retrieval effectiveness is to expand the current query with historical queries.
We propose a new method to select relevant historical queries that are useful for the current query.
- Score: 26.931718474500652
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conversational search allows a user to interact with a search system in
multiple turns. A query is strongly dependent on the conversation context. An
effective way to improve retrieval effectiveness is to expand the current query
with historical queries. However, not all previous queries are related to the
current query or useful for expanding it. In this paper, we propose a new
method to select relevant historical queries that are useful for the current
query. To cope with the lack of labeled training data, we use a pseudo-labeling
approach to annotate useful historical queries based on their impact on the
retrieval results. The pseudo-labeled data are used to train a selection model.
We further propose a multi-task learning framework to jointly train the
selector and the retriever during fine-tuning, allowing us to mitigate the
possible inconsistency between the pseudo labels and the changed retriever.
Extensive experiments on four conversational search datasets demonstrate the
effectiveness and broad applicability of our method compared with several
strong baselines.
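The pseudo-labeling step described in the abstract can be pictured with a minimal sketch. Everything below is hypothetical scaffolding rather than the paper's implementation: `retrieve` and `metric` stand in for the actual retriever and evaluation measure, and a historical query is labeled useful when expanding the current query with it improves the metric.

```python
# Hypothetical sketch of pseudo-labeling historical queries by their impact
# on retrieval quality; retrieve/metric stand in for a real retriever and a
# real relevance measure.

def pseudo_label_history(current_query, history, retrieve, metric):
    """Label each historical query 1 (useful) or 0 (not useful) by whether
    expanding the current query with it improves the retrieval metric."""
    base = metric(retrieve(current_query))
    labels = []
    for past_query in history:
        expanded = past_query + " " + current_query
        labels.append(1 if metric(retrieve(expanded)) > base else 0)
    return labels

# Toy demonstration: a word-overlap "retriever" over a two-document corpus,
# with reciprocal rank of the gold document as the metric.
corpus = {
    "d_python": "python programming language",
    "d_france": "france capital paris",
}

def retrieve(query):
    words = set(query.split())
    return sorted(corpus, key=lambda d: -len(words & set(corpus[d].split())))

def metric(ranking, gold="d_france"):
    return 1.0 / (ranking.index(gold) + 1)

labels = pseudo_label_history(
    "what about it",
    ["tell me about france", "python tutorial"],
    retrieve,
    metric,
)
print(labels)  # the France turn helps the ambiguous query, the Python turn does not
```

The labels produced this way are then used as (noisy) supervision for the selection model, which is why the paper pairs them with joint selector-retriever fine-tuning.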
Related papers
- Aligning Query Representation with Rewritten Query and Relevance Judgments in Conversational Search [32.35446999027349]
We leverage both rewritten queries and relevance judgments in the conversational search data to train a better query representation model.
The proposed model -- Query Representation Alignment Conversational Retriever, QRACDR, is tested on eight datasets.
arXiv Detail & Related papers (2024-07-29T17:14:36Z)
- Query-oriented Data Augmentation for Session Search [71.84678750612754]
We propose query-oriented data augmentation to enrich search logs and empower the modeling.
We generate supplemental training pairs by altering the most important part of a search context.
We develop several strategies to alter the current query, resulting in new training data with varying degrees of difficulty.
arXiv Detail & Related papers (2024-07-04T08:08:33Z)
- A Surprisingly Simple yet Effective Multi-Query Rewriting Method for Conversational Passage Retrieval [14.389703823471574]
We propose the use of a neural query rewriter to generate multiple queries and show how to integrate those queries in the passage retrieval pipeline efficiently.
The main strength of our approach lies in its simplicity: it leverages how the beam search algorithm works and can produce multiple query rewrites at no additional cost.
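The "multiple rewrites at no additional cost" point can be illustrated with a toy beam search. This is not the paper's neural rewriter: `next_token_probs` below is a hand-coded stand-in for a generation model, and the point is only that a beam of size k yields k candidate rewrites from a single decoding pass.

```python
import math

def beam_search(next_token_probs, start, steps, beam_size):
    """Toy beam search decoder: keep the beam_size best partial sequences
    at each step and return all surviving beams, so k rewrites fall out of
    one decoding pass instead of k separate ones."""
    beams = [(0.0, [start])]  # (cumulative negative log prob, tokens)
    for _ in range(steps):
        candidates = []
        for score, seq in beams:
            for token, prob in next_token_probs(seq).items():
                candidates.append((score - math.log(prob), seq + [token]))
        candidates.sort(key=lambda c: c[0])  # lower cost = more likely
        beams = candidates[:beam_size]
    return [seq for _, seq in beams]

# Hand-coded stand-in "model": always prefers "paris" over "france".
def next_token_probs(seq):
    return {"paris": 0.6, "france": 0.4}

rewrites = beam_search(next_token_probs, "capital", steps=2, beam_size=2)
print(rewrites)  # two distinct rewrites from one search
```

In the actual pipeline the surviving beams are full query rewrites that are all fed to the passage retriever.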
arXiv Detail & Related papers (2024-06-27T07:43:03Z)
- Database-Augmented Query Representation for Information Retrieval [59.57065228857247]
We present a novel retrieval framework called Database-Augmented Query representation (DAQu)
DAQu augments the original query with various (query-related) metadata across multiple tables.
We validate DAQu in diverse retrieval scenarios that can incorporate metadata from the relational database.
arXiv Detail & Related papers (2024-06-23T05:02:21Z)
- Toward Conversational Agents with Context and Time Sensitive Long-term Memory [8.085414868117917]
Until recently, most work on RAG has focused on information retrieval from large databases of texts, like Wikipedia.
We argue that effective retrieval from long-form conversational data faces two unique problems compared to static database retrieval.
We generate a new dataset of ambiguous and time-based questions that build upon a recent dataset of long-form, simulated conversations.
arXiv Detail & Related papers (2024-05-29T18:19:46Z)
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLM) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
An LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
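The retrieval step this enables is standard maximum inner product search: embed the summary of the current conversation as a query vector and score it against precomputed key vectors. A minimal exact-search sketch follows; the names and 2-d "embeddings" are made up, and real systems use a learned encoder with an approximate index.

```python
def mips(query_vec, keys):
    """Exact maximum inner product search: return the key whose vector has
    the largest dot product with the query vector."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return max(keys, key=lambda name: dot(query_vec, keys[name]))

# Made-up summary embeddings for two stored conversations.
keys = {
    "conv_booking": [0.9, 0.1],
    "conv_weather": [0.1, 0.9],
}
print(mips([0.8, 0.2], keys))  # the most similar stored conversation
```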
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
- ConvGQR: Generative Query Reformulation for Conversational Search [37.54018632257896]
ConvGQR is a new framework to reformulate conversational queries based on generative pre-trained language models.
We propose a knowledge infusion mechanism to optimize both query reformulation and retrieval.
arXiv Detail & Related papers (2023-05-25T01:45:06Z)
- Query Rewriting for Retrieval-Augmented Large Language Models [139.242907155883]
Large Language Models (LLMs) act as powerful, black-box readers in the retrieve-then-read pipeline.
This work introduces a new framework, Rewrite-Retrieve-Read, in place of the previous retrieve-then-read pipeline for retrieval-augmented LLMs.
arXiv Detail & Related papers (2023-05-23T17:27:50Z)
- CAPSTONE: Curriculum Sampling for Dense Retrieval with Document Expansion [68.19934563919192]
We propose a curriculum sampling strategy that utilizes pseudo queries during training and progressively enhances the relevance between the generated query and the real query.
Experimental results on both in-domain and out-of-domain datasets demonstrate that our approach outperforms previous dense retrieval models.
arXiv Detail & Related papers (2022-12-18T15:57:46Z)
- Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)
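QuReTeC's core idea, deciding per historical term whether it belongs in the current query, can be sketched with a stand-in classifier. `keep_term` below is a hypothetical rule for illustration only; in the paper this decision is made by a trained bidirectional transformer.

```python
def resolve_query(current_query, history_terms, keep_term):
    """QuReTeC-style query resolution sketch: append every historical term
    that the binary term classifier decides to keep."""
    kept = [t for t in history_terms if keep_term(t, current_query)]
    return current_query.split() + kept

# Stand-in classifier: keep terms not already present in the current query.
def keep_term(term, current_query):
    return term not in current_query.split()

resolved = resolve_query("what is the capital", ["france", "the"], keep_term)
print(resolved)  # the elided topic term from history is restored
```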
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.