Related papers: SSP: Self-Supervised Post-training for Conversational Search

SSP: Self-Supervised Post-training for Conversational Search

URL: http://arxiv.org/abs/2307.00569v1
Date: Sun, 2 Jul 2023 13:36:36 GMT
Title: SSP: Self-Supervised Post-training for Conversational Search
Authors: Quan Tu, Shen Gao, Xiaolong Wu, Zhao Cao, Ji-Rong Wen and Rui Yan
Abstract summary: We propose fullmodel (model) which is a new post-training paradigm with three self-supervised tasks to efficiently initialize the conversational search model. To verify the effectiveness of our proposed method, we apply the conversational encoder post-trained by model on the conversational search task using two benchmark datasets: CAsT-19 and CAsT-20.
Score: 63.28684982954115
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Conversational search has been regarded as the next-generation search paradigm. Constrained by data scarcity, most existing methods distill the well-trained ad-hoc retriever to the conversational retriever. However, these methods, which usually initialize parameters by query reformulation to discover contextualized dependency, have trouble in understanding the dialogue structure information and struggle with contextual semantic vanishing. In this paper, we propose \fullmodel (\model) which is a new post-training paradigm with three self-supervised tasks to efficiently initialize the conversational search model to enhance the dialogue structure and contextual semantic understanding. Furthermore, the \model can be plugged into most of the existing conversational models to boost their performance. To verify the effectiveness of our proposed method, we apply the conversational encoder post-trained by \model on the conversational search task using two benchmark datasets: CAsT-19 and CAsT-20. Extensive experiments that our \model can boost the performance of several existing conversational search methods. Our source code is available at \url{https://github.com/morecry/SSP}.

Related papers

HEISIR: Hierarchical Expansion of Inverted Semantic Indexing for Training-free Retrieval of Conversational Data using LLMs [0.3277163122167434]
This paper introduces HEISIR, a novel framework that enhances semantic understanding in conversational data retrieval. Heisir implements a two-step process: (1) Hierarchical Triplets Formulation and (2) Adjunct Augmentation, creating semantic indices consisting of Subject-Verb-Object-Adjunct (SVOA) quadruplets. Our experimental results demonstrate that HEISIR outperforms fine-tuned models across various embedding types and language models.
arXiv Detail & Related papers (2025-03-06T06:39:25Z)
Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach [33.231639257323536]
In this paper, we address the issue of dialogue-form context query within the interactive text-to-image retrieval task. By reformulating the dialogue-form context, we eliminate the necessity of fine-tuning a retrieval model on existing visual dialogue data. We construct the LLM questioner to generate non-redundant questions about the attributes of the target image.
arXiv Detail & Related papers (2024-06-05T16:09:01Z)
Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLM) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning. Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance. We handle the task of conversation retrieval based on text summaries of the conversations. A LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
CONVERSER: Few-Shot Conversational Dense Retrieval with Synthetic Data Generation [32.10366004426449]
We propose CONVERSER, a framework for training conversational dense retrievers with at most 6 examples of in-domain dialogues. We utilize the in-context learning capability of large language models to generate conversational queries given a passage in the retrieval corpus. Experimental results on conversational retrieval benchmarks OR-QuAC and TREC CAsT 19 show that the proposed CONVERSER achieves comparable performance to fully-supervised models.
arXiv Detail & Related papers (2023-09-13T06:40:24Z)
GRASP: Guiding model with RelAtional Semantics using Prompt [3.1275060062551208]
We propose a Guiding model with RelAtional Semantics using Prompt (GRASP) We adopt a prompt-based fine-tuning approach and capture relational semantic clues of a given dialogue with an argument-aware prompt marker strategy. In the experiments, GRASP state-of-the-art performance in terms of both F1 and F1c scores on a DialogRE dataset.
arXiv Detail & Related papers (2022-08-26T08:19:28Z)
KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains. We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for Dialog Response Generation [80.45816053153722]
DialogVED introduces continuous latent variables into the enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses. We conduct experiments on PersonaChat, DailyDialog, and DSTC7-AVSD benchmarks for response generation.
arXiv Detail & Related papers (2022-04-27T16:18:15Z)
In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST) A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates. This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
arXiv Detail & Related papers (2022-03-16T11:58:24Z)
Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting [56.268862325167575]
We tackle conversational passage retrieval (ConvPR) with query reformulation integrated into a multi-stage ad-hoc IR system. We propose two conversational query reformulation (CQR) methods: (1) term importance estimation and (2) neural query rewriting. For the former, we expand conversational queries using important terms extracted from the conversational context with frequency-based signals. For the latter, we reformulate conversational queries into natural, standalone, human-understandable queries with a pretrained sequence-tosequence model.
arXiv Detail & Related papers (2020-05-05T14:30:20Z)
Conversations with Search Engines: SERP-based Conversational Response Generation [77.1381159789032]
We create a suitable dataset, the Search as a Conversation (SaaC) dataset, for the development of pipelines for conversations with search engines. We also develop a state-of-the-art pipeline for conversations with search engines, the Conversations with Search Engines (CaSE) using this dataset. CaSE enhances the state-of-the-art by introducing a supporting token identification module and aprior-aware pointer generator.
arXiv Detail & Related papers (2020-04-29T13:07:53Z)
MLR: A Two-stage Conversational Query Rewriting Model with Multi-task Learning [16.88648782206587]
We propose the conversational query rewriting model - MLR, which is a Multi-task model on sequence Labeling and query Rewriting. MLR reformulates the multi-turn conversational queries into a single turn query, which conveys the true intention of users concisely. To train our model, we construct a new Chinese query rewriting dataset and conduct experiments on it.
arXiv Detail & Related papers (2020-04-13T08:04:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.