Sparse and Dense Approaches for the Full-rank Retrieval of Responses for Dialogues
- URL: http://arxiv.org/abs/2204.10558v1
- Date: Fri, 22 Apr 2022 08:15:15 GMT
- Title: Sparse and Dense Approaches for the Full-rank Retrieval of Responses for Dialogues
- Authors: Gustavo Penha and Claudia Hauff
- Abstract summary: We focus on the more realistic task of full-rank retrieval of responses, where $n$ can be up to millions of responses.
Our findings based on three different information-seeking dialogue datasets reveal that a learned response expansion technique is a solid baseline for sparse retrieval.
We find the best performing method overall to be dense retrieval with intermediate training, followed by fine-tuning on the target conversational data.
- Score: 11.726528038065764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ranking responses for a given dialogue context is a popular benchmark in
which the setup is to re-rank the ground-truth response over a limited set of
$n$ responses, where $n$ is typically 10. The predominance of this setup in
conversation response ranking has led to a great deal of attention being devoted to building
neural re-rankers, while the first-stage retrieval step has been overlooked.
Since the correct answer is always available in the candidate list of $n$
responses, this artificial evaluation setup assumes that there is a first-stage
retrieval step which is always able to rank the correct response in its top-$n$
list. In this paper we focus on the more realistic task of full-rank retrieval
of responses, where $n$ can be up to millions of responses. We investigate both
dialogue context and response expansion techniques for sparse retrieval, as
well as zero-shot and fine-tuned dense retrieval approaches. Our findings based
on three different information-seeking dialogue datasets reveal that a learned
response expansion technique is a solid baseline for sparse retrieval. We find
the best performing method overall to be dense retrieval with intermediate
training, i.e. a step after the language model pre-training where sentence
representations are learned, followed by fine-tuning on the target
conversational data. We also investigate the intriguing phenomenon that harder
negative sampling techniques lead to worse results for the fine-tuned dense
retrieval models. The code and datasets are available at
https://github.com/Guzpenha/transformer_rankers/tree/full_rank_retrieval_dialogues.
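As a concrete illustration of the sparse baseline discussed in the abstract, the sketch below indexes candidate responses under BM25 and optionally appends extra terms to a response before indexing, standing in for a learned response-expansion model. The toy data, expansion terms, and function names are illustrative assumptions, not the paper's actual pipeline.

```python
import math
from collections import Counter

def bm25_rank(context, responses, expansions=None, k1=1.5, b=0.75):
    """Rank candidate responses for a dialogue context with BM25.

    `expansions` maps a response index to extra terms that are appended
    before indexing -- a stand-in for a learned response-expansion model.
    """
    docs = []
    for i, resp in enumerate(responses):
        terms = resp.lower().split()
        if expansions and i in expansions:
            terms += expansions[i].lower().split()
        docs.append(terms)
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency of each term across the response collection.
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for t in context.lower().split():
            if t not in tf:
                continue
            idf = math.log((n - df[t] + 0.5) / (df[t] + 0.5) + 1.0)
            norm = tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
            score += idf * norm
        scores.append((score, i))
    return [i for _, i in sorted(scores, reverse=True)]
```

Expanding a response with likely user phrasings (e.g., adding "reset password login" to a password-help answer) lets the sparse index match dialogue contexts whose vocabulary differs from the response itself, which is the intuition behind learned response expansion.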
Related papers
- Effective and Efficient Conversation Retrieval for Dialogue State Tracking with Implicit Text Summaries [48.243879779374836]
Few-shot dialogue state tracking (DST) with Large Language Models (LLM) relies on an effective and efficient conversation retriever to find similar in-context examples for prompt learning.
Previous works use raw dialogue context as search keys and queries, and a retriever is fine-tuned with annotated dialogues to achieve superior performance.
We handle the task of conversation retrieval based on text summaries of the conversations.
An LLM-based conversation summarizer is adopted for query and key generation, which enables effective maximum inner product search.
arXiv Detail & Related papers (2024-02-20T14:31:17Z)
- SSP: Self-Supervised Post-training for Conversational Search [63.28684982954115]
We propose SSP (Self-Supervised Post-training), a new post-training paradigm with three self-supervised tasks to efficiently initialize the conversational search model.
To verify the effectiveness of the proposed method, we apply the conversational encoder post-trained with SSP to the conversational search task on two benchmark datasets: CAsT-19 and CAsT-20.
arXiv Detail & Related papers (2023-07-02T13:36:36Z)
- Phrase Retrieval for Open-Domain Conversational Question Answering with Conversational Dependency Modeling via Contrastive Learning [54.55643652781891]
Open-Domain Conversational Question Answering (ODConvQA) aims at answering questions through a multi-turn conversation.
We propose a method to directly predict answers with a phrase retrieval scheme for a sequence of words.
arXiv Detail & Related papers (2023-06-07T09:46:38Z)
- Re$^3$Dial: Retrieve, Reorganize and Rescale Dialogue Corpus for Long-Turn Open-Domain Dialogue Pre-training [90.3412708846419]
Most dialogues in existing pre-training corpora contain fewer than three turns of dialogue.
We propose the Retrieve, Reorganize and Rescale framework (Re$^3$Dial) to automatically construct billion-scale long-turn dialogues.
By repeating the above process, Re$3$Dial can yield a coherent long-turn dialogue.
arXiv Detail & Related papers (2023-05-04T07:28:23Z)
- Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems [71.33737787564966]
End-to-end (E2E) task-oriented dialogue (ToD) systems are prone to fall into the so-called 'likelihood trap'.
We propose a reranking method which aims to select high-quality items from the lists of responses initially overgenerated by the system.
Our methods improve a state-of-the-art E2E ToD system by 2.4 BLEU, 3.2 ROUGE, and 2.8 METEOR scores, achieving new peak results.
arXiv Detail & Related papers (2022-11-07T15:59:49Z)
- A Systematic Evaluation of Response Selection for Open Domain Dialogue [36.88551817451512]
We curated a dataset where responses from multiple response generators produced for the same dialog context are manually annotated as appropriate (positive) and inappropriate (negative).
We conduct a systematic evaluation of state-of-the-art methods for response selection, and demonstrate that both strategies of using multiple positive candidates and using manually verified hard negative candidates can bring in significant performance improvement in comparison to using the adversarial training data, e.g., increase of 3% and 13% in Recall@1 score, respectively.
arXiv Detail & Related papers (2022-08-08T19:33:30Z)
- Exploring Dense Retrieval for Dialogue Response Selection [42.89426092886912]
We present a solution to directly select proper responses from a large corpus or even a nonparallel corpus, using a dense retrieval model.
In the re-rank setting, its superiority is quite surprising given its simplicity. In the full-rank setting, we emphasize that we are the first to conduct such an evaluation.
arXiv Detail & Related papers (2021-10-13T10:10:32Z)
- Generating Dialogue Responses from a Semantic Latent Space [75.18449428414736]
We propose an alternative to the end-to-end classification on vocabulary.
We learn the pair relationship between the prompts and responses as a regression task on a latent space.
Human evaluation showed that learning the task on a continuous space can generate responses that are both relevant and informative.
arXiv Detail & Related papers (2020-10-04T19:06:16Z)
- Dialogue Response Ranking Training with Large-Scale Human Feedback Data [52.12342165926226]
We leverage social media feedback data to build a large-scale training dataset for feedback prediction.
We trained DialogRPT, a set of GPT-2 based models on 133M pairs of human feedback data.
Our ranker outperforms the conventional dialog perplexity baseline by a large margin on predicting Reddit feedback.
arXiv Detail & Related papers (2020-09-15T10:50:05Z)
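Several of the retrieval approaches above, from the summary-based conversation retriever to the dense response retrievers, ultimately score candidates by inner product between a query embedding and precomputed key embeddings. A minimal sketch of that ranking step, using placeholder vectors rather than embeddings produced by any actual encoder:

```python
def mips(query_vec, key_vecs, top_k=1):
    """Return indices of the top_k key vectors by inner product with the query.

    Exhaustive search for clarity; production systems replace this with an
    approximate nearest-neighbour index over millions of response embeddings.
    """
    scores = [(sum(q * k for q, k in zip(query_vec, key)), i)
              for i, key in enumerate(key_vecs)]
    scores.sort(reverse=True)
    return [i for _, i in scores[:top_k]]
```

Because scoring is a plain dot product, the expensive encoding of candidate responses happens once offline, and only the dialogue context (or its summary) is encoded at query time.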
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.