A Transformer-based Response Evaluator for Open-Domain Spoken
Conversation
- URL: http://arxiv.org/abs/2302.04424v1
- Date: Thu, 9 Feb 2023 03:38:07 GMT
- Title: A Transformer-based Response Evaluator for Open-Domain Spoken
Conversation
- Authors: Vrindavan Harrison and Rishi Rajasekaran and Marilyn Walker
- Abstract summary: We study response selection in the Athena system, an Alexa Prize SocialBot.
We compare several off-the-shelf response ranking methods for open-domain dialogue.
We find that Athena-RR with a Recall@1 of 70.79% outperforms Athena-Heuristic and all of the off-the-shelf rankers by a large margin.
- Score: 1.0474108328884806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many open-domain dialogue systems rely on multiple response generators, any
of which can contribute a response to the dialogue in a particular context.
Thus the ability to compare potential responses and then select the best plays
an important role in ensuring a dialogue system is coherent and engaging.
Dialogue coherence goes beyond simply remaining on topic -- some trivia may be
on topic and engaging when mentioned out of the blue, but may not be coherent
and grounded in the context of the conversation. We carry out experiments on
response selection in the Athena system, an Alexa Prize SocialBot that has
dedicated content and multiple topic-specific response generators for a large
number of topics. First, we collect a corpus of Athena conversations with live
human traffic, where potential responses from all enabled response generators
are logged and subsequently annotated for response quality. We compare several
off-the-shelf response ranking methods for open-domain dialogue to
Athena-Heuristic, a heuristic response ranker that was field-tested in Athena
during the third Alexa Prize competition. We also compare these to a
transformer-based response ranker we call Athena-RR, which we train on our
Athena conversations. Athena-RR uses both the conversational context and the
dialogue state to rank the potential responses. We find that Athena-RR with a
Recall@1 of 70.79% outperforms Athena-Heuristic and all of the off-the-shelf
rankers by a large margin. We then conduct a live A/B study comparing
Athena-Heuristic to Athena-RR in 6,358 conversations with Alexa users. We
show that Athena-RR leads to significantly longer conversations that receive
significantly higher user ratings than the heuristic rule-based ranker.
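
The abstract reports ranker quality as Recall@1 but does not spell out how it is computed. A minimal sketch of one common reading follows, assuming each turn has a list of candidate responses with ranker scores and binary quality labels (1 = acceptable), and that a turn counts as a hit when any of the top-k scored candidates is labeled acceptable. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def recall_at_k(scores: Sequence[float], labels: Sequence[int], k: int = 1) -> float:
    """Return 1.0 if any of the k highest-scored candidates is labeled 1, else 0.0."""
    # Rank candidate indices by descending score; ties keep their original order.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return 1.0 if any(labels[i] == 1 for i in ranked[:k]) else 0.0


def mean_recall_at_k(turns: List[Tuple[Sequence[float], Sequence[int]]], k: int = 1) -> float:
    """Average the per-turn hit indicator over all annotated turns."""
    hits = [recall_at_k(scores, labels, k) for scores, labels in turns]
    return sum(hits) / len(hits)


# Toy data (assumed for illustration): (ranker scores, quality labels) per turn.
turns = [
    ([0.9, 0.2, 0.4], [1, 0, 0]),  # top-scored candidate is acceptable
    ([0.5, 0.8, 0.3], [1, 0, 0]),  # acceptable candidate is ranked second
    ([0.3, 0.2, 0.7], [0, 0, 1]),  # top-scored candidate is acceptable
]

r1 = mean_recall_at_k(turns, k=1)
r2 = mean_recall_at_k(turns, k=2)
```

On this toy data the ranker places an acceptable response first in two of three turns, so Recall@1 is 2/3, while Recall@2 is 1.0 because the remaining acceptable response is ranked second.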
Related papers
- PICK: Polished & Informed Candidate Scoring for Knowledge-Grounded
Dialogue Systems [59.1250765143521]
Current knowledge-grounded dialogue systems often fail to align the generated responses with human-preferred qualities.
We propose Polished & Informed Candidate Scoring (PICK), a generation re-scoring framework.
We demonstrate the effectiveness of PICK in generating responses that are more faithful while keeping them relevant to the dialogue history.
arXiv Detail & Related papers (2023-09-19T08:27:09Z)
- Athena 2.0: Discourse and User Modeling in Open Domain Dialogue [5.434860847606497]
Athena 2.0 is a conversational agent for Amazon's Socialbot Grand Challenge 4.
It uses a knowledge-grounded discourse model to constrain named-entity recognition and linking, and coreference resolution.
It also relies on a user model to personalize topic selection and other aspects of the conversation to individual users.
arXiv Detail & Related papers (2023-08-03T17:30:39Z)
- FCC: Fusing Conversation History and Candidate Provenance for Contextual
Response Ranking in Dialogue Systems [53.89014188309486]
We present a flexible neural framework that can integrate contextual information from multiple channels.
We evaluate our model on the MSDialog dataset widely used for evaluating conversational response ranking tasks.
arXiv Detail & Related papers (2023-03-31T23:58:28Z)
- Let's Get Personal: Personal Questions Improve SocialBot Performance in
the Alexa Prize [0.0]
There has been an increased focus on creating conversational open-domain dialogue systems in the spoken dialogue community.
Unlike traditional dialogue systems, these conversational systems cannot assume any specific information need or domain restrictions.
We developed a robust open-domain conversational system, Athena, that real Amazon Echo users access and evaluate at scale.
arXiv Detail & Related papers (2023-03-09T00:10:29Z)
- Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue
Systems [71.33737787564966]
End-to-end (E2E) task-oriented dialogue (ToD) systems are prone to fall into the so-called 'likelihood trap'.
We propose a reranking method which aims to select high-quality items from the lists of responses initially overgenerated by the system.
Our methods improve a state-of-the-art E2E ToD system by 2.4 BLEU, 3.2 ROUGE, and 2.8 METEOR scores, achieving new peak results.
arXiv Detail & Related papers (2022-11-07T15:59:49Z)
- ProsocialDialog: A Prosocial Backbone for Conversational Agents [104.92776607564583]
We introduce ProsocialDialog, the first large-scale dialogue dataset to teach conversational agents to respond to problematic content following social norms.
Created via a human-AI collaborative framework, ProsocialDialog consists of 58K dialogues, with 331K utterances, 160K RoTs, and 497K dialogue safety labels.
With this dataset, we introduce a dialogue safety detection module, Canary, capable of generating RoTs given conversational context, and a socially-informed dialogue agent, Prost.
arXiv Detail & Related papers (2022-05-25T11:48:47Z)
- Athena 2.0: Contextualized Dialogue Management for an Alexa Prize
SocialBot [3.4000625471791577]
Athena 2.0 is an Alexa Prize SocialBot that has been a finalist in the last two Alexa Prize Grand Challenges.
Here we describe Athena's system design and performance in the 20/21 competition.
arXiv Detail & Related papers (2021-11-03T20:54:20Z)
- Jurassic is (almost) All You Need: Few-Shot Meaning-to-Text Generation
for Open-Domain Dialogue [0.576178320759792]
We utilize Athena's response generators to create training data for two new neural Meaning-to-Text RGs.
We conduct few-shot experiments, both within and cross-domain, with different tuning set sizes.
We show that with 10-shot tuning, Athena-Jurassic's performance is significantly better for coherence and semantic accuracy.
arXiv Detail & Related papers (2021-10-15T13:42:25Z)
- Building and Evaluating Open-Domain Dialogue Corpora with Clarifying
Questions [65.60888490988236]
We release a dataset focused on open-domain single- and multi-turn conversations.
We benchmark several state-of-the-art neural baselines.
We propose a pipeline consisting of offline and online steps for evaluating the quality of clarifying questions in various dialogues.
arXiv Detail & Related papers (2021-09-13T09:16:14Z)
- Athena: Constructing Dialogues Dynamically with Discourse Constraints [11.008755264048522]
This report describes Athena, a dialogue system for spoken conversation on popular topics and current events.
We develop a flexible topic-agnostic approach to dialogue management that dynamically configures dialogue based on general principles of entity and topic coherence.
After describing the dialogue system architecture, we perform an analysis of conversations that Athena participated in during the 2019 Alexa Prize Competition.
arXiv Detail & Related papers (2020-11-21T00:28:34Z)
- Towards Data Distillation for End-to-end Spoken Conversational Question
Answering [65.124088336738]
We propose a new Spoken Conversational Question Answering task (SCQA).
SCQA aims at enabling QA systems to model complex dialogue flows given the speech utterances and text corpora.
Our main objective is to build a QA system to deal with conversational questions both in spoken and text forms.
arXiv Detail & Related papers (2020-10-18T05:53:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.