Automatic Evaluation and Moderation of Open-domain Dialogue Systems
- URL: http://arxiv.org/abs/2111.02110v1
- Date: Wed, 3 Nov 2021 10:08:05 GMT
- Title: Automatic Evaluation and Moderation of Open-domain Dialogue Systems
- Authors: Chen Zhang and João Sedoc and Luis Fernando D'Haro and Rafael Banchs and Alexander Rudnicky
- Abstract summary: A long-standing challenge facing researchers is the lack of effective automatic evaluation metrics.
This paper describes the data, baselines, and results obtained for Track 5 of the Tenth Dialogue System Technology Challenge (DSTC10).
- Score: 59.305712262126264
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, dialogue systems have attracted significant interest in
both academia and industry. In particular, the discipline of open-domain dialogue
systems, a.k.a. chatbots, has gained great momentum. Yet a long-standing
challenge facing researchers is the lack of effective automatic evaluation
metrics, which significantly impedes current research. Common practice in
assessing the performance of open-domain dialogue models involves extensive
human evaluation of the final deployed models, which is both time- and
cost-intensive. Moreover, a recent trend in building open-domain chatbots
involves pre-training dialogue models on large amounts of social media
conversation data. However, the information contained in social media
conversations may be offensive and inappropriate. Indiscriminate use of such
data can result in insensitive and toxic generative models. This paper
describes the data, baselines, and results obtained for Track 5 of the Tenth
Dialogue System Technology Challenge (DSTC10).
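As context for the evaluation task above: automatic dialogue metrics such as those benchmarked in DSTC10 Track 5 are typically validated by how strongly their scores correlate with human quality annotations over a set of responses. Below is a minimal, hypothetical sketch of that validation step; the score arrays are illustrative placeholders, not data from the challenge.

```python
# Minimal sketch: validating an automatic dialogue metric against
# human judgments, as is standard in DSTC-style evaluation tracks.
# The scores below are hypothetical placeholders, not DSTC10 data.
from scipy.stats import pearsonr, spearmanr

# One score per dialogue response, from the automatic metric under test.
metric_scores = [0.72, 0.41, 0.88, 0.15, 0.63]
# Mean human quality ratings for the same responses (e.g., a 1-5
# Likert scale rescaled to [0, 1]).
human_scores = [0.80, 0.40, 0.90, 0.20, 0.55]

# A metric is considered effective when these correlations are high.
rho, _ = spearmanr(metric_scores, human_scores)
r, _ = pearsonr(metric_scores, human_scores)
print(f"Spearman rho = {rho:.3f}, Pearson r = {r:.3f}")
```

Spearman correlation is the more common headline number in such tracks because it depends only on rank order and is robust to differences in metric scale.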
Related papers
- Data Augmentation for Conversational AI [17.48107304359591]
Data augmentation (DA) is an effective approach to alleviating the data scarcity problem in conversational systems.
This tutorial provides a comprehensive and up-to-date overview of DA approaches in the context of conversational systems.
arXiv Detail & Related papers (2023-09-09T09:56:35Z) - Overview of Robust and Multilingual Automatic Evaluation Metrics for
Open-Domain Dialogue Systems at DSTC 11 Track 4 [51.142614461563184]
This track in the 11th Dialogue System Technology Challenge (DSTC11) is part of the ongoing effort to promote robust and multilingual automatic evaluation metrics.
This article describes the datasets and baselines provided to participants and discusses the submission and result details of the two proposed subtasks.
arXiv Detail & Related papers (2023-06-22T10:50:23Z) - GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z) - Building a Role Specified Open-Domain Dialogue System Leveraging
Large-Scale Language Models [15.062014096238803]
We study the challenge of imposing roles on open-domain dialogue systems.
We propose an efficient data collection framework for building a role-satisfying dialogue dataset from scratch.
Our models produce few out-of-bounds utterances while maintaining competitive performance on general metrics.
arXiv Detail & Related papers (2022-04-30T06:23:06Z) - EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with
Large-Scale Pre-Training [73.98154158068134]
We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and code publicly available.
arXiv Detail & Related papers (2022-03-17T13:33:17Z) - A Review of Dialogue Systems: From Trained Monkeys to Stochastic Parrots [0.0]
We aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans.
We present a broad overview of methods developed to build dialogue systems over the years.
arXiv Detail & Related papers (2021-11-02T08:07:55Z) - Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired
Data [61.71319905364992]
We propose a novel data augmentation method for training open-domain dialogue models by utilizing unpaired data.
A data-level distillation process is first proposed to construct augmented dialogues where both the post and the response are retrieved from the unpaired data.
A ranking module is employed to filter out low-quality dialogues.
A model-level distillation process is then employed to distill a teacher model trained on high-quality paired data to the augmented dialogue pairs.
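A rough sketch of the data-level stage just described, assuming a generic sentence encoder and ranking function; all names here (encode, rank_quality, the threshold) are hypothetical illustrations, not the paper's actual implementation.

```python
# Illustrative sketch of data-level distillation: pair each post with
# a response retrieved from unpaired data, then keep only the pairs
# that a ranking module scores highly.
import numpy as np

def retrieve_response(post_vec, response_vecs, responses):
    # Nearest-neighbour retrieval by cosine similarity.
    sims = response_vecs @ post_vec / (
        np.linalg.norm(response_vecs, axis=1) * np.linalg.norm(post_vec) + 1e-8
    )
    return responses[int(np.argmax(sims))]

def augment(posts, responses, encode, rank_quality, threshold=0.5):
    # encode and rank_quality are hypothetical callables: a sentence
    # encoder and a learned quality ranker, respectively.
    response_vecs = np.stack([encode(r) for r in responses])
    pairs = [(p, retrieve_response(encode(p), response_vecs, responses))
             for p in posts]
    # The ranking module filters out low-quality constructed dialogues.
    return [(p, r) for p, r in pairs if rank_quality(p, r) >= threshold]
```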
arXiv Detail & Related papers (2020-09-20T13:06:38Z) - Learning an Unreferenced Metric for Online Dialogue Evaluation [53.38078951628143]
We propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances.
We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference.
arXiv Detail & Related papers (2020-05-01T20:01:39Z)
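To illustrate the idea of a reference-free score in the spirit of that last paper: embed the dialogue context and the candidate response with a pre-trained encoder and score their compatibility directly, so no ground-truth response is needed at inference time. In the sketch below, encode is a hypothetical stand-in for any sentence encoder, and cosine similarity is a crude proxy for the learned scoring model the paper actually trains.

```python
# Sketch of an unreferenced (reference-free) evaluation score: no
# true response is required, only the context and the candidate.
import numpy as np

def unreferenced_score(context, response, encode):
    # encode is a hypothetical sentence encoder returning a 1-D vector.
    c, r = encode(context), encode(response)
    # Cosine similarity of the latent representations as a stand-in
    # for a trained context-response compatibility head.
    return float(c @ r / (np.linalg.norm(c) * np.linalg.norm(r) + 1e-8))
```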