The Interplay of Task Success and Dialogue Quality: An in-depth
Evaluation in Task-Oriented Visual Dialogues
- URL: http://arxiv.org/abs/2103.11151v1
- Date: Sat, 20 Mar 2021 10:13:30 GMT
- Title: The Interplay of Task Success and Dialogue Quality: An in-depth
Evaluation in Task-Oriented Visual Dialogues
- Authors: Alberto Testoni, Raffaella Bernardi
- Abstract summary: We show that in the popular end-to-end approach, choosing the best model by its task success prevents the model from learning to generate linguistically richer dialogues.
We show that in GuessWhat, models could increase their accuracy if they also learn to ground, encode, and decode words that occur infrequently in the training set.
- Score: 6.02280861819024
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When training a model on referential dialogue guessing games, the best model
is usually chosen based on its task success. We show that in the popular
end-to-end approach, this choice prevents the model from learning to generate
linguistically richer dialogues, since the acquisition of language proficiency
takes longer than learning the guessing task. By comparing models playing
different games (GuessWhat, GuessWhich, and Mutual Friends), we show that this
discrepancy is model- and task-agnostic. We investigate whether and when better
language quality could lead to higher task success. We show that in GuessWhat,
models could increase their accuracy if they learn to ground, encode, and
decode also words that do not occur frequently in the training set.
Related papers
- Don't Copy the Teacher: Data and Model Challenges in Embodied Dialogue [92.01165203498299]
Embodied dialogue instruction following requires an agent to complete a complex sequence of tasks from a natural language exchange.
This paper argues that imitation learning (IL) and related low-level metrics are actually misleading and do not align with the goals of embodied dialogue research.
arXiv Detail & Related papers (2022-10-10T05:51:40Z)
- DialogZoo: Large-Scale Dialog-Oriented Task Learning [52.18193690394549]
We aim to build a unified foundation model which can solve massive diverse dialogue tasks.
To achieve this goal, we first collect a large-scale well-labeled dialogue dataset from 73 publicly available datasets.
arXiv Detail & Related papers (2022-05-25T11:17:16Z)
- Context-Aware Language Modeling for Goal-Oriented Dialogue Systems [84.65707332816353]
We formulate goal-oriented dialogue as a partially observed Markov decision process.
We derive a simple and effective method to finetune language models in a goal-aware way.
We evaluate our method on a practical flight-booking task using AirDialogue.
arXiv Detail & Related papers (2022-04-18T17:23:11Z)
- Knowledge Injection into Dialogue Generation via Language Models [85.65843021510521]
InjK is a two-stage approach to inject knowledge into a dialogue generation model.
First, we train a large-scale language model and query it as textual knowledge.
Second, we frame a dialogue generation model to sequentially generate textual knowledge and a corresponding response.
arXiv Detail & Related papers (2020-04-30T07:31:24Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
- Modality-Balanced Models for Visual Dialogue [102.35406085738325]
The Visual Dialog task requires a model to exploit both image and conversational context information to generate the next response to the dialogue.
We show that previous joint-modality (history and image) models over-rely on and are more prone to memorizing the dialogue history.
We present methods for this integration of the two models, via ensemble and consensus dropout fusion with shared parameters.
arXiv Detail & Related papers (2020-01-17T14:57:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.