MultiWOZ 2.2 : A Dialogue Dataset with Additional Annotation Corrections
and State Tracking Baselines
- URL: http://arxiv.org/abs/2007.12720v1
- Date: Fri, 10 Jul 2020 22:52:14 GMT
- Authors: Xiaoxue Zang, Abhinav Rastogi, Srinivas Sunkara, Raghav Gupta, Jianguo
Zhang, Jindong Chen
- Abstract summary: This work introduces MultiWOZ 2.2, yet another improved version of the MultiWOZ dataset.
Firstly, we identify and fix dialogue state annotation errors across 17.3% of the utterances on top of MultiWOZ 2.1.
Secondly, we redefine the ontology by disallowing vocabularies of slots with a large number of possible values.
- Score: 15.540213987132839
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: MultiWOZ is a well-known task-oriented dialogue dataset containing over
10,000 annotated dialogues spanning 8 domains. It is extensively used as a
benchmark for dialogue state tracking. However, recent works have reported the
presence of substantial noise in the dialogue state annotations. MultiWOZ 2.1
identified and fixed many of these erroneous annotations and user utterances,
resulting in an improved version of this dataset. This work introduces MultiWOZ
2.2, yet another improved version of this dataset. Firstly, we
identify and fix dialogue state annotation errors across 17.3% of the
utterances on top of MultiWOZ 2.1. Secondly, we redefine the ontology by
disallowing vocabularies of slots with a large number of possible values (e.g.,
restaurant name, time of booking). In addition, we introduce slot span
annotations for these slots to standardize them across recent models, which
previously used custom string matching heuristics to generate them. We also
benchmark a few state-of-the-art dialogue state tracking models on the
corrected dataset to facilitate comparison for future work. Finally, we
discuss best practices for dialogue data collection that can help avoid
annotation errors.
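The distinction the abstract draws between slots with a closed vocabulary and slots annotated as spans can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `SlotSpan` class, the `start` / `exclusive_end` field names, the example utterance, and the tiny `CATEGORICAL_VALUES` table are hypothetical and are not taken from the released dataset files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SlotSpan:
    """Hypothetical span annotation: character offsets into one utterance."""
    slot: str
    start: int          # index of the first character of the value
    exclusive_end: int  # index one past the last character of the value


def span_value(utterance: str, span: SlotSpan) -> str:
    """Recover a slot value directly from its annotated span."""
    return utterance[span.start:span.exclusive_end]


# Illustrative closed vocabulary for a slot with few possible values.
CATEGORICAL_VALUES = {"hotel-parking": {"yes", "no"}}


def is_valid_categorical(slot: str, value: str) -> bool:
    """Check a value against the closed vocabulary of a categorical slot."""
    return value in CATEGORICAL_VALUES.get(slot, set())


# Made-up utterance; the offsets are computed here only to build the example.
utterance = "I need a table at Pizza Hut for 18:30."
name_start = utterance.index("Pizza Hut")
span = SlotSpan("restaurant-name", name_start, name_start + len("Pizza Hut"))
```

With an annotation of this shape, a model or evaluation script reads the value by slicing the utterance, so no model-specific string matching is needed to locate open-vocabulary values such as restaurant names or booking times.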
Related papers
- SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation [55.82577086422923]
We provide a feasible definition of dialogue segmentation points with the help of document-grounded dialogues.
We release a large-scale supervised dataset called SuperDialseg, containing 9,478 dialogues.
We also provide a benchmark including 18 models across five categories for the dialogue segmentation task.
arXiv Detail & Related papers (2023-05-15T06:08:01Z)
- Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding [103.94325597273316]
We present a novel approach that iterates on augmentation quality by applying weakly-supervised filters.
We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue.
For DailyDialog specifically, using 10% of the ground truth data we outperform the current state-of-the-art model which uses 100% of the data.
arXiv Detail & Related papers (2022-10-25T17:01:30Z)
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
- Cross-Lingual Dialogue Dataset Creation via Outline-Based Generation [70.81596088969378]
Cross-lingual Outline-based Dialogue dataset (termed COD) enables natural language understanding.
COD enables dialogue state tracking, and end-to-end dialogue modelling and evaluation in 4 diverse languages.
arXiv Detail & Related papers (2022-01-31T18:11:21Z)
- Contextual Semantic Parsing for Multilingual Task-Oriented Dialogues [7.8378818005171125]
Given a large-scale dialogue dataset in one language, we can automatically produce an effective semantic parser for other languages using machine translation.
We propose automatic translation of dialogue datasets with alignment to ensure faithful translation of slot values.
We show that the succinct representation reduces the compounding effect of translation errors.
arXiv Detail & Related papers (2021-11-04T01:08:14Z)
- Annotation Inconsistency and Entity Bias in MultiWOZ [40.127114829948965]
MultiWOZ is one of the most popular multi-domain task-oriented dialog datasets.
It has been widely accepted as a benchmark for various dialog tasks, e.g., dialog state tracking (DST), natural language generation (NLG), and end-to-end (E2E) dialog modeling.
arXiv Detail & Related papers (2021-05-29T00:09:06Z)
- MultiWOZ 2.4: A Multi-Domain Task-Oriented Dialogue Dataset with Essential Annotation Corrections to Improve State Tracking Evaluation [22.642643471824076]
This work introduces MultiWOZ 2.4, in which we refine all annotations in the validation set and test set on top of MultiWOZ 2.1.
The annotations in the training set remain unchanged to encourage robust and noise-resilient model training.
We further benchmark 8 state-of-the-art dialogue state tracking models.
arXiv Detail & Related papers (2021-04-01T21:31:48Z)
- RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling [35.75880078666584]
RiSAWOZ is a large-scale multi-domain Chinese Wizard-of-Oz dataset with Rich Semantic Annotations.
It contains 11.2K human-to-human (H2H) multi-turn semantically annotated dialogues, with more than 150K utterances spanning over 12 domains.
arXiv Detail & Related papers (2020-10-17T08:18:59Z)
- MultiWOZ 2.3: A multi-domain task-oriented dialogue dataset enhanced with annotation corrections and co-reference annotation [46.05021601314733]
Dialogue state annotations are error-prone, leading to sub-optimal performance.
We introduce MultiWOZ 2.3, in which we differentiate incorrect annotations in dialogue acts from dialogue states.
We implement co-reference features and unify annotations of dialogue acts and dialogue states.
arXiv Detail & Related papers (2020-10-12T10:53:19Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually with reasoning over dialogue turns with the help of the back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.