Are cascade dialogue state tracking models speaking out of turn in
spoken dialogues?
- URL: http://arxiv.org/abs/2311.04922v1
- Date: Fri, 3 Nov 2023 08:45:22 GMT
- Title: Are cascade dialogue state tracking models speaking out of turn in
spoken dialogues?
- Authors: Lucas Druart (LIA), Léo Jacqmin (LIS), Benoît Favre (LIS), Lina
Maria Rojas-Barahona, Valentin Vielzeuf
- Abstract summary: This paper proposes a comprehensive analysis of the errors of state-of-the-art systems in complex settings such as Dialogue State Tracking.
Based on spoken MultiWoz, we identify that errors on non-categorical slots' values are essential to address in order to bridge the gap between spoken and chat-based dialogue systems.
- Score: 1.786898113631979
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Task-Oriented Dialogue (TOD) systems, correctly updating the system's
understanding of the user's needs is key to a smooth interaction. Traditionally,
TOD systems are composed of several modules that interact with one another.
While each of these components is the focus of active research communities,
their behavior in interaction can be overlooked. This paper proposes a
comprehensive analysis of the errors of state-of-the-art systems in complex
settings such as Dialogue State Tracking, which depends heavily on the dialogue
context. Based on spoken MultiWoz, we identify that errors on non-categorical
slots' values are essential to address in order to bridge the gap between
spoken and chat-based dialogue systems. We explore potential solutions to
improve transcriptions and help dialogue state tracking generative models
correct such errors.
Related papers
- WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain.
These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech.
Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
- Multimodal Contextual Dialogue Breakdown Detection for Conversational AI Models [1.4199474167684119]
We introduce a Multimodal Contextual Dialogue Breakdown (MultConDB) model.
This model significantly outperforms other known best models by achieving an F1 of 69.27.
arXiv Detail & Related papers (2024-04-11T23:09:18Z)
- Adapting Text-based Dialogue State Tracker for Spoken Dialogues [20.139351605832665]
We describe our engineering effort in building a highly successful model that participated in the speech-aware dialogue systems technology challenge track in DSTC11.
Our model consists of three major modules: (1) automatic speech recognition error correction to bridge the gap between the spoken and the text utterances, (2) text-based dialogue system (D3ST) for estimating the slots and values using slot descriptions, and (3) post-processing for recovering the error of the estimated slot value.
arXiv Detail & Related papers (2023-08-29T06:27:58Z)
- Act-Aware Slot-Value Predicting in Multi-Domain Dialogue State Tracking [5.816391291790977]
Dialogue state tracking (DST) aims to track human-machine interactions and generate state representations for managing the dialogue.
Recent advances in machine reading comprehension make it possible to predict both categorical and non-categorical types of slots for dialogue state tracking.
We formulate and incorporate dialogue acts, and leverage recent advances in machine reading comprehension to predict both categorical and non-categorical types of slots for dialogue state tracking.
arXiv Detail & Related papers (2022-08-04T05:18:30Z)
- A Chit-Chats Enhanced Task-Oriented Dialogue Corpora for Fuse-Motive Conversation Systems [9.541995537438394]
We release a multi-turn dialogue dataset called CCET (Chinese Chat-Enhanced-Task).
We propose an approach for formalizing fuse-motive dialogues, along with several evaluation metrics for TOD sessions that integrate CC utterances.
arXiv Detail & Related papers (2022-05-12T05:43:18Z)
- HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data [87.67278915655712]
We present a new dialogue dataset, HybriDialogue, which consists of crowdsourced natural conversations grounded on both Wikipedia text and tables.
The conversations are created through the decomposition of complex multihop questions into simple, realistic multiturn dialogue interactions.
arXiv Detail & Related papers (2022-04-28T00:52:16Z)
- UniDS: A Unified Dialogue System for Chit-Chat and Task-oriented Dialogues [59.499965460525694]
We propose a unified dialogue system (UniDS) with the two aforementioned skills.
We design a unified dialogue data schema, compatible with both chit-chat and task-oriented dialogues.
We train UniDS on mixed dialogue data, starting from a pretrained chit-chat dialogue model.
arXiv Detail & Related papers (2021-10-15T11:56:47Z)
- Structural Modeling for Dialogue Disentanglement [43.352833140317486]
Tangled multi-party dialogue context leads to challenges for dialogue reading comprehension.
This work designs a novel model to disentangle multi-party history into threads, by taking dialogue structure features into account.
arXiv Detail & Related papers (2021-10-15T11:28:43Z)
- "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations.
We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling.
Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually by reasoning over dialogue turns with the help of the back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.