Integrating Dialog History into End-to-End Spoken Language Understanding Systems
- URL: http://arxiv.org/abs/2108.08405v1
- Date: Wed, 18 Aug 2021 22:24:11 GMT
- Title: Integrating Dialog History into End-to-End Spoken Language Understanding
Systems
- Authors: Jatin Ganhotra, Samuel Thomas, Hong-Kwang J. Kuo, Sachindra Joshi,
George Saon, Zoltán Tüske, Brian Kingsbury
- Abstract summary: We investigate the importance of dialog history and how it can be effectively integrated into end-to-end spoken language understanding systems.
While processing a spoken utterance, our proposed RNN transducer (RNN-T) based SLU model has access to its dialog history in the form of decoded transcripts and SLU labels of previous turns.
We evaluate our approach on a recently released spoken dialog data set, the HarperValleyBank corpus.
- Score: 37.08876551722831
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: End-to-end spoken language understanding (SLU) systems that process
human-human or human-computer interactions are often context independent: they
process each turn of a conversation in isolation. Spoken conversations, on the
other hand, are highly context dependent, and dialog history contains useful
information that can improve the processing of each conversational turn. In
this paper, we investigate the importance of dialog history and how it can be
effectively integrated into end-to-end SLU systems. While processing a spoken
utterance, our proposed RNN transducer (RNN-T) based SLU model has access to
its dialog history in the form of decoded transcripts and SLU labels of
previous turns. We encode the dialog history as BERT embeddings, and use them
as an additional input to the SLU model along with the speech features for the
current utterance. We evaluate our approach on a recently released spoken
dialog data set, the HarperValleyBank corpus. We observe significant
improvements of 8% on dialog action recognition and 30% on caller intent
recognition over a competitive context-independent end-to-end baseline system.
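The abstract says the dialog history is encoded as embeddings and supplied to the SLU model as an additional input alongside the speech features of the current utterance. The sketch below illustrates one possible way such a fusion could look. It is not the authors' implementation: the mean pooling over previous turns, the frame-wise concatenation, the function names `mean_pool` and `fuse_history`, and the toy vectors standing in for BERT embeddings and acoustic features are all assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def mean_pool(turn_embeddings: List[Vector], dim: int) -> Vector:
    """Average per-turn embeddings into one history vector.

    Returns a zero vector when there is no history (first turn of a dialog).
    Mean pooling is an assumption; the abstract does not say how turns are combined.
    """
    if not turn_embeddings:
        return [0.0] * dim
    n = len(turn_embeddings)
    return [sum(vec[i] for vec in turn_embeddings) / n for i in range(dim)]


def fuse_history(speech_frames: List[Vector], history: Vector) -> List[Vector]:
    """Append the same history vector to every acoustic frame (assumed fusion)."""
    return [frame + history for frame in speech_frames]


# Toy stand-ins: two previous turns with 3-dimensional "embeddings" and
# four acoustic frames with 2-dimensional "speech features".
previous_turns = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
speech_frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]

history = mean_pool(previous_turns, dim=3)
fused = fuse_history(speech_frames, history)

print(len(fused), len(fused[0]))  # number of frames, fused feature dimension
```

In this sketch every frame of the current utterance carries the same context vector, so a downstream encoder sees the history at each time step. A real system would replace the toy vectors with actual text embeddings of the previous turns' transcripts and SLU labels and with real acoustic features.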
Related papers
- Are cascade dialogue state tracking models speaking out of turn in spoken dialogues? [1.786898113631979]
This paper presents a comprehensive analysis of the errors made by state-of-the-art systems in complex settings such as Dialogue State Tracking.
Based on spoken MultiWoz, we identify that errors on non-categorical slots' values are essential to address in order to bridge the gap between spoken and chat-based dialogue systems.
arXiv Detail & Related papers (2023-11-03T08:45:22Z)
- Adapting Text-based Dialogue State Tracker for Spoken Dialogues [20.139351605832665]
We describe our engineering effort in building a highly successful model that participated in the speech-aware dialogue systems technology challenge track in DSTC11.
Our model consists of three major modules: (1) automatic speech recognition error correction to bridge the gap between the spoken and the text utterances, (2) a text-based dialogue system (D3ST) for estimating the slots and values using slot descriptions, and (3) post-processing to correct errors in the estimated slot values.
arXiv Detail & Related papers (2023-08-29T06:27:58Z)
- Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History [30.20353302347147]
We propose a novel model architecture that learns dialog context to jointly predict the intent, dialog act, speaker role, and emotion for the spoken utterance.
Our experiments show that our joint model achieves similar results to task-specific classifiers.
arXiv Detail & Related papers (2023-05-01T16:26:18Z)
- End-to-end Spoken Conversational Question Answering: Task, Dataset and Model [92.18621726802726]
In spoken question answering, the systems are designed to answer questions from contiguous text spans within the related speech transcripts.
We propose a new Spoken Conversational Question Answering task (SCQA), aiming at enabling the systems to model complex dialogue flows.
Our main objective is to build the system to deal with conversational questions based on the audio recordings, and to explore the plausibility of providing more cues from different modalities with systems in information gathering.
arXiv Detail & Related papers (2022-04-29T17:56:59Z)
- HybriDialogue: An Information-Seeking Dialogue Dataset Grounded on Tabular and Textual Data [87.67278915655712]
We present a new dialogue dataset, HybriDialogue, which consists of crowdsourced natural conversations grounded on both Wikipedia text and tables.
The conversations are created through the decomposition of complex multihop questions into simple, realistic multiturn dialogue interactions.
arXiv Detail & Related papers (2022-04-28T00:52:16Z)
- UniDS: A Unified Dialogue System for Chit-Chat and Task-oriented Dialogues [59.499965460525694]
We propose a unified dialogue system (UniDS) that handles both chit-chat and task-oriented dialogues.
We design a unified dialogue data schema that is compatible with both chit-chat and task-oriented dialogues.
We train UniDS with mixed dialogue data from a pretrained chit-chat dialogue model.
arXiv Detail & Related papers (2021-10-15T11:56:47Z)
- "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations.
We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling.
Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z)
- A Context-Aware Hierarchical BERT Fusion Network for Multi-turn Dialog Act Detection [6.361198391681688]
CaBERT-SLU is a context-aware hierarchical BERT fusion network.
Our approach reaches new state-of-the-art (SOTA) performance on two complex multi-turn dialogue datasets.
arXiv Detail & Related papers (2021-09-03T02:00:03Z)
- Domain State Tracking for a Simplified Dialogue System [3.962145079528281]
We present DoTS, a task-oriented dialogue system that uses a simplified input context instead of the entire dialogue history.
DoTS improves the inform rate and success rate by 1.09 points and 1.24 points, respectively, compared to the previous state-of-the-art model on MultiWOZ.
arXiv Detail & Related papers (2021-03-11T13:00:54Z)
- Rethinking Dialogue State Tracking with Reasoning [76.0991910623001]
This paper proposes to track dialogue states gradually with reasoning over dialogue turns with the help of the back-end data.
Empirical results demonstrate that our method significantly outperforms the state-of-the-art methods by 38.6% in terms of joint belief accuracy for MultiWOZ 2.1.
arXiv Detail & Related papers (2020-05-27T02:05:33Z)