Speaker Turn Modeling for Dialogue Act Classification
- URL: http://arxiv.org/abs/2109.05056v1
- Date: Fri, 10 Sep 2021 18:36:35 GMT
- Title: Speaker Turn Modeling for Dialogue Act Classification
- Authors: Zihao He, Leili Tavabi, Kristina Lerman, Mohammad Soleymani
- Abstract summary: We propose to integrate the turn changes in conversations among speakers when modeling Dialogue Act (DA) classification.
We learn conversation-invariant speaker turn embeddings to represent the speaker turns in a conversation.
Our model is able to capture the semantics from the dialogue content while accounting for different speaker turns in a conversation.
- Score: 9.124489616470001
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dialogue Act (DA) classification is the task of classifying utterances with
respect to the function they serve in a dialogue. Existing approaches to DA
classification model utterances without incorporating the turn changes among
speakers throughout the dialogue, thereby treating it no differently from
non-interactive written text. In this paper, we propose to integrate the turn
changes in conversations among speakers when modeling DAs. Specifically, we
learn conversation-invariant speaker turn embeddings to represent the speaker
turns in a conversation; the learned speaker turn embeddings are then merged
with the utterance embeddings for the downstream task of DA classification.
With this simple yet effective mechanism, our model is able to capture the
semantics from the dialogue content while accounting for different speaker
turns in a conversation. Validation on three benchmark public datasets
demonstrates superior performance of our model.
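The merging step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes that turn labels are binary and flip whenever the speaker changes, that "merged" means element-wise addition, and that random vectors can stand in for learned embeddings. The names `turn_labels`, `make_turn_embeddings`, and `merge` are hypothetical.

```python
import random


def turn_labels(speakers):
    """Map a speaker sequence to binary turn labels.

    Assumption: the label starts at 0 and flips each time the speaker
    differs from the previous utterance's speaker.
    """
    labels = []
    current = 0
    for i, speaker in enumerate(speakers):
        if i > 0 and speaker != speakers[i - 1]:
            current = 1 - current
        labels.append(current)
    return labels


def make_turn_embeddings(dim, seed=0):
    """Create one vector per turn label (0 and 1).

    Random values stand in for learned parameters. The same table is
    reused for every conversation, which is one reading of
    "conversation-invariant".
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(2)]


def merge(utterance_embeddings, labels, turn_embeddings):
    """Add the turn embedding for each label to its utterance embedding.

    Assumption: merging is element-wise addition.
    """
    return [
        [u + t for u, t in zip(utterance, turn_embeddings[label])]
        for utterance, label in zip(utterance_embeddings, labels)
    ]


if __name__ == "__main__":
    speakers = ["A", "A", "B", "A"]
    labels = turn_labels(speakers)
    table = make_turn_embeddings(dim=4)
    # Placeholder utterance embeddings; a real system would use an encoder.
    utterances = [[0.0] * 4 for _ in speakers]
    merged = merge(utterances, labels, table)
    print(labels)
    print(len(merged), len(merged[0]))
```

The merged vectors would then feed a DA classifier. Because the turn embedding table is indexed by turn label rather than by speaker identity, the same parameters apply to any conversation.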
Related papers
- WavChat: A Survey of Spoken Dialogue Models [66.82775211793547]
Recent advancements in spoken dialogue models, exemplified by systems like GPT-4o, have captured significant attention in the speech domain.
These advanced spoken dialogue models not only comprehend audio, music, and other speech-related features, but also capture stylistic and timbral characteristics in speech.
Despite the progress in spoken dialogue systems, there is a lack of comprehensive surveys that systematically organize and analyze these systems.
arXiv Detail & Related papers (2024-11-15T04:16:45Z)
- Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way.
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
arXiv Detail & Related papers (2023-09-18T11:03:55Z)
- Revisiting Conversation Discourse for Dialogue Disentanglement [88.3386821205896]
We propose enhancing dialogue disentanglement by taking full advantage of the dialogue discourse characteristics.
We develop a structure-aware framework to integrate the rich structural features for better modeling the conversational semantic context.
Our work has great potential to facilitate broader multi-party multi-thread dialogue applications.
arXiv Detail & Related papers (2023-06-06T19:17:47Z)
- Exploring Speaker-Related Information in Spoken Language Understanding for Better Speaker Diarization [7.673971221635779]
We propose methods to extract speaker-related information from semantic content in multi-party meetings.
Experiments on both AISHELL-4 and AliMeeting datasets show that our method achieves consistent improvements over acoustic-only speaker diarization systems.
arXiv Detail & Related papers (2023-05-22T11:14:19Z)
- Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
- "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations.
We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling.
Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z)
- Enhanced Speaker-aware Multi-party Multi-turn Dialogue Comprehension [43.352833140317486]
Multi-party multi-turn dialogue comprehension brings unprecedented challenges.
Most existing methods deal with dialogue contexts as plain texts.
We propose an enhanced speaker-aware model with masking attention and heterogeneous graph networks.
arXiv Detail & Related papers (2021-09-09T07:12:22Z)
- Content-Aware Speaker Embeddings for Speaker Diarisation [3.6398652091809987]
The content-aware speaker embeddings (CASE) approach is proposed.
CASE factorises automatic speech recognition (ASR) from speaker recognition to focus on modelling speaker characteristics.
CASE achieved a 17.8% relative speaker error rate reduction over conventional methods.
arXiv Detail & Related papers (2021-02-12T12:02:03Z)
- Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.
In existing retrieval-based multi-turn dialogue modeling, the pre-trained language models (PrLMs) used as encoders represent the dialogues coarsely.
We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)
- Contextual Dialogue Act Classification for Open-Domain Conversational Agents [10.576497782941697]
Classifying the general intent of the user utterance in a conversation, also known as Dialogue Act (DA), is a key step in Natural Language Understanding (NLU) for conversational agents.
We propose CDAC (Contextual Dialogue Act), a simple yet effective deep learning approach for contextual dialogue act classification.
We use transfer learning to adapt models trained on human-human conversations to predict dialogue acts in human-machine dialogues.
arXiv Detail & Related papers (2020-05-28T06:48:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.