What Helps Transformers Recognize Conversational Structure? Importance
of Context, Punctuation, and Labels in Dialog Act Recognition
- URL: http://arxiv.org/abs/2107.02294v1
- Date: Mon, 5 Jul 2021 21:56:00 GMT
- Title: What Helps Transformers Recognize Conversational Structure? Importance
of Context, Punctuation, and Labels in Dialog Act Recognition
- Authors: Piotr Żelasko, Raghavendra Pappagari, Najim Dehak
- Abstract summary: We apply two pre-trained transformer models to structure a conversational transcript as a sequence of dialog acts.
We find that the inclusion of a broader conversational context helps disambiguate many dialog act classes.
A detailed analysis reveals the specific segmentation patterns observed when punctuation is absent.
- Score: 41.1669799542627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dialog acts can be interpreted as the atomic units of a conversation, more
fine-grained than utterances, characterized by a specific communicative
function. The ability to structure a conversational transcript as a sequence of
dialog acts -- dialog act recognition, including the segmentation -- is
critical for understanding dialog. We apply two pre-trained transformer models,
XLNet and Longformer, to this task in English and achieve strong results on
Switchboard Dialog Act and Meeting Recorder Dialog Act corpora with dialog act
segmentation error rates (DSER) of 8.4% and 14.2%. To understand the key
factors affecting dialog act recognition, we perform a comparative analysis of
models trained under different conditions. We find that the inclusion of a
broader conversational context helps disambiguate many dialog act classes,
especially those infrequent in the training data. The presence of punctuation
in the transcripts has a massive effect on the models' performance, and a
detailed analysis reveals specific segmentation patterns observed in its
absence. Finally, we find that the label set specificity does not affect dialog
act segmentation performance. These findings have significant practical
implications for spoken language understanding applications that depend heavily
on a good-quality segmentation being available.
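The abstract frames recognition as jointly segmenting the transcript and labeling each segment. Below is a minimal sketch of how a segmentation error rate like the reported DSER can be computed over a per-word boundary tagging; the B/I tagging scheme and the exact-match definition of DSER are illustrative assumptions, not the authors' released scoring code.

```python
# A minimal sketch of dialog act segmentation scoring, assuming DSER is the
# fraction of reference segments whose exact (start, end) word boundaries are
# not reproduced in the hypothesis -- an assumption for illustration only.

def segments(labels):
    """Convert a per-word tagging ('B' starts a segment, 'I' continues one)
    into a set of (start, end) word-index spans."""
    spans, start = set(), 0
    for i, tag in enumerate(labels):
        if tag == "B" and i > 0:
            spans.add((start, i))
            start = i
    spans.add((start, len(labels)))
    return spans

def dser(ref_labels, hyp_labels):
    """Fraction of reference segments not exactly matched by the hypothesis."""
    ref, hyp = segments(ref_labels), segments(hyp_labels)
    missed = [span for span in ref if span not in hyp]
    return len(missed) / len(ref)

# Example: the words "okay so what do you think" tagged as two dialog acts.
ref = ["B", "B", "I", "I", "I", "I"]   # "okay" | "so what do you think"
hyp = ["B", "I", "I", "B", "I", "I"]   # "okay so what" | "do you think"
print(dser(ref, hyp))  # 1.0 -- neither reference segment is recovered exactly
```

Under such an exact-match definition, one shifted boundary invalidates both segments it touches, which helps explain why a strong boundary cue like punctuation matters so much.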
Related papers
- Multi-turn Dialogue Comprehension from a Topic-aware Perspective [70.37126956655985]
This paper proposes to model multi-turn dialogues from a topic-aware perspective.
We use a dialogue segmentation algorithm to split a dialogue passage into topic-concentrated fragments in an unsupervised way (a sketch of this idea appears after this list).
We also present a novel model, Topic-Aware Dual-Attention Matching (TADAM) Network, which takes topic segments as processing elements.
arXiv Detail & Related papers (2023-09-18T11:03:55Z)
- SuperDialseg: A Large-scale Dataset for Supervised Dialogue Segmentation [55.82577086422923]
We provide a feasible definition of dialogue segmentation points with the help of document-grounded dialogues.
We release a large-scale supervised dataset called SuperDialseg, containing 9,478 dialogues.
We also provide a benchmark including 18 models across five categories for the dialogue segmentation task.
arXiv Detail & Related papers (2023-05-15T06:08:01Z)
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for
Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
- DialogueBERT: A Self-Supervised Learning based Dialogue Pre-training
Encoder [19.51263716065853]
We propose a novel contextual dialogue encoder (i.e. DialogueBERT) based on the popular pre-trained language model BERT.
Five self-supervised pre-training tasks are devised to learn the particularities of dialogue utterances.
DialogueBERT was pre-trained on 70 million dialogues from real scenarios, and then fine-tuned on three different downstream dialogue understanding tasks.
arXiv Detail & Related papers (2021-09-22T01:41:28Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented
Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
- Interview: A Large-Scale Open-Source Corpus of Media Dialog [11.28504775964698]
We introduce 'Interview': a large-scale (105K conversations) media dialog dataset collected from news interview transcripts.
Compared to existing large-scale proxies for conversational data, language models trained on our dataset exhibit better zero-shot out-of-domain performance.
'Interview' contains speaker role annotations for each turn, facilitating the development of engaging, responsive dialog systems.
arXiv Detail & Related papers (2020-04-07T02:44:50Z)
- Local Contextual Attention with Hierarchical Structure for Dialogue Act
Recognition [14.81680798372891]
We design a hierarchical model based on self-attention to capture intra-sentence and inter-sentence information.
Based on the finding that dialog length affects performance, we introduce a new dialog segmentation mechanism.
arXiv Detail & Related papers (2020-03-12T22:26:11Z)
- Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue
Representation Learning [50.5572111079898]
Multi-role dialogue understanding comprises a wide range of tasks, such as question answering, act classification, and dialogue summarization.
While dialogue corpora are abundantly available, labeled data for specific learning tasks can be highly scarce and expensive.
In this work, we investigate dialogue context representation learning with various types of unsupervised pretraining tasks.
arXiv Detail & Related papers (2020-02-27T04:36:52Z)
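As referenced in the first related paper above, topic-concentrated fragments can be found without supervision. The following is a generic TextTiling-style sketch of that idea, not the algorithm of any paper listed here: it hypothesizes a boundary wherever the lexical cosine similarity between adjacent windows of utterances drops below a threshold (the window size and threshold are illustrative assumptions).

```python
# A TextTiling-style sketch of unsupervised dialogue topic segmentation.
# Generic illustration only: boundaries are placed where the bag-of-words
# similarity between adjacent utterance windows drops below a threshold.
from collections import Counter
from math import sqrt

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def segment(utterances, window=2, threshold=0.35):
    """Return indices i such that a topic boundary is placed before utterance i.
    The window size and threshold are illustrative and data-dependent."""
    bows = [Counter(u.lower().split()) for u in utterances]
    boundaries = []
    for i in range(window, len(bows) - window + 1):
        left = sum(bows[i - window:i], Counter())
        right = sum(bows[i:i + window], Counter())
        if cosine(left, right) < threshold:
            boundaries.append(i)
    return boundaries

dialogue = [
    "did you watch the game last night",
    "yeah the game went to overtime",
    "by the way how is the new job",
    "the job is great the team is friendly",
]
print(segment(dialogue))  # [2]: a boundary before utterance 2, where game talk gives way to job talk
```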
This list was automatically generated from the titles and abstracts of the papers on this site.