SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization
- URL: http://arxiv.org/abs/2401.17597v1
- Date: Wed, 31 Jan 2024 04:50:00 GMT
- Title: SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization
- Authors: Sangwoo Cho, Kaiqiang Song, Chao Zhao, Xiaoyang Wang, Dong Yu
- Abstract summary: Multi-turn dialogues are characterized by their extended length and turn-taking among multiple speakers.
Traditional language models often overlook the distinct features of these dialogues by treating them as regular text.
We propose a speaker-enhanced pre-training method for long dialogue summarization.
- Score: 48.284512017469524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-turn dialogues are characterized by their extended length and
turn-taking among multiple speakers. Traditional language models often
overlook the distinct features of these dialogues by treating them as regular
text. In this paper, we propose a speaker-enhanced pre-training method for long
dialogue summarization, which leverages the inherent structure of multiple-turn
dialogues. To support our study, we curate a diverse dataset that includes
transcripts from real-world scenarios, movie or TV show transcripts, and
dialogues generated by a Large Language Model. We then pre-train with two
objectives: speaker change detection and masked utterance generation.
Experimental results from fine-tuned models demonstrate that our
model achieves state-of-the-art performance on downstream benchmarks with long
context, surpassing baseline models and demonstrating the effectiveness of our
approach. Our findings highlight the importance of curating pre-training
datasets that exhibit diversity and variations in length distribution to ensure
effective alignment with downstream datasets.
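To make the two pre-training objectives above concrete, the snippet below is a minimal, hypothetical sketch of how training examples might be derived from a raw transcript. The function, class, and token names (build_example, PretrainingExample, <mask_utt>) are illustrative assumptions, not the authors' released code.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple

MASK_TOKEN = "<mask_utt>"  # assumed placeholder token for a masked utterance


@dataclass
class PretrainingExample:
    input_text: str            # dialogue with one utterance replaced by MASK_TOKEN
    target_text: str           # the hidden utterance the model must regenerate
    speaker_change: List[int]  # 1 if the speaker differs from the previous turn, else 0


def build_example(dialogue: List[Tuple[str, str]], seed=None) -> PretrainingExample:
    """dialogue: list of (speaker, utterance) pairs for one conversation."""
    rng = random.Random(seed)

    # Objective 1: speaker change detection -- one binary label per turn.
    speakers = [spk for spk, _ in dialogue]
    speaker_change = [0] + [
        int(speakers[i] != speakers[i - 1]) for i in range(1, len(speakers))
    ]

    # Objective 2: masked utterance generation -- hide one whole utterance and
    # train the model to generate it from the surrounding dialogue context.
    mask_idx = rng.randrange(len(dialogue))
    turns = [
        f"{spk}: {MASK_TOKEN if i == mask_idx else utt}"
        for i, (spk, utt) in enumerate(dialogue)
    ]

    return PretrainingExample(
        input_text="\n".join(turns),
        target_text=dialogue[mask_idx][1],
        speaker_change=speaker_change,
    )


if __name__ == "__main__":
    chat = [
        ("Alice", "Did you watch the finale last night?"),
        ("Bob", "Not yet, so no spoilers please."),
        ("Bob", "I'm planning to catch up tonight."),
        ("Alice", "Fine, I'll keep quiet until then."),
    ]
    example = build_example(chat, seed=0)
    print(example.input_text)
    print("target:", example.target_text)
    print("speaker-change labels:", example.speaker_change)
```

In this reading, both objectives come for free from the transcript itself: speaker-change labels from consecutive speaker names, and generation targets from the utterance that was masked out.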
Related papers
- Investigating the Effects of Large-Scale Pseudo-Stereo Data and Different Speech Foundation Model on Dialogue Generative Spoken Language Model [47.67067056593085]
We develop a pipeline capable of transforming single-channel dialogue data into pseudo-stereo data.
This expanded our training dataset from a mere 2,000 to an impressive 17,600 hours.
The inclusion of this pseudo-stereo data has proven to be effective in improving the performance of spoken dialogue language models.
arXiv Detail & Related papers (2024-07-02T03:22:41Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Channel-aware Decoupling Network for Multi-turn Dialogue Comprehension [81.47133615169203]
We propose compositional learning for holistic interaction across utterances beyond the sequential contextualization from PrLMs.
We employ domain-adaptive training strategies to help the model adapt to the dialogue domains.
Experimental results show that our method substantially boosts the strong PrLM baselines in four public benchmark datasets.
arXiv Detail & Related papers (2023-01-10T13:18:25Z)
- Post-Training Dialogue Summarization using Pseudo-Paraphrasing [12.083992819138716]
We propose to post-train pretrained language models (PLMs) to rephrase from dialogue to narratives.
Comprehensive experiments show that our approach significantly improves vanilla PLMs on dialogue summarization.
arXiv Detail & Related papers (2022-04-28T13:42:19Z)
- "How Robust r u?": Evaluating Task-Oriented Dialogue Systems on Spoken Conversations [87.95711406978157]
This work presents a new benchmark on spoken task-oriented conversations.
We study multi-domain dialogue state tracking and knowledge-grounded dialogue modeling.
Our data set enables speech-based benchmarking of task-oriented dialogue systems.
arXiv Detail & Related papers (2021-09-28T04:51:04Z)
- Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Reading Comprehension [46.69961067676279]
Multi-party dialogue machine reading comprehension (MRC) poses a significant challenge because it involves multiple speakers in a single dialogue.
Previous models focus on how to incorporate speaker information flows using complex graph-based modules.
In this paper, we design two labour-free self- and pseudo-self-supervised prediction tasks on speaker and key-utterance to implicitly model the speaker information flows.
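The snippet below is a minimal, hypothetical sketch of the label-free speaker-prediction idea: mask the speaker of one utterance and ask a model to recover it from the dialogue itself. The function and token names (speaker_prediction_example, <who>) are assumptions for illustration, not that paper's implementation.

```python
from typing import List, Tuple


def speaker_prediction_example(dialogue: List[Tuple[str, str]], target_idx: int):
    """dialogue: (speaker, utterance) pairs; returns (context, candidates, answer)."""
    context_lines = []
    for i, (spk, utt) in enumerate(dialogue):
        shown = "<who>" if i == target_idx else spk   # hide the target speaker
        context_lines.append(f"{shown}: {utt}")
    candidates = sorted({spk for spk, _ in dialogue})  # candidate set comes for free
    answer = dialogue[target_idx][0]
    return "\n".join(context_lines), candidates, answer


ctx, cands, ans = speaker_prediction_example(
    [("Ann", "Where should we meet?"),
     ("Ben", "The usual cafe works for me."),
     ("Ann", "Great, see you at noon.")],
    target_idx=1,
)
print(ctx)
print("candidates:", cands, "answer:", ans)
```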
arXiv Detail & Related papers (2021-09-08T16:51:41Z)
- Filling the Gap of Utterance-aware and Speaker-aware Representation for Multi-turn Dialogue [76.88174667929665]
A multi-turn dialogue is composed of multiple utterances from two or more different speaker roles.
In existing retrieval-based multi-turn dialogue modeling, pre-trained language models (PrLMs) used as encoders represent dialogues only coarsely.
We propose a novel model to fill such a gap by modeling the effective utterance-aware and speaker-aware representations entailed in a dialogue history.
arXiv Detail & Related papers (2020-09-14T15:07:19Z)