Enhancing Semantic Understanding with Self-supervised Methods for
Abstractive Dialogue Summarization
- URL: http://arxiv.org/abs/2209.00278v1
- Date: Thu, 1 Sep 2022 07:51:46 GMT
- Title: Enhancing Semantic Understanding with Self-supervised Methods for
Abstractive Dialogue Summarization
- Authors: Hyunjae Lee, Jaewoong Yun, Hyunjin Choi, Seongho Joe, Youngjune L.
Gwon
- Abstract summary: We introduce self-supervised methods to compensate for shortcomings in training a dialogue summarization model.
Our principle is to detect incoherent information flows in pretext dialogue text to enhance BERT's ability to contextualize dialogue text representations.
- Score: 4.226093500082746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextualized word embeddings can lead to state-of-the-art
performance in natural language understanding. Recently, pre-trained deep
contextualized text encoders such as BERT have shown their potential for
improving natural language tasks, including abstractive summarization.
Existing approaches to dialogue summarization focus on incorporating a large
language model into the summarization task, trained on large-scale corpora
consisting of news articles rather than dialogues between multiple speakers.
In this paper, we introduce self-supervised methods to compensate for these
shortcomings when training a dialogue summarization model. Our principle is
to detect incoherent information flows in pretext dialogue text to enhance
BERT's ability to contextualize the dialogue text representations. We build
and fine-tune an abstractive dialogue summarization model on a shared
encoder-decoder architecture using the enhanced BERT. We empirically evaluate
our abstractive dialogue summarizer on the SAMSum corpus, a recently
introduced dataset with abstractive dialogue summaries. All of our methods
contribute improvements to the abstractive summaries as measured by ROUGE
scores. Through an extensive ablation study, we also present a sensitivity
analysis with respect to the critical model hyperparameters: the
probabilities of switching utterances and masking interlocutors.
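
The abstract ships no code, but the two pretext tasks it names (switching utterances and masking interlocutors) are easy to sketch. Below is a minimal, illustrative Python sketch, assuming a dialogue is a list of (speaker, utterance) pairs; the function name corrupt_dialogue and the default probabilities are ours, not the paper's.

```python
import random

MASK_TOKEN = "[MASK]"  # BERT's mask token, reused here to hide speaker names

def corrupt_dialogue(turns, p_switch=0.15, p_mask=0.15, rng=random):
    """Build a self-supervised pretext example from (speaker, utterance) pairs.

    Utterance switching: with probability p_switch per turn, swap its
    utterance with that of another randomly chosen turn, breaking the
    information flow. The per-turn labels (1 = took part in a swap) can
    supervise a classifier that detects the resulting incoherence.

    Interlocutor masking: with probability p_mask per turn, hide the speaker
    name behind MASK_TOKEN so the encoder must infer speakers from context.
    """
    turns = list(turns)
    n = len(turns)
    labels = [0] * n
    for i in range(n):
        if n > 1 and rng.random() < p_switch:
            j = rng.randrange(n)
            if j == i:
                continue
            # Swap only the utterances; the speakers stay in place.
            turns[i], turns[j] = ((turns[i][0], turns[j][1]),
                                  (turns[j][0], turns[i][1]))
            labels[i] = labels[j] = 1
    corrupted = [(MASK_TOKEN if rng.random() < p_mask else speaker, utterance)
                 for speaker, utterance in turns]
    return corrupted, labels

# Toy usage on a SAMSum-style chat (made-up example, not from the corpus):
dialogue = [("Hannah", "Hey, do you have Betty's number?"),
            ("Amanda", "Lemme check."),
            ("Hannah", "Thanks!")]
corrupted, labels = corrupt_dialogue(dialogue, p_switch=0.3, p_mask=0.3)
print(corrupted, labels)
```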
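The "shared encoder-decoder architecture using the enhanced BERT" maps naturally onto a BERT2BERT model with tied weights; whether the authors used Hugging Face transformers is not stated, so the following is only a plausible sketch. It substitutes vanilla bert-base-uncased for the pretext-enhanced checkpoint, which the abstract does not name.

```python
from transformers import BertTokenizerFast, EncoderDecoderModel

# Stand-in checkpoint; the paper would plug in its pretext-enhanced BERT here.
checkpoint = "bert-base-uncased"

tokenizer = BertTokenizerFast.from_pretrained(checkpoint)
# tie_encoder_decoder=True shares weights between the encoder and decoder,
# one reading of the abstract's "shared encoder-decoder architecture".
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    checkpoint, checkpoint, tie_encoder_decoder=True)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Before fine-tuning on SAMSum the generated text is meaningless; this only
# shows that the pieces fit together.
inputs = tokenizer("Hannah: Hey, do you have Betty's number? [MASK]: Lemme check.",
                   return_tensors="pt")
summary_ids = model.generate(inputs.input_ids, max_length=32)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```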
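The ROUGE evaluation the abstract reports on SAMSum can be reproduced with the rouge-score package; a small sketch with made-up strings standing in for a reference summary and a model output:

```python
from rouge_score import rouge_scorer

# Made-up reference/prediction pair in the SAMSum style.
reference = "Hannah asks Amanda for Betty's number."
prediction = "Hannah wants Betty's number and Amanda checks for it."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
for name, s in scorer.score(reference, prediction).items():
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")
```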
Related papers
- Increasing faithfulness in human-human dialog summarization with Spoken Language Understanding tasks [0.0]
We explore how incorporating task-related information can enhance the summarization process.
Results show that integrating models with task-related information improves summary accuracy, even with varying word error rates.
arXiv Detail & Related papers (2024-09-16T08:15:35Z)
- Instructive Dialogue Summarization with Query Aggregations [41.89962538701501]
We introduce instruction-finetuned language models to expand the capability set of dialogue summarization models.
We propose a three-step approach to synthesize high-quality query-based summarization triples.
By training a unified model called InstructDS on three summarization datasets with multi-purpose instructive triples, we expand the capability of dialogue summarization models.
arXiv Detail & Related papers (2023-10-17T04:03:00Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension [42.57581945778631]
Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing.
We propose a novel type of dialogue summarization task - STRUctured DiaLoguE Summarization.
We show that our STRUDEL dialogue comprehension model can significantly improve the dialogue comprehension performance of transformer encoder language models.
arXiv Detail & Related papers (2022-12-24T04:39:54Z)
- DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization [127.714919036388]
DIONYSUS is a pre-trained encoder-decoder model for summarizing dialogues in any new domain.
Our experiments show that DIONYSUS outperforms existing methods on six datasets.
arXiv Detail & Related papers (2022-12-20T06:21:21Z)
- Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue [22.103162555263143]
We introduce contrastive learning and multi-task learning to jointly model the problem.
Our proposed model achieves state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2022-03-22T10:13:27Z)
- Controllable Abstractive Dialogue Summarization with Sketch Supervision [56.59357883827276]
Our model achieves state-of-the-art performance on SAMSum, the largest dialogue summarization corpus, with a ROUGE-L score as high as 50.79.
arXiv Detail & Related papers (2021-05-28T19:05:36Z)
- Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)
- I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling [104.09033240889106]
We introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues.
We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach.
arXiv Detail & Related papers (2020-12-24T18:47:49Z)
- Incorporating Commonsense Knowledge into Abstractive Dialogue Summarization via Heterogeneous Graph Networks [34.958271247099]
We present a novel multi-speaker dialogue summarizer to demonstrate how large-scale commonsense knowledge can facilitate dialogue understanding and summary generation.
We consider utterance and commonsense knowledge as two different types of data and design a Dialogue Heterogeneous Graph Network (D-HGN) for modeling both information.
arXiv Detail & Related papers (2020-10-20T05:44:55Z)
- Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization [72.54873655114844]
Text summarization is one of the most challenging and interesting problems in NLP.
This work proposes a multi-view sequence-to-sequence model by first extracting conversational structures of unstructured daily chats from different views to represent conversations.
Experiments on a large-scale dialogue summarization corpus demonstrated that our methods significantly outperformed previous state-of-the-art models via both automatic evaluations and human judgment.
arXiv Detail & Related papers (2020-10-04T20:12:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.