Enhancing Semantic Understanding with Self-supervised Methods for
Abstractive Dialogue Summarization
- URL: http://arxiv.org/abs/2209.00278v1
- Date: Thu, 1 Sep 2022 07:51:46 GMT
- Title: Enhancing Semantic Understanding with Self-supervised Methods for
Abstractive Dialogue Summarization
- Authors: Hyunjae Lee, Jaewoong Yun, Hyunjin Choi, Seongho Joe, Youngjune L.
Gwon
- Abstract summary: We introduce self-supervised methods to compensate for shortcomings in training a dialogue summarization model.
Our principle is to detect incoherent information flows in pretext dialogue text to enhance BERT's ability to contextualize dialogue text representations.
- Score: 4.226093500082746
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contextualized word embeddings can lead to state-of-the-art
performance in natural language understanding. Recently, pre-trained deep
contextualized text encoders such as BERT have shown their potential for
improving natural language tasks, including abstractive summarization.
Existing approaches to dialogue summarization focus on incorporating a large
language model into the summarization task, trained on large-scale corpora
consisting of news articles rather than dialogues between multiple speakers.
In this paper, we introduce self-supervised methods to compensate for these
shortcomings when training a dialogue summarization model. Our principle is
to detect incoherent information flows in pretext dialogue text to enhance
BERT's ability to contextualize the dialogue text representations. We build
and fine-tune an abstractive dialogue summarization model on a shared
encoder-decoder architecture using the enhanced BERT. We empirically evaluate
our abstractive dialogue summarizer on the SAMSum corpus, a recently
introduced dataset with abstractive dialogue summaries. All of our methods
contribute improvements to the abstractive summaries as measured by ROUGE
scores. Through an extensive ablation study, we also present a sensitivity
analysis with respect to the critical model hyperparameters: the
probabilities of switching utterances and masking interlocutors.
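
The abstract ships no code, but the two pretext tasks it names (switching utterances and masking interlocutors) are easy to sketch. Below is a minimal, illustrative Python sketch, assuming a dialogue is a list of (speaker, utterance) pairs; the function name corrupt_dialogue and the default probabilities are ours, not the paper's.

```python
import random

MASK_TOKEN = "[MASK]"  # BERT's mask token, reused here to hide speaker names

def corrupt_dialogue(turns, p_switch=0.15, p_mask=0.15, rng=random):
    """Build a self-supervised pretext example from (speaker, utterance) pairs.

    Utterance switching: with probability p_switch per turn, swap its
    utterance with that of another randomly chosen turn, breaking the
    information flow. The per-turn labels (1 = took part in a swap) can
    supervise a classifier that detects the resulting incoherence.

    Interlocutor masking: with probability p_mask per turn, hide the speaker
    name behind MASK_TOKEN so the encoder must infer speakers from context.
    """
    turns = list(turns)
    n = len(turns)
    labels = [0] * n
    for i in range(n):
        if n > 1 and rng.random() < p_switch:
            j = rng.randrange(n)
            if j == i:
                continue
            # Swap only the utterances; the speakers stay in place.
            turns[i], turns[j] = ((turns[i][0], turns[j][1]),
                                  (turns[j][0], turns[i][1]))
            labels[i] = labels[j] = 1
    corrupted = [(MASK_TOKEN if rng.random() < p_mask else speaker, utterance)
                 for speaker, utterance in turns]
    return corrupted, labels

# Toy usage on a SAMSum-style chat (made-up example, not from the corpus):
dialogue = [("Hannah", "Hey, do you have Betty's number?"),
            ("Amanda", "Lemme check."),
            ("Hannah", "Thanks!")]
corrupted, labels = corrupt_dialogue(dialogue, p_switch=0.3, p_mask=0.3)
print(corrupted, labels)
```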
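The "shared encoder-decoder architecture using the enhanced BERT" maps naturally onto a BERT2BERT model with tied weights; whether the authors used Hugging Face transformers is not stated, so the following is only a plausible sketch. It substitutes vanilla bert-base-uncased for the pretext-enhanced checkpoint, which the abstract does not name.

```python
from transformers import BertTokenizerFast, EncoderDecoderModel

# Stand-in checkpoint; the paper would plug in its pretext-enhanced BERT here.
checkpoint = "bert-base-uncased"

tokenizer = BertTokenizerFast.from_pretrained(checkpoint)
# tie_encoder_decoder=True shares weights between the encoder and decoder,
# one reading of the abstract's "shared encoder-decoder architecture".
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    checkpoint, checkpoint, tie_encoder_decoder=True)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Before fine-tuning on SAMSum the generated text is meaningless; this only
# shows that the pieces fit together.
inputs = tokenizer("Hannah: Hey, do you have Betty's number? [MASK]: Lemme check.",
                   return_tensors="pt")
summary_ids = model.generate(inputs.input_ids, max_length=32)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```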
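The ROUGE evaluation the abstract reports on SAMSum can be reproduced with the rouge-score package; a small sketch with made-up strings standing in for a reference summary and a model output:

```python
from rouge_score import rouge_scorer

# Made-up reference/prediction pair in the SAMSum style.
reference = "Hannah asks Amanda for Betty's number."
prediction = "Hannah wants Betty's number and Amanda checks for it."

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
for name, s in scorer.score(reference, prediction).items():
    print(f"{name}: P={s.precision:.3f} R={s.recall:.3f} F1={s.fmeasure:.3f}")
```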
Related papers
- Increasing faithfulness in human-human dialog summarization with Spoken Language Understanding tasks [0.0]
We explore how incorporating task-related information can enhance the summarization process.
Results show that integrating models with task-related information improves summary accuracy, even with varying word error rates.
arXiv Detail & Related papers (2024-09-16T08:15:35Z)
- Instructive Dialogue Summarization with Query Aggregations [41.89962538701501]
We introduce instruction-finetuned language models to expand the capability set of dialogue summarization models.
We propose a three-step approach to synthesize high-quality query-based summarization triples.
By training a unified model called InstructDS on three summarization datasets with multi-purpose instructive triples, we expand the capability of dialogue summarization models.
arXiv Detail & Related papers (2023-10-17T04:03:00Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- STRUDEL: Structured Dialogue Summarization for Dialogue Comprehension [42.57581945778631]
Abstractive dialogue summarization has long been viewed as an important standalone task in natural language processing.
We propose a novel type of dialogue summarization task - STRUctured DiaLoguE Summarization.
We show that our STRUDEL dialogue comprehension model can significantly improve the dialogue comprehension performance of transformer encoder language models.
arXiv Detail & Related papers (2022-12-24T04:39:54Z)
- DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization [127.714919036388]
DIONYSUS is a pre-trained encoder-decoder model for summarizing dialogues in any new domain.
Our experiments show that DIONYSUS outperforms existing methods on six datasets.
arXiv Detail & Related papers (2022-12-20T06:21:21Z)
- Utterance Rewriting with Contrastive Learning in Multi-turn Dialogue [22.103162555263143]
We introduce contrastive learning and multi-task learning to jointly model the problem.
Our proposed model achieves state-of-the-art performance on several public datasets.
arXiv Detail & Related papers (2022-03-22T10:13:27Z)
- Controllable Abstractive Dialogue Summarization with Sketch Supervision [56.59357883827276]
Our model achieves state-of-the-art performance on SAMSum, the largest dialogue summarization corpus, with a ROUGE-L score as high as 50.79.
arXiv Detail & Related papers (2021-05-28T19:05:36Z)
- Structural Pre-training for Dialogue Comprehension [51.215629336320305]
We present SPIDER, Structural Pre-traIned DialoguE Reader, to capture dialogue exclusive features.
To simulate the dialogue-like features, we propose two training objectives in addition to the original LM objectives.
Experimental results on widely used dialogue benchmarks verify the effectiveness of the newly introduced self-supervised tasks.
arXiv Detail & Related papers (2021-05-23T15:16:54Z)
- I like fish, especially dolphins: Addressing Contradictions in Dialogue Modeling [104.09033240889106]
We introduce the DialoguE COntradiction DEtection task (DECODE) and a new conversational dataset containing both human-human and human-bot contradictory dialogues.
We then compare a structured utterance-based approach of using pre-trained Transformer models for contradiction detection with the typical unstructured approach.
arXiv Detail & Related papers (2020-12-24T18:47:49Z)
- Incorporating Commonsense Knowledge into Abstractive Dialogue Summarization via Heterogeneous Graph Networks [34.958271247099]
We present a novel multi-speaker dialogue summarizer to demonstrate how large-scale commonsense knowledge can facilitate dialogue understanding and summary generation.
We consider utterance and commonsense knowledge as two different types of data and design a Dialogue Heterogeneous Graph Network (D-HGN) for modeling both information.
arXiv Detail & Related papers (2020-10-20T05:44:55Z)
- Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization [72.54873655114844]
Text summarization is one of the most challenging and interesting problems in NLP.
This work proposes a multi-view sequence-to-sequence model by first extracting conversational structures of unstructured daily chats from different views to represent conversations.
Experiments on a large-scale dialogue summarization corpus demonstrated that our methods significantly outperformed previous state-of-the-art models via both automatic evaluations and human judgment.
arXiv Detail & Related papers (2020-10-04T20:12:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.