Language Model as an Annotator: Exploring DialoGPT for Dialogue Summarization
- URL: http://arxiv.org/abs/2105.12544v2
- Date: Fri, 28 May 2021 01:34:49 GMT
- Title: Language Model as an Annotator: Exploring DialoGPT for Dialogue Summarization
- Authors: Xiachong Feng, Xiaocheng Feng, Libo Qin, Bing Qin, Ting Liu
- Abstract summary: We show how DialoGPT, a pre-trained model for conversational response generation, can be developed as an unsupervised dialogue annotator.
We apply DialoGPT to label three types of features on two dialogue summarization datasets, SAMSum and AMI, and employ pre-trained and non-pre-trained models as our summarizers.
- Score: 29.887562761942114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current dialogue summarization systems usually encode the text with a number
of general semantic features (e.g., keywords and topics) to gain more powerful
dialogue modeling capabilities. However, these features are obtained via
open-domain toolkits that are dialog-agnostic or rely heavily on human
annotations. In this paper, we show how DialoGPT, a pre-trained model for
conversational response generation, can be developed as an unsupervised
dialogue annotator, which takes advantage of dialogue background knowledge
encoded in DialoGPT. We apply DialoGPT to label three types of features on two
dialogue summarization datasets, SAMSum and AMI, and employ pre-trained and
non-pre-trained models as our summarizers. Experimental results show that our
proposed method can obtain remarkable improvements on both datasets and
achieves new state-of-the-art performance on the SAMSum dataset.
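The annotator idea lends itself to a short sketch: DialoGPT's token-level loss provides an unsupervised salience signal, which is the kind of score the paper's keyword labeling builds on. Below is a minimal illustration assuming the HuggingFace transformers library and the microsoft/DialoGPT-small checkpoint; the top-k selection heuristic is a placeholder for demonstration, not the authors' exact procedure.

```python
# Minimal sketch: DialoGPT's per-token loss as an unsupervised salience
# signal, in the spirit of the paper's keyword annotation. Assumes the
# HuggingFace "microsoft/DialoGPT-small" checkpoint; top-k is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-small")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-small")
model.eval()

def keyword_candidates(utterance: str, top_k: int = 3):
    """Return the top_k hardest-to-predict tokens as candidate keywords."""
    ids = tokenizer(utterance, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab)
    # Token t is predicted from tokens < t, so shift logits and targets;
    # the first token has no left context and is skipped.
    nll = torch.nn.functional.cross_entropy(
        logits[0, :-1], ids[0, 1:], reduction="none"
    )
    ranked = sorted(zip(ids[0, 1:].tolist(), nll.tolist()),
                    key=lambda pair: pair[1], reverse=True)
    return [tokenizer.decode([tok]).strip() for tok, _ in ranked[:top_k]]

print(keyword_candidates("Amanda: I baked cookies. Do you want some?"))
```

Tokens the dialogue model finds surprising tend to be content-bearing, which is why a response-generation model can double as an annotator without any labeled data.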
Related papers
- SPECTRUM: Speaker-Enhanced Pre-Training for Long Dialogue Summarization [48.284512017469524]
Multi-turn dialogues are characterized by their extended length and turn-taking between multiple speakers.
Traditional language models often overlook the distinct features of these dialogues by treating them as regular text.
We propose a speaker-enhanced pre-training method for long dialogue summarization.
arXiv Detail & Related papers (2024-01-31T04:50:00Z)
- SPACE-2: Tree-Structured Semi-Supervised Contrastive Pre-training for Task-Oriented Dialog Understanding [68.94808536012371]
We propose a tree-structured pre-trained conversation model, which learns dialog representations from limited labeled dialogs and large-scale unlabeled dialog corpora.
Our method can achieve new state-of-the-art results on the DialoGLUE benchmark consisting of seven datasets and four popular dialog understanding tasks.
arXiv Detail & Related papers (2022-09-14T13:42:50Z)
- GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z)
- Learning Locality and Isotropy in Dialogue Modeling [28.743212772593335]
We propose a simple method for dialogue representation calibration, namely SimDRC, to build isotropic and conversational feature spaces.
Experimental results show that our approach significantly outperforms the current state-of-the-art models on three dialogue tasks.
arXiv Detail & Related papers (2022-05-29T06:48:53Z)
- DialogZoo: Large-Scale Dialog-Oriented Task Learning [52.18193690394549]
We aim to build a unified foundation model which can solve massive diverse dialogue tasks.
To achieve this goal, we first collect a large-scale well-labeled dialogue dataset from 73 publicly available datasets.
arXiv Detail & Related papers (2022-05-25T11:17:16Z)
- Post-Training Dialogue Summarization using Pseudo-Paraphrasing [12.083992819138716]
We propose to post-train pretrained language models (PLMs) to rephrase dialogues into narratives.
Comprehensive experiments show that our approach significantly improves vanilla PLMs on dialogue summarization.
arXiv Detail & Related papers (2022-04-28T13:42:19Z)
- Back to the Future: Bidirectional Information Decoupling Network for Multi-turn Dialogue Modeling [80.51094098799736]
We propose Bidirectional Information Decoupling Network (BiDeN) as a universal dialogue encoder.
BiDeN explicitly incorporates both the past and future contexts and can be generalized to a wide range of dialogue-related tasks.
Experimental results on datasets of different downstream tasks demonstrate the universality and effectiveness of our BiDeN.
arXiv Detail & Related papers (2022-04-18T03:51:46Z)
- GALAXY: A Generative Pre-trained Model for Task-Oriented Dialog with Semi-Supervised Learning and Explicit Policy Injection [36.77204909711832]
We propose a novel pre-trained dialog model that explicitly learns dialog policy from limited labeled dialogs and large-scale unlabeled dialog corpora.
Specifically, we introduce a dialog act prediction task for policy optimization during pre-training and employ a consistency regularization term to refine the learned representation (a generic sketch of such a consistency term appears after this list).
Empirical results show that GALAXY substantially improves the performance of task-oriented dialog systems.
arXiv Detail & Related papers (2021-11-29T15:24:36Z)
- Variational Hierarchical Dialog Autoencoder for Dialog State Tracking Data Augmentation [59.174903564894954]
In this work, we extend generative data augmentation to the task of dialog state tracking for goal-oriented dialogs.
We propose the Variational Hierarchical Dialog Autoencoder (VHDA) for modeling the complete aspects of goal-oriented dialogs.
Experiments on various dialog datasets show that our model improves the downstream dialog trackers' robustness via generative data augmentation.
arXiv Detail & Related papers (2020-01-23T15:34:56Z)
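As noted in the GALAXY entry above, a consistency regularization term of the kind it describes is often implemented as a symmetric KL penalty between two dropout-perturbed forward passes (the R-Drop pattern). The following PyTorch sketch shows only that generic pattern; `model` and `inputs` are placeholders, and this is not GALAXY's actual training code.

```python
# Generic consistency regularization sketch (R-Drop pattern), assumed
# rather than taken from GALAXY's released code.
import torch
import torch.nn.functional as F

def symmetric_kl_consistency(model: torch.nn.Module,
                             inputs: torch.Tensor) -> torch.Tensor:
    """Symmetric KL between two dropout-perturbed forward passes.

    `model` must be in train mode so its dropout layers are active;
    both `model` and `inputs` are placeholders for illustration.
    """
    logits_a = model(inputs)  # first stochastic forward pass
    logits_b = model(inputs)  # second pass differs because of dropout
    log_p = F.log_softmax(logits_a, dim=-1)
    log_q = F.log_softmax(logits_b, dim=-1)
    kl_pq = F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
    kl_qp = F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
    return 0.5 * (kl_pq + kl_qp)

# Typical use: total = task_loss + alpha * symmetric_kl_consistency(model, x)
```

The penalty pushes the model toward predictions that are stable under dropout noise, which is how such a term refines the learned representation alongside the main task loss.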