DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization
- URL: http://arxiv.org/abs/2212.10018v2
- Date: Fri, 26 May 2023 17:29:01 GMT
- Title: DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization
- Authors: Yu Li, Baolin Peng, Pengcheng He, Michel Galley, Zhou Yu and Jianfeng Gao
- Abstract summary: DIONYSUS is a pre-trained encoder-decoder model for summarizing dialogues in any new domain.
Our experiments show that DIONYSUS outperforms existing methods on six datasets.
- Score: 127.714919036388
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dialogue summarization has recently garnered significant attention due to its
wide range of applications. However, existing methods for summarizing dialogues
have limitations because they do not take into account the inherent structure
of dialogue and rely heavily on labeled data, which can lead to poor
performance in new domains. In this work, we propose DIONYSUS (dynamic input
optimization in pre-training for dialogue summarization), a pre-trained
encoder-decoder model for summarizing dialogues in any new domain. To pre-train
DIONYSUS, we create two pseudo summaries for each dialogue example: one is
produced by a fine-tuned summarization model, and the other is a collection of
dialogue turns that convey important information. We then choose one of these
pseudo summaries based on the difference in information distribution across
different types of dialogues. This selected pseudo summary serves as the
objective for pre-training DIONYSUS using a self-supervised approach on a large
dialogue corpus. Our experiments show that DIONYSUS outperforms existing
methods on six datasets, as demonstrated by its ROUGE scores in zero-shot and
few-shot settings.
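The pseudo-summary selection described above lends itself to a small illustration. Below is a minimal sketch of choosing between the two candidates (a model-generated summary vs. a set of informative dialogue turns), assuming a simple unigram-overlap (ROUGE-1-style) heuristic stands in for the paper's information-distribution criterion; the function names, the top-k turn extraction, and the scoring rule are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a ROUGE-1-style overlap heuristic is assumed here as a
# stand-in for DIONYSUS's information-distribution criterion. All names are hypothetical.
from collections import Counter
from typing import List


def rouge1_f(candidate: str, reference: str) -> float:
    """Unigram F1 overlap between a candidate and a reference string."""
    cand, ref = Counter(candidate.lower().split()), Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def extract_principal_turns(turns: List[str], k: int = 3) -> str:
    """Greedily pick the k turns that best cover the rest of the dialogue."""
    scored = []
    for i, turn in enumerate(turns):
        rest = " ".join(t for j, t in enumerate(turns) if j != i)
        scored.append((rouge1_f(turn, rest), i))
    top = sorted(i for _, i in sorted(scored, reverse=True)[:k])
    return " ".join(turns[i] for i in top)  # keep original turn order


def choose_pseudo_summary(turns: List[str], generated_summary: str, k: int = 3) -> str:
    """Pick whichever pseudo summary covers more of the dialogue's content."""
    dialogue_text = " ".join(turns)
    extracted = extract_principal_turns(turns, k)
    gen_score = rouge1_f(generated_summary, dialogue_text)
    ext_score = rouge1_f(extracted, dialogue_text)
    return generated_summary if gen_score >= ext_score else extracted


if __name__ == "__main__":
    dialogue = [
        "A: Are we still meeting at 3pm to review the report?",
        "B: Yes, 3pm works. I'll bring the latest draft.",
        "A: Great, see you in the small conference room.",
    ]
    model_summary = "A and B agree to meet at 3pm to review the report draft."
    print(choose_pseudo_summary(dialogue, model_summary))
```

In the paper, the selected pseudo summary then serves as the self-supervised pre-training target; the sketch only shows the selection step, not the pre-training loop.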
Related papers
- OPAL: Ontology-Aware Pretrained Language Model for End-to-End
Task-Oriented Dialogue [40.62090743056549]
This paper presents an ontology-aware pretrained language model (OPAL) for end-to-end task-oriented dialogue (TOD).
Unlike chit-chat dialogue models, task-oriented dialogue models include at least two task-specific modules: a dialogue state tracker (DST) and a response generator (RG).
arXiv Detail & Related papers (2022-09-10T04:38:27Z) - GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
arXiv Detail & Related papers (2022-06-22T18:19:32Z) - Post-Training Dialogue Summarization using Pseudo-Paraphrasing [12.083992819138716]
We propose to post-train pretrained language models (PLMs) to rephrase dialogues into narratives.
Comprehensive experiments show that our approach significantly improves vanilla PLMs on dialogue summarization.
arXiv Detail & Related papers (2022-04-28T13:42:19Z) - DialogVED: A Pre-trained Latent Variable Encoder-Decoder Model for
Dialog Response Generation [80.45816053153722]
DialogVED introduces continuous latent variables into the enhanced encoder-decoder pre-training framework to increase the relevance and diversity of responses.
We conduct experiments on PersonaChat, DailyDialog, and DSTC7-AVSD benchmarks for response generation.
arXiv Detail & Related papers (2022-04-27T16:18:15Z) - In-Context Learning for Few-Shot Dialogue State Tracking [55.91832381893181]
We propose an in-context (IC) learning framework for few-shot dialogue state tracking (DST).
A large pre-trained language model (LM) takes a test instance and a few annotated examples as input, and directly decodes the dialogue states without any parameter updates.
This makes the LM more flexible and scalable compared to prior few-shot DST work when adapting to new domains and scenarios.
arXiv Detail & Related papers (2022-03-16T11:58:24Z) - An Exploratory Study on Long Dialogue Summarization: What Works and
What's Next [33.1899354772074]
We study long dialogue summarization by investigating three strategies to deal with the lengthy input problem and locate relevant information.
Our experimental results on three long dialogue datasets (QMSum, MediaSum, SummScreen) show that the retrieve-then-summarize pipeline models yield the best performance.
arXiv Detail & Related papers (2021-09-10T01:38:26Z) - Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source
Pretraining [10.750492932503649]
Training a large summarization model is generally infeasible because dialogue data with annotated summaries is scarce.
We propose a multi-source pretraining paradigm to better leverage the external summary data.
Our approach achieves competitive performance and generalizes well in different dialogue scenarios.
arXiv Detail & Related papers (2021-09-09T07:47:16Z) - RADDLE: An Evaluation Benchmark and Analysis Platform for Robust
Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z) - Dialogue-Based Relation Extraction [53.2896545819799]
We present the first human-annotated dialogue-based relation extraction (RE) dataset, DialogRE.
We argue that speaker-related information plays a critical role in the proposed task, based on an analysis of similarities and differences between dialogue-based and traditional RE tasks.
Experimental results demonstrate that a speaker-aware extension on the best-performing model leads to gains in both the standard and conversational evaluation settings.
arXiv Detail & Related papers (2020-04-17T03:51:57Z)