Multi-Stage Pre-training Enhanced by ChatGPT for Multi-Scenario
Multi-Domain Dialogue Summarization
- URL: http://arxiv.org/abs/2310.10285v1
- Date: Mon, 16 Oct 2023 11:16:07 GMT
- Title: Multi-Stage Pre-training Enhanced by ChatGPT for Multi-Scenario
Multi-Domain Dialogue Summarization
- Authors: Weixiao Zhou, Gengyao Li, Xianfu Cheng, Xinnian Liang, Junnan Zhu,
Feifei Zhai and Zhoujun Li
- Abstract summary: We propose a new pre-trained model specifically designed for multi-scenario multi-domain dialogue summarization.
It adopts a multi-stage pre-training strategy to reduce the gap between the pre-training objective and fine-tuning objective.
- Score: 20.60018442168502
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dialogue summarization involves a wide range of scenarios and domains.
However, existing methods generally only apply to specific scenarios or
domains. In this study, we propose a new pre-trained model specifically
designed for multi-scenario multi-domain dialogue summarization. It adopts a
multi-stage pre-training strategy to reduce the gap between the pre-training
objective and fine-tuning objective. Specifically, we first conduct
domain-aware pre-training using large-scale multi-scenario multi-domain
dialogue data to enhance the adaptability of our pre-trained model. Then, we
conduct task-oriented pre-training using large-scale multi-scenario
multi-domain "dialogue-summary" parallel data annotated by ChatGPT to enhance
the dialogue summarization ability of our pre-trained model. Experimental
results on three dialogue summarization datasets from different scenarios and
domains indicate that our pre-trained model significantly outperforms previous
state-of-the-art models in full fine-tuning, zero-shot, and few-shot settings.
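Below is a minimal, illustrative sketch of the two-stage schedule described in the abstract. It is not the authors' released code: it assumes a generic Hugging Face encoder-decoder (facebook/bart-base), uses plain dialogue reconstruction as a stand-in for the domain-aware objective, and replaces the large-scale corpora and ChatGPT-annotated summaries with a couple of invented toy examples.

```python
# Hedged sketch of the multi-stage pre-training recipe from the abstract.
# Assumptions (not from the paper): bart-base backbone, AdamW, reconstruction
# as the stage-1 objective, toy in-memory batches instead of real corpora.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)


def train_stage(batches, build_inputs, epochs=1):
    # One pre-training stage: iterate over raw batches, build model inputs,
    # and optimize the standard seq2seq cross-entropy loss.
    model.train()
    for _ in range(epochs):
        for batch in batches:
            enc = build_inputs(batch)
            loss = model(**enc).loss
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()


def dialogue_reconstruction_inputs(dialogues):
    # Stage 1 (domain-aware): unlabeled multi-scenario, multi-domain dialogues.
    # Here the target is simply the dialogue itself; the authors' actual
    # domain-aware objectives may differ.
    enc = tokenizer(dialogues, truncation=True, padding=True, return_tensors="pt")
    enc["labels"] = enc["input_ids"].clone()
    return enc


def summarization_inputs(pairs):
    # Stage 2 (task-oriented): "dialogue-summary" pairs whose reference
    # summaries were collected via ChatGPT annotation.
    dialogues, chatgpt_summaries = zip(*pairs)
    enc = tokenizer(list(dialogues), truncation=True, padding=True, return_tensors="pt")
    enc["labels"] = tokenizer(
        list(chatgpt_summaries), truncation=True, padding=True, return_tensors="pt"
    ).input_ids
    return enc


# Toy batches for illustration; in practice these are large-scale corpora.
stage1_batches = [["A: Hi, can I book a table for tonight?\nB: Sure, for how many people?"]]
stage2_batches = [[(
    "A: My laptop will not boot.\nB: Try holding the power button for ten seconds.",
    "The agent suggests a hard reset for a laptop that will not boot.",
)]]

train_stage(stage1_batches, dialogue_reconstruction_inputs)  # domain-aware stage
train_stage(stage2_batches, summarization_inputs)            # task-oriented stage
# The resulting checkpoint is then fine-tuned (full, few-shot, or zero-shot)
# on each target dialogue summarization dataset.
```

The point the sketch tries to convey is that the second stage already uses the downstream input/output format (dialogue in, summary out), which is how the approach narrows the gap between the pre-training and fine-tuning objectives that the abstract highlights.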
Related papers
- Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition [10.36399200974439]
We introduce a novel method combining multi-modal and multi-task unsupervised pre-training with a translation-based supervised mid-training approach.
We empirically demonstrate that such a multi-stage approach leads to relative word error rate (WER) improvements of up to 38.45% over baselines on both Librispeech and SUPERB.
arXiv Detail & Related papers (2024-03-28T20:23:39Z)
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- Stabilized In-Context Learning with Pre-trained Language Models for Few Shot Dialogue State Tracking [57.92608483099916]
Large pre-trained language models (PLMs) have shown impressive unaided performance across many NLP tasks.
For more complex tasks such as dialogue state tracking (DST), designing prompts that reliably convey the desired intent is nontrivial.
We introduce a saliency model to limit dialogue text length, allowing us to include more exemplars per query.
arXiv Detail & Related papers (2023-02-12T15:05:10Z)
- DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization [127.714919036388]
DIONYSUS is a pre-trained encoder-decoder model for summarizing dialogues in any new domain.
Our experiments show that DIONYSUS outperforms existing methods on six datasets.
arXiv Detail & Related papers (2022-12-20T06:21:21Z)
- Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information [77.80071279597665]
We propose an all-in-one single-stage pre-training approach, named Maximizing Multi-modal Mutual Information Pre-training (M3I Pre-training)
Our approach achieves better performance than previous pre-training methods on various vision benchmarks, including ImageNet classification, object detection, LVIS long-tailed object detection, and ADE20k semantic segmentation.
arXiv Detail & Related papers (2022-11-17T18:59:49Z)
- Multi-Task Learning for Situated Multi-Domain End-to-End Dialogue Systems [21.55075825370981]
We leverage multi-task learning techniques to train a GPT-2 based model on a more challenging dataset.
Our method achieves better performance on all sub-tasks, across domains, compared to task and domain-specific models.
arXiv Detail & Related papers (2021-10-11T12:36:30Z)
- Low-Resource Dialogue Summarization with Domain-Agnostic Multi-Source Pretraining [10.750492932503649]
Training a large summarization model is generally infeasible due to the inadequacy of dialogue data with annotated summaries.
We propose a multi-source pretraining paradigm to better leverage the external summary data.
Our approach achieves competitive performance and generalizes well in different dialogue scenarios.
arXiv Detail & Related papers (2021-09-09T07:47:16Z)
- RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems [75.87418236410296]
We introduce the RADDLE benchmark, a collection of corpora and tools for evaluating the performance of models across a diverse set of domains.
RADDLE is designed to favor and encourage models with a strong generalization ability.
We evaluate recent state-of-the-art systems based on pre-training and fine-tuning, and find that grounded pre-training on heterogeneous dialog corpora performs better than training a separate model per domain.
arXiv Detail & Related papers (2020-12-29T08:58:49Z)
- A Tailored Pre-Training Model for Task-Oriented Dialog Generation [60.05269529832447]
We propose a Pre-trained Role Alternating Language model (PRAL) for task-oriented conversational systems.
We introduce a task-oriented dialog pretraining dataset by cleaning 13 existing data sets.
The results show that PRAL performs better or on par with state-of-the-art methods.
arXiv Detail & Related papers (2020-04-24T09:25:45Z)