PANGUBOT: Efficient Generative Dialogue Pre-training from Pre-trained
Language Model
- URL: http://arxiv.org/abs/2203.17090v1
- Date: Thu, 31 Mar 2022 15:09:12 GMT
- Title: PANGUBOT: Efficient Generative Dialogue Pre-training from Pre-trained
Language Model
- Authors: Fei Mi, Yitong Li, Yulong Zeng, Jingyan Zhou, Yasheng Wang, Chuanfei
Xu, Lifeng Shang, Xin Jiang, Shiqi Zhao, Qun Liu
- Abstract summary: We introduce PANGUBOT, a Chinese pre-trained open-domain dialogue generation model based on a large pre-trained language model (PLM) PANGU-alpha.
We show that PANGUBOT outperforms state-of-the-art Chinese dialogue systems.
We also demonstrate that PANGUBOT can be easily deployed to generate emotional responses without further training.
- Score: 47.858326419602115
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce PANGUBOT, a Chinese pre-trained open-domain
dialogue generation model based on a large pre-trained language model (PLM)
PANGU-alpha (Zeng et al., 2021). Unlike other pre-trained dialogue models
trained from scratch over massive amounts of dialogue data, we aim to build a
powerful dialogue model with relatively little data and computation cost by
inheriting valuable language capabilities and knowledge from PLMs. To this
end, we train PANGUBOT from the large PLM PANGU-alpha, which has been shown to
perform well on a variety of Chinese natural language tasks. We
investigate different aspects of responses generated by PANGUBOT, including
response quality, knowledge, and safety. We show that PANGUBOT outperforms
state-of-the-art Chinese dialogue systems (CDIALGPT (Wang et al., 2020), EVA
(Zhou et al., 2021)) w.r.t. the above three aspects. We also demonstrate that
PANGUBOT can be easily deployed to generate emotional responses without further
training. Throughout our empirical analysis, we also point out that the
PANGUBOT response quality, knowledge correctness, and safety are still far from
perfect, and further explorations are indispensable to building reliable and
smart dialogue systems.
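The overall recipe, adapting an existing PLM to dialogue rather than pre-training a dialogue model from scratch, can be illustrated with a minimal sketch. The checkpoint name, data format, and hyperparameters below are placeholders for illustration (PANGU-alpha itself is not assumed to be loadable this way); any causal LM available through the Hugging Face transformers library can stand in.
```python
# Minimal sketch: continue training a pre-trained causal LM on dialogue
# transcripts instead of pre-training a dialogue model from scratch.
# "gpt2" is an illustrative stand-in, not the PANGU-alpha checkpoint.
import torch
from torch.optim import AdamW
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
optimizer = AdamW(model.parameters(), lr=1e-5)

# Hypothetical multi-turn dialogue: context turns plus the gold response.
dialogues = [
    (["Hi there!", "Hello, how can I help you?"],
     "What will the weather be like tomorrow?"),
]

model.train()
for context_turns, response in dialogues:
    text = "\n".join(context_turns + [response])
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    # Standard language-modeling loss over the dialogue transcript; the PLM's
    # general language knowledge is inherited through its initial weights.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```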
Related papers
- FutureTOD: Teaching Future Knowledge to Pre-trained Language Model for
Task-Oriented Dialogue [20.79359173822053]
We propose a novel dialogue pre-training model, FutureTOD, which distills future knowledge to the representation of the previous dialogue context.
Our intuition is that a good dialogue representation both learns local context information and predicts future information.
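One plausible reading of "distilling future knowledge into the representation of the previous dialogue context" is a self-distillation objective in which the encoding of the context alone is pulled toward an encoding of the context plus the future turns. The sketch below follows that reading; the encoder choice, the frozen teacher copy, and the cosine loss are assumptions made for illustration, not FutureTOD's exact formulation.
```python
# Hedged sketch of a future-knowledge distillation objective (assumed form):
# align the student's encoding of the dialogue context with a teacher's
# encoding of context + future turns.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

name = "bert-base-uncased"  # illustrative encoder
tokenizer = AutoTokenizer.from_pretrained(name)
student = AutoModel.from_pretrained(name)
teacher = AutoModel.from_pretrained(name)  # e.g. a frozen copy of the student
for p in teacher.parameters():
    p.requires_grad = False

def cls_embedding(model, text):
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)
    return model(**batch).last_hidden_state[:, 0]  # [CLS] vector

context = "user: I need a cheap hotel. system: For which dates?"
future = "user: Friday to Sunday. system: The Cityroomz is available."

with torch.no_grad():
    target = cls_embedding(teacher, context + " " + future)
pred = cls_embedding(student, context)

# Cosine-distance distillation loss: the context representation is trained to
# anticipate what the future turns will add.
loss = 1 - F.cosine_similarity(pred, target).mean()
loss.backward()
```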
arXiv Detail & Related papers (2023-06-17T10:40:07Z)
- KPT: Keyword-guided Pre-training for Grounded Dialog Generation [82.68787152707455]
We propose KPT (Keyword-guided Pre-Training), a novel self-supervised pre-training method for grounded dialog generation.
Specifically, we use a pre-trained language model to extract the most uncertain tokens in the dialog as keywords.
We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages.
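The keyword step described above can be sketched by scoring each dialog token with a pre-trained LM's uncertainty about it (here, per-token negative log-likelihood under a causal LM, one possible uncertainty measure) and keeping the highest-scoring tokens. The stand-in model and the top-k cutoff are illustrative assumptions.
```python
# Hedged sketch: pick the tokens a pre-trained LM is most uncertain about
# (highest per-token negative log-likelihood) as keywords for grounding.
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative stand-in LM
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def uncertain_keywords(dialog: str, top_k: int = 5):
    enc = tokenizer(dialog, return_tensors="pt")
    input_ids = enc["input_ids"]
    with torch.no_grad():
        logits = model(**enc).logits
    # NLL of token t given tokens < t (the first token has no prediction).
    log_probs = F.log_softmax(logits[0, :-1], dim=-1)
    nll = -log_probs.gather(1, input_ids[0, 1:].unsqueeze(1)).squeeze(1)
    top = torch.topk(nll, k=min(top_k, nll.numel())).indices + 1
    return [tokenizer.decode(input_ids[0, i].item()) for i in top.tolist()]

print(uncertain_keywords("I visited the Louvre in Paris and saw the Mona Lisa."))
```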
arXiv Detail & Related papers (2022-12-04T04:05:01Z)
- Building a Personalized Dialogue System with Prompt-Tuning [5.942602139622984]
We build a dialogue system that responds based on a given character setting (persona).
We propose an approach that uses prompt-tuning, which has low learning costs, on pre-trained large-scale language models.
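Prompt-tuning in the sense described above keeps the large backbone frozen and trains only a small set of continuous prompt vectors, which is where the low learning cost comes from. The sketch below shows the general mechanism with a persona-conditioned soft prompt prepended to the input embeddings; the backbone, prompt length, and example are chosen purely for illustration.
```python
# Hedged sketch of prompt-tuning for persona-conditioned response generation:
# the backbone LM stays frozen, only the soft prompt embeddings are trained.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # illustrative backbone
model = GPT2LMHeadModel.from_pretrained("gpt2")
for p in model.parameters():
    p.requires_grad = False  # low training cost: the PLM is not updated

prompt_len = 10
hidden = model.config.n_embd
# Trainable continuous prompt, e.g. one per persona in a real system.
soft_prompt = torch.nn.Parameter(torch.randn(prompt_len, hidden) * 0.02)
optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)

# Hypothetical persona + dialogue training example.
text = ("Persona: I am a cheerful chef. User: What did you cook today? "
        "Bot: A big pot of ramen!")
batch = tokenizer(text, return_tensors="pt")
tok_embeds = model.transformer.wte(batch["input_ids"])
inputs_embeds = torch.cat([soft_prompt.unsqueeze(0), tok_embeds], dim=1)

# Ignore the prompt positions in the LM loss (-100 labels are skipped).
labels = torch.cat(
    [torch.full((1, prompt_len), -100, dtype=torch.long), batch["input_ids"]], dim=1
)
loss = model(inputs_embeds=inputs_embeds, labels=labels).loss
loss.backward()
optimizer.step()
```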
arXiv Detail & Related papers (2022-06-11T02:21:11Z)
- EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training [73.98154158068134]
We propose EVA2.0, a large-scale pre-trained open-domain Chinese dialogue model with 2.8 billion parameters, and will make our models and codes publicly available.
arXiv Detail & Related papers (2022-03-17T13:33:17Z)
- Response Generation with Context-Aware Prompt Learning [19.340498579331555]
We present a novel approach for pre-trained dialogue modeling that casts the dialogue generation problem as a prompt-learning task.
Instead of fine-tuning on limited dialogue data, our approach, DialogPrompt, learns continuous prompt embeddings optimized for dialogue contexts.
Our approach significantly outperforms the fine-tuning baseline and the generic prompt-learning methods.
arXiv Detail & Related papers (2021-11-04T05:40:13Z)
- EVA: An Open-Domain Chinese Dialogue System with Large-Scale Generative Pre-Training [40.85554509137999]
We propose EVA, a Chinese dialogue system that contains the largest Chinese pre-trained dialogue model with 2.8B parameters.
To build this model, we collect the largest Chinese dialogue dataset named WDC-Dialogue from various public social media.
Experiments on automatic and human evaluation show that EVA outperforms other Chinese pre-trained dialogue models.
arXiv Detail & Related papers (2021-08-03T14:55:24Z)
- Robustness Testing of Language Understanding in Dialog Systems [33.30143655553583]
We conduct comprehensive evaluation and analysis with respect to the robustness of natural language understanding models.
We introduce three important aspects related to language understanding in real-world dialog systems, namely, language variety, speech characteristics, and noise perturbation.
We propose a model-agnostic toolkit LAUG to approximate natural perturbation for testing the robustness issues in dialog systems.
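The three aspects above (language variety, speech characteristics, noise perturbation) can be approximated with very simple text transformations; the toy functions below illustrate the idea of model-agnostic input perturbation and are not LAUG's actual API.
```python
# Hedged sketch of simple, model-agnostic input perturbations for robustness
# testing of NLU; illustrative only, not the LAUG toolkit's interface.
import random

def add_speech_disfluency(utterance: str) -> str:
    """Insert a filler word, mimicking spoken-language characteristics."""
    words = utterance.split()
    pos = random.randrange(len(words) + 1)
    return " ".join(words[:pos] + ["uh"] + words[pos:])

def add_char_noise(utterance: str, rate: float = 0.05) -> str:
    """Randomly drop characters, mimicking typos or ASR noise."""
    return "".join(c for c in utterance if random.random() > rate)

def vary_lexically(utterance: str) -> str:
    """Toy word substitution standing in for language-variety perturbation."""
    substitutions = {"book": "reserve", "cheap": "inexpensive"}
    return " ".join(substitutions.get(w, w) for w in utterance.split())

original = "I want to book a cheap hotel for two nights"
for perturb in (add_speech_disfluency, add_char_noise, vary_lexically):
    print(perturb.__name__, "->", perturb(original))
```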
arXiv Detail & Related papers (2020-12-30T18:18:47Z)
- Knowledge-Grounded Dialogue Generation with Pre-trained Language Models [74.09352261943911]
We study knowledge-grounded dialogue generation with pre-trained language models.
We propose equipping response generation defined by a pre-trained language model with a knowledge selection module.
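A knowledge selection module of the kind described above can be sketched as a retrieval step in front of generation: score each candidate knowledge sentence against the dialogue context and condition the generator on the best-scoring one. The bi-encoder scoring, the stand-in checkpoints, and the plain-text concatenation are illustrative assumptions rather than the paper's architecture.
```python
# Hedged sketch: select the knowledge sentence most relevant to the dialogue
# context, then condition a generative PLM on "knowledge + context".
import torch
from transformers import AutoModel, AutoModelForCausalLM, AutoTokenizer

enc_name, gen_name = "bert-base-uncased", "gpt2"  # illustrative choices
enc_tok = AutoTokenizer.from_pretrained(enc_name)
encoder = AutoModel.from_pretrained(enc_name).eval()
gen_tok = AutoTokenizer.from_pretrained(gen_name)
generator = AutoModelForCausalLM.from_pretrained(gen_name).eval()

def embed(text):
    batch = enc_tok(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        return encoder(**batch).last_hidden_state[:, 0]  # [CLS] vector

context = "User: Who directed Inception?"
knowledge = [
    "Inception is a 2010 film directed by Christopher Nolan.",
    "The Eiffel Tower is located in Paris.",
]

# Knowledge selection: cosine similarity between context and each candidate.
ctx = embed(context)
scores = [torch.cosine_similarity(ctx, embed(k)).item() for k in knowledge]
selected = knowledge[max(range(len(knowledge)), key=scores.__getitem__)]

# Knowledge-grounded generation: prepend the selected knowledge to the context.
inputs = gen_tok(selected + " " + context + " Bot:", return_tensors="pt")
out = generator.generate(**inputs, max_new_tokens=20,
                         pad_token_id=gen_tok.eos_token_id)
print(gen_tok.decode(out[0], skip_special_tokens=True))
```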
arXiv Detail & Related papers (2020-10-17T16:49:43Z)
- TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue [113.45485470103762]
In this work, we unify nine human-human and multi-turn task-oriented dialogue datasets for language modeling.
To better model dialogue behavior during pre-training, we incorporate user and system tokens into the masked language modeling.
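Incorporating user and system tokens into masked language modeling can be sketched as follows; the speaker-token strings, masking rate, and backbone are illustrative and do not necessarily match TOD-BERT's released implementation.
```python
# Hedged sketch: add speaker tokens to a BERT-style model and train it with a
# masked-LM objective over flattened task-oriented dialogues.
import torch
from transformers import BertForMaskedLM, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer.add_special_tokens({"additional_special_tokens": ["[usr]", "[sys]"]})
model.resize_token_embeddings(len(tokenizer))

# A multi-turn dialogue flattened with speaker tokens.
dialogue = "[usr] i need a train to cambridge [sys] what day will you travel [usr] friday"
batch = tokenizer(dialogue, return_tensors="pt")

# Randomly mask ~15% of tokens (excluding special tokens) for the MLM objective.
labels = batch["input_ids"].clone()
special = torch.tensor(
    tokenizer.get_special_tokens_mask(batch["input_ids"][0].tolist(),
                                      already_has_special_tokens=True)
).bool()
mask = (torch.rand(labels.shape) < 0.15) & ~special.unsqueeze(0)
labels[~mask] = -100  # only masked positions contribute to the loss
batch["input_ids"][mask] = tokenizer.mask_token_id

loss = model(**batch, labels=labels).loss
loss.backward()
```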
arXiv Detail & Related papers (2020-04-15T04:09:05Z)
- Low-Resource Knowledge-Grounded Dialogue Generation [74.09352261943913]
We consider knowledge-grounded dialogue generation under a natural assumption that only limited training examples are available.
We devise a disentangled response decoder in order to isolate parameters that depend on knowledge-grounded dialogues from the entire generation model.
With only 1/8 training data, our model can achieve the state-of-the-art performance and generalize well on out-of-domain knowledge.
arXiv Detail & Related papers (2020-02-24T16:20:32Z)