A Model-Agnostic Data Manipulation Method for Persona-based Dialogue
Generation
- URL: http://arxiv.org/abs/2204.09867v1
- Date: Thu, 21 Apr 2022 03:49:54 GMT
- Title: A Model-Agnostic Data Manipulation Method for Persona-based Dialogue
Generation
- Authors: Yu Cao, Wei Bi, Meng Fang, Shuming Shi and Dacheng Tao
- Abstract summary: It is expensive to scale up current persona-based dialogue datasets.
Each data sample in this task is more complex to learn with than conventional dialogue data.
We propose a model-agnostic data manipulation method that can be combined with any persona-based dialogue generation model.
- Score: 107.82729587882397
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Towards building intelligent dialogue agents, there has been a growing
interest in introducing explicit personas in generation models. However, with
limited persona-based dialogue data at hand, it may be difficult to train a
dialogue generation model well. We point out that the data challenges of this
generation task lie in two aspects: first, it is expensive to scale up current
persona-based dialogue datasets; second, each data sample in this task is more
complex to learn with than conventional dialogue data. To alleviate the above
data issues, we propose a model-agnostic data manipulation method that can be
combined with any persona-based dialogue generation model to improve its
performance. The original training samples are first distilled so that they
are easier to fit. We then present several effective ways to diversify this
distilled data. A given base model is then trained via the constructed data
curriculum, i.e., first on the augmented distilled samples and then on the
original ones. Experiments illustrate the superiority of
our method with two strong base dialogue models (Transformer encoder-decoder
and GPT2).
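The abstract stops short of implementation details, but the two-stage curriculum it describes can be sketched roughly as below; `distill`, `diversify`, and `train_epochs` are hypothetical helpers standing in for the paper's distillation, diversification, and training steps, not the authors' actual code.

```python
# A minimal sketch of the data curriculum described above, assuming
# hypothetical helpers: `distill(sample)` simplifies a training sample,
# `diversify(sample)` returns augmented variants, and `train_epochs`
# runs a standard training loop for the chosen base model.

def build_easy_stage(original_samples, distill, diversify):
    """Stage-1 data: distilled samples plus their augmented variants."""
    easy = []
    for sample in original_samples:
        simple = distill(sample)        # easier-to-fit version of the sample
        easy.append(simple)
        easy.extend(diversify(simple))  # diversified copies of the distilled sample
    return easy

def train_with_curriculum(model, original_samples, distill, diversify,
                          train_epochs, easy_epochs=2, full_epochs=8):
    easy = build_easy_stage(original_samples, distill, diversify)
    train_epochs(model, easy, epochs=easy_epochs)              # first: augmented distilled data
    train_epochs(model, original_samples, epochs=full_epochs)  # then: original samples
    return model
```

Any Transformer encoder-decoder or GPT2-style generator could fill the `model` slot, which is what makes the method model-agnostic.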
Related papers
- Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
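As a rough illustration only (not the authors' code), treating reply-to links as latent variables can be sketched by marginalizing a discourse-aware loss over candidate parent utterances; `link_score` and `lm_loss` are assumed model components.

```python
import math

# Conceptual sketch: the parent (reply-to target) of each utterance is
# latent, so the pre-training loss is marginalized over a posterior
# inferred from a hypothetical link-scoring component.

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def expected_pretrain_loss(utterances, link_score, lm_loss):
    loss = 0.0
    for i in range(1, len(utterances)):
        parents = list(range(i))  # any earlier utterance may be the parent
        posterior = softmax([link_score(utterances[j], utterances[i])
                             for j in parents])
        # marginalize the discourse-aware loss over latent parents
        loss += sum(p * lm_loss(utterances[j], utterances[i])
                    for p, j in zip(posterior, parents))
    return loss
```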
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
- q2d: Turning Questions into Dialogs to Teach Models How to Search [11.421839177607147]
We propose q2d: an automatic data generation pipeline that generates information-seeking dialogs from questions.
Unlike previous approaches, which relied on human-written dialogs with search queries, our method can automatically generate query-based grounded dialogs with better control and at larger scale.
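A hedged sketch of such a pipeline; `generate` stands in for any large language model call and `same_intent` for the quality filter, both assumptions, since the paper's exact prompts and filtering criteria are not reproduced here.

```python
# Illustrative q2d-style pipeline: turn a question into a grounded,
# information-seeking dialog, then keep it only if it still implies the
# original query. `generate` and `same_intent` are hypothetical.

PROMPT = ("Write an information-seeking dialog between a user and an "
          "assistant that would naturally lead to the search query: {q}")

def question_to_dialog(question, generate, same_intent):
    dialog = generate(PROMPT.format(q=question))
    return dialog if same_intent(dialog, question) else None

def build_dataset(questions, generate, same_intent):
    dialogs = (question_to_dialog(q, generate, same_intent) for q in questions)
    return [d for d in dialogs if d is not None]
```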
arXiv Detail & Related papers (2023-04-27T16:39:15Z)
- DialogZoo: Large-Scale Dialog-Oriented Task Learning [52.18193690394549]
We aim to build a unified foundation model that can solve a wide range of diverse dialogue tasks.
To achieve this goal, we first collect a large-scale well-labeled dialogue dataset from 73 publicly available datasets.
arXiv Detail & Related papers (2022-05-25T11:17:16Z)
- Self-augmented Data Selection for Few-shot Dialogue Generation [18.794770678708637]
We adopt the self-training framework to deal with the few-shot MR-to-Text (meaning representation to text) generation problem.
We propose a novel data selection strategy to select the data that our generation model is most uncertain about.
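One way to realize this selection step, sketched under the assumption that the generator exposes per-token output distributions through a hypothetical `decode_with_probs` helper; the paper's exact uncertainty measure may differ from the mean token entropy used here.

```python
import math

# Uncertainty-based selection for self-training: rank unlabeled meaning
# representations (MRs) by the entropy of the generator's per-token
# distributions and keep the k most uncertain ones.

def sequence_entropy(token_dists):
    """Mean entropy over the per-step output distributions of one decode."""
    total = sum(-sum(p * math.log(p) for p in dist if p > 0)
                for dist in token_dists)
    return total / max(len(token_dists), 1)

def select_most_uncertain(unlabeled_mrs, decode_with_probs, k):
    scored = [(sequence_entropy(decode_with_probs(mr)), mr)
              for mr in unlabeled_mrs]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest entropy first
    return [mr for _, mr in scored[:k]]
```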
arXiv Detail & Related papers (2022-05-19T16:25:50Z)
- Less is More: Learning to Refine Dialogue History for Personalized Dialogue Generation [57.73547958927826]
We propose to refine the user's dialogue history at scale, which allows the model to handle longer dialogue histories and obtain more accurate persona information.
Specifically, we design an MSP model which consists of three personal information refiners and a personalized response generator.
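At a high level, the composition reads as successive refiners filtering the raw history before a personalized generator consumes it; the sketch below is an assumption about the wiring, not the MSP implementation.

```python
# Hypothetical wiring of refiners and generator: each refiner narrows the
# dialogue history to persona-relevant parts before response generation.

def msp_style_generate(history, query, refiners, generator):
    refined = history
    for refine in refiners:          # e.g. three personal-information refiners
        refined = refine(refined, query)
    return generator(refined, query)
```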
arXiv Detail & Related papers (2022-04-18T02:02:56Z)
- Dual Task Framework for Debiasing Persona-grounded Dialogue Dataset [17.403065663306567]
We introduce a data-centric approach for the task of improving persona-conditioned dialogue agents.
Specifically, we augment relevant personas to improve the dialogue dataset and agent by leveraging the primal-dual structure of the two tasks.
Experiments on Persona-Chat show that our approach outperforms pre-trained LMs by 11.7 points in accuracy.
arXiv Detail & Related papers (2022-02-11T04:08:46Z)
- Data-Efficient Methods for Dialogue Systems [4.061135251278187]
Conversational user interfaces (CUIs) have become ubiquitous in everyday life through consumer-focused products such as Siri and Alexa.
Deep learning underlies many recent breakthroughs in dialogue systems but requires very large amounts of training data, often annotated by experts.
In this thesis, we introduce a series of methods for training robust dialogue systems from minimal data.
arXiv Detail & Related papers (2020-12-05T02:51:09Z)
- Pchatbot: A Large-Scale Dataset for Personalized Chatbot [49.16746174238548]
We introduce Pchatbot, a large-scale dialogue dataset that contains two subsets collected from Weibo and Judicial forums respectively.
To adapt the raw data to dialogue systems, we carefully normalize it via processes such as anonymization.
Pchatbot is significantly larger in scale than existing Chinese datasets, which may benefit data-driven dialogue models.
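A simplified sketch of this kind of normalization pass; the actual Pchatbot pipeline is more involved, and the patterns below are illustrative assumptions.

```python
import re

# Toy anonymization pass: replace user mentions, links, and phone-like
# numbers with placeholders. Real pipelines use more careful rules.

PATTERNS = [
    (re.compile(r"@\S+"), "[USER]"),         # @-mentions
    (re.compile(r"https?://\S+"), "[URL]"),  # links
    (re.compile(r"\b\d{11}\b"), "[PHONE]"),  # 11-digit phone numbers
]

def anonymize(text):
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text.strip()

print(anonymize("@alice see https://example.com or call 13800138000"))
# -> [USER] see [URL] or call [PHONE]
```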
arXiv Detail & Related papers (2020-09-28T12:49:07Z)
- Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data [61.71319905364992]
We propose a novel data augmentation method for training open-domain dialogue models by utilizing unpaired data.
A data-level distillation process is first proposed to construct augmented dialogues where both post and response are retrieved from the unpaired data.
A ranking module is employed to filter out low-quality dialogues.
Finally, a model-level distillation process distills a teacher model trained on high-quality paired data onto the augmented dialogue pairs.
arXiv Detail & Related papers (2020-09-20T13:06:38Z)
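A conceptual sketch of the data-level distillation step described above, with `retrieve_responses` and `ranker` as hypothetical stand-ins for the paper's retrieval and ranking modules.

```python
# Pair posts with retrieved responses from unpaired corpora, keeping only
# pairs the ranking module scores above a threshold.

def augment_dialogues(posts, retrieve_responses, ranker, threshold=0.5):
    augmented = []
    for post in posts:                         # posts from unpaired data
        for response in retrieve_responses(post):
            if ranker(post, response) >= threshold:  # filter low-quality pairs
                augmented.append((post, response))
    return augmented
```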