StyleDGPT: Stylized Response Generation with Pre-trained Language Models
- URL: http://arxiv.org/abs/2010.02569v1
- Date: Tue, 6 Oct 2020 09:29:50 GMT
- Title: StyleDGPT: Stylized Response Generation with Pre-trained Language Models
- Authors: Ze Yang, Wei Wu, Can Xu, Xinnian Liang, Jiaqi Bai, Liran Wang, Wei Wang, Zhoujun Li
- Abstract summary: We introduce a KL loss and a style classifier to steer response generation towards the target style at both the word level and the sentence level.
Our model can significantly outperform state-of-the-art methods in terms of both style consistency and contextual coherence.
- Score: 39.526613595499356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generating responses in a desired style has great potential to extend the
applications of open-domain dialogue systems, yet it is held back by the lack of
parallel data for training. In this work, we explore this challenging task with
pre-trained language models, which have brought breakthroughs to various natural
language tasks. To this end, we introduce a KL loss and a style classifier into
the fine-tuning step in order to steer response generation towards the target
style at both the word level and the sentence level. Comprehensive empirical
studies on two public datasets indicate that our model significantly
outperforms state-of-the-art methods in terms of both style consistency and
contextual coherence.
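The steering described in the abstract can be sketched as a word-level KL term that pulls the response model's next-token distribution towards a style language model's distribution, plus a sentence-level penalty from a style classifier. The following is a minimal illustrative sketch, not the paper's implementation: the function names, the weights `alpha` and `beta`, and the exact way the terms are combined are assumptions.

```python
import math

def kl_divergence(p_style, p_model, eps=1e-12):
    """KL(p_style || p_model) over a shared vocabulary.

    Word-level term: penalizes the response model when its next-token
    distribution diverges from the style language model's distribution.
    """
    return sum(p * math.log((p + eps) / (q + eps))
               for p, q in zip(p_style, p_model))

def stylized_loss(response_nll, p_style, p_model, style_prob,
                  alpha=0.1, beta=0.1):
    """Hypothetical combined fine-tuning objective:
    response negative log-likelihood
    + alpha * word-level KL term
    + beta * sentence-level style penalty,
    where `style_prob` is a style classifier's probability that the
    generated response carries the target style.
    """
    word_level = kl_divergence(p_style, p_model)
    sentence_level = -math.log(style_prob + 1e-12)
    return response_nll + alpha * word_level + beta * sentence_level
```

With identical distributions and a classifier fully confident in the target style, the extra terms vanish and the objective reduces to the ordinary response NLL.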
Related papers
- Scalable Language Model with Generalized Continual Learning [58.700439919096155]
The Joint Adaptive Re-Parameterization (JARe) is integrated with Dynamic Task-related Knowledge Retrieval (DTKR) to enable adaptive adjustment of language models based on specific downstream tasks.
Our method demonstrates state-of-the-art performance on diverse backbones and benchmarks, achieving effective continual learning in both full-set and few-shot scenarios with minimal forgetting.
arXiv Detail & Related papers (2024-04-11T04:22:15Z) - SLM: Learning a Discourse Language Representation with Sentence Unshuffling [53.42814722621715]
We introduce Sentence-level Language Modeling, a new pre-training objective for learning a discourse language representation.
We show that this feature of our model improves the performance of the original BERT by large margins.
arXiv Detail & Related papers (2020-10-30T13:33:41Z) - Incorporating Stylistic Lexical Preferences in Generative Language Models [10.62343151429147]
We present an approach to induce certain target-author attributes by incorporating continuous multi-dimensional lexical preferences of an author into generative language models.
Our experiments demonstrate that the proposed approach can generate text that distinctively aligns with a given target author's lexical style.
arXiv Detail & Related papers (2020-10-22T09:24:05Z) - Knowledge-Grounded Dialogue Generation with Pre-trained Language Models [74.09352261943911]
We study knowledge-grounded dialogue generation with pre-trained language models.
We propose equipping response generation defined by a pre-trained language model with a knowledge selection module.
arXiv Detail & Related papers (2020-10-17T16:49:43Z) - Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization [52.867459839641526]
Formality style transfer is the task of converting informal sentences to grammatically-correct formal sentences.
We propose a semi-supervised formality style transfer model that utilizes a language model-based discriminator to maximize the likelihood of the output sentence being formal.
Experiments showed that our model outperformed previous state-of-the-art baselines significantly in terms of both automated metrics and human judgement.
arXiv Detail & Related papers (2020-10-10T21:05:56Z) - Prototype-to-Style: Dialogue Generation with Style-Aware Editing on Retrieval Memory [65.98002918470543]
We introduce a new prototype-to-style framework to tackle the challenge of stylistic dialogue generation.
The framework uses an Information Retrieval (IR) system to fetch a candidate response and extracts a response prototype from it.
A stylistic response generator then takes the prototype and the desired language style as model input to obtain a high-quality and stylistic response.
arXiv Detail & Related papers (2020-04-05T14:36:15Z) - An Empirical Investigation of Pre-Trained Transformer Language Models for Open-Domain Dialogue Generation [23.343006562849126]
We present an empirical investigation of pre-trained Transformer-based auto-regressive language models for the task of open-domain dialogue generation.
The pre-training and fine-tuning paradigm is employed for learning.
Experiments are conducted on the typical single-turn and multi-turn dialogue corpora such as Weibo, Douban, Reddit, DailyDialog, and Persona-Chat.
arXiv Detail & Related papers (2020-03-09T15:20:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.