Dialogue Language Model with Large-Scale Persona Data Engineering
- URL: http://arxiv.org/abs/2412.09034v2
- Date: Wed, 19 Feb 2025 15:08:06 GMT
- Title: Dialogue Language Model with Large-Scale Persona Data Engineering
- Authors: Mengze Hong, Chen Jason Zhang, Chaotao Chen, Rongzhong Lian, Di Jiang
- Abstract summary: PPDS is an open-domain persona dialogue system that employs extensive generative pre-training on a persona dialogue dataset to enhance persona consistency.
We present a persona extraction model designed to autonomously and precisely generate vast persona dialogue datasets.
We also unveil a pioneering persona augmentation technique to address the invalid persona bias inherent in the constructed dataset.
- Score: 10.160626284195434
- Abstract: Maintaining persona consistency is paramount in the application of open-domain dialogue systems, as exemplified by models like ChatGPT. Despite significant advancements, the limited scale and diversity of current persona dialogue datasets remain challenges to achieving robust persona-consistent dialogue models. In this study, drawing inspiration from the success of large-scale pre-training, we introduce PPDS, an open-domain persona dialogue system that employs extensive generative pre-training on a persona dialogue dataset to enhance persona consistency. Specifically, we present a persona extraction model designed to autonomously and precisely generate vast persona dialogue datasets. Additionally, we unveil a pioneering persona augmentation technique to address the invalid persona bias inherent in the constructed dataset. Both quantitative and human evaluations consistently highlight the superior response quality and persona consistency of our proposed model, underscoring its effectiveness.
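The augmentation step lends itself to a small illustration. Below is a toy sketch of how "invalid persona" negatives might be constructed by pairing responses with mismatched personas; the field names and sampling ratio are our assumptions, not the authors' released pipeline.

```python
import random

def augment_invalid_personas(samples: list[dict], ratio: float = 0.5) -> list[dict]:
    """Add persona-swapped negatives so the model also sees invalid personas."""
    negatives = []
    for s in random.sample(samples, int(len(samples) * ratio)):
        other = random.choice(samples)
        if other["persona"] != s["persona"]:
            negatives.append({"persona": other["persona"],   # mismatched persona
                              "context": s["context"],
                              "response": s["response"],
                              "persona_valid": False})       # hypothetical flag
    return samples + negatives
```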
Related papers
- REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation [51.97224538045096]
We introduce REALTALK, a 21-day corpus of authentic messaging app dialogues.
We compare EI (emotional intelligence) attributes and persona consistency to understand the challenges posed by real-world dialogues.
Our findings reveal that models struggle to simulate a user solely from dialogue history, while fine-tuning on specific user chats improves persona emulation.
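As an aside, persona consistency in such studies is often scored automatically with natural language inference. A minimal sketch of that common protocol (not necessarily REALTALK's own) follows:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

def consistency_rate(persona: str, responses: list[str]) -> float:
    """Fraction of responses that do not contradict the persona under NLI."""
    ok = 0
    for r in responses:
        inputs = tok(persona, r, return_tensors="pt", truncation=True)
        with torch.no_grad():
            pred = nli(**inputs).logits.argmax(dim=-1).item()
        ok += nli.config.id2label[pred] != "CONTRADICTION"
    return ok / len(responses)
```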
arXiv Detail & Related papers (2025-02-18T20:29:01Z)
- "In Dialogues We Learn": Towards Personalized Dialogue Without Pre-defined Profiles through In-Dialogue Learning [37.307408706864514]
In-Dialogue Learning (IDL) is a fine-tuning framework that enhances the ability of pre-trained large language models to leverage dialogue history to characterize persona.
Our experiments on three datasets demonstrate that IDL brings substantial improvements, with BLEU and ROUGE scores increasing by up to 200% and 247%, respectively.
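For scale, a 200% increase means the score triples over the baseline. A minimal sketch of how BLEU and ROUGE-L are typically computed; the library choices here are ours, not the paper's:

```python
import sacrebleu
from rouge_score import rouge_scorer

hypothesis = "i love hiking with my dog on weekends"
reference = "i enjoy hiking with my dog every weekend"

# Sentence-level BLEU for a single hypothesis/reference pair.
bleu = sacrebleu.sentence_bleu(hypothesis, [reference]).score
# ROUGE-L longest-common-subsequence F1.
rouge = rouge_scorer.RougeScorer(["rougeL"]).score(reference, hypothesis)
print(f"BLEU: {bleu:.2f}  ROUGE-L F1: {rouge['rougeL'].fmeasure:.3f}")
```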
arXiv Detail & Related papers (2024-03-05T16:43:03Z)
- WHAT, WHEN, and HOW to Ground: Designing User Persona-Aware Conversational Agents for Engaging Dialogue [4.328280329592151]
We present a method for building a personalized open-domain dialogue system that addresses the WWH problem (what, when, and how to ground) for natural response generation in a commercial setting.
The proposed approach involves weighted dataset blending, negative persona information augmentation methods, and the design of personalized conversation datasets.
Our work effectively balances dialogue fluency and tendency to ground, while also introducing a response-type label to improve the controllability and explainability of the grounded responses.
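Weighted dataset blending can be pictured as sampling training examples from several corpora in proportion to assigned weights. A toy sketch, with illustrative corpus names and weights:

```python
import random

def blend(corpora: dict[str, list], weights: dict[str, float], n: int) -> list:
    """Draw n samples, picking a source corpus per draw according to weights."""
    names = list(corpora)
    picks = random.choices(names, weights=[weights[k] for k in names], k=n)
    return [random.choice(corpora[name]) for name in picks]

corpora = {"persona_chat": ["pc_sample"], "empathetic": ["ed_sample"]}
batch = blend(corpora, {"persona_chat": 0.7, "empathetic": 0.3}, n=8)
```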
arXiv Detail & Related papers (2023-06-06T02:28:38Z)
- GODEL: Large-Scale Pre-Training for Goal-Directed Dialog [119.1397031992088]
We introduce GODEL, a large pre-trained language model for dialog.
We show that GODEL outperforms state-of-the-art pre-trained dialog models in few-shot fine-tuning setups.
A novel feature of our evaluation methodology is the introduction of a notion of utility that assesses the usefulness of responses.
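GODEL's checkpoints are publicly available; the sketch below follows the Hugging Face model card's prompt format (the "[CONTEXT]" and "[KNOWLEDGE]" markers), though the exact details should be treated as illustrative:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/GODEL-v1_1-base-seq2seq")
model = AutoModelForSeq2SeqLM.from_pretrained("microsoft/GODEL-v1_1-base-seq2seq")

def generate(instruction: str, dialog: list[str], knowledge: str = "") -> str:
    """Format the query as the model card suggests and sample a response."""
    if knowledge:
        knowledge = "[KNOWLEDGE] " + knowledge
    query = f"{instruction} [CONTEXT] {' EOS '.join(dialog)} {knowledge}"
    input_ids = tokenizer(query, return_tensors="pt").input_ids
    outputs = model.generate(input_ids, max_length=128, top_p=0.9, do_sample=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

reply = generate("Instruction: given a dialog context, you need to respond empathically.",
                 ["Does money buy happiness?", "It depends, but mostly no."])
```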
arXiv Detail & Related papers (2022-06-22T18:19:32Z)
- A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation [107.82729587882397]
It is expensive to scale up current persona-based dialogue datasets.
Each data sample in this task is also harder to learn from than conventional dialogue data.
We propose a data manipulation method that is model-agnostic and can be plugged into any persona-based dialogue generation model.
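One common shape for such model-agnostic manipulation is to distill the corpus down to easier samples and then diversify it with paraphrases. The sketch below is a rough analogy with hypothetical helpers, not the paper's actual components:

```python
def word_overlap(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa), 1)

def distill(samples: list[dict], threshold: float = 0.3) -> list[dict]:
    """Keep samples whose response visibly uses the persona (easier to learn)."""
    return [s for s in samples if word_overlap(s["persona"], s["response"]) >= threshold]

def diversify(samples: list[dict], paraphrase) -> list[dict]:
    """Grow the distilled set with paraphrased copies; paraphrase() is caller-supplied."""
    return samples + [{**s, "response": paraphrase(s["response"])} for s in samples]
```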
arXiv Detail & Related papers (2022-04-21T03:49:54Z)
- Less is More: Learning to Refine Dialogue History for Personalized Dialogue Generation [57.73547958927826]
We propose to refine the user dialogue history on a large scale, which allows the model to handle longer dialogue histories and obtain more accurate persona information.
Specifically, we design an MSP model which consists of three personal information refiners and a personalized response generator.
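The refinement idea can be approximated with simple retrieval: keep only the history utterances most relevant to the current query. TF-IDF similarity below stands in for MSP's learned refiners:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def refine_history(history: list[str], query: str, k: int = 3) -> list[str]:
    """Keep the k past utterances most similar to the current query."""
    vec = TfidfVectorizer().fit(history + [query])
    sims = cosine_similarity(vec.transform([query]), vec.transform(history))[0]
    top = sorted(range(len(history)), key=lambda i: sims[i], reverse=True)[:k]
    return [history[i] for i in sorted(top)]  # restore chronological order

refined = refine_history(
    ["I adopted a beagle last year.", "Work was exhausting today.",
     "My dog loves the park.", "I had pasta for lunch."],
    query="What should I do with my dog this weekend?", k=2)
```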
arXiv Detail & Related papers (2022-04-18T02:02:56Z)
- Learning to Predict Persona Information for Dialogue Personalization without Explicit Persona Description [10.17868476063421]
We propose a novel approach that learns to predict persona information based on the dialogue history to personalize the dialogue agent.
Experimental results on the PersonaChat dataset show that the proposed method can improve the consistency of generated responses.
A trained persona prediction model can be successfully transferred to other datasets and help generate more relevant responses.
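The two-stage use at inference time, predicting a persona and then conditioning generation on it, might look like the sketch below; the checkpoint and prompt wording are stand-ins for the paper's trained predictor:

```python
from transformers import pipeline

lm = pipeline("text2text-generation", model="google/flan-t5-base")

def respond(history: list[str]) -> str:
    ctx = "\n".join(history)
    # Stage 1: predict persona facts from the dialogue history alone.
    persona = lm("Infer the speaker's persona from this dialogue:\n" + ctx,
                 max_new_tokens=48)[0]["generated_text"]
    # Stage 2: condition the response on the predicted persona.
    prompt = f"Persona: {persona}\nDialogue:\n{ctx}\nReply consistently with the persona:"
    return lm(prompt, max_new_tokens=64)[0]["generated_text"]
```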
arXiv Detail & Related papers (2021-11-30T03:19:24Z)
- DLVGen: A Dual Latent Variable Approach to Personalized Dialogue Generation [28.721411816698563]
We propose a Dual Latent Variable Generator (DLVGen) capable of generating personalized dialogue.
Unlike prior work, DLVGen models the latent distribution over potential responses as well as the latent distribution over the agent's potential persona.
Empirical results show that DLVGen is capable of generating diverse responses which accurately incorporate the agent's persona.
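A toy rendering of the dual-latent idea: one latent variable over responses and one over the agent's persona, each sampled with the reparameterization trick and concatenated before decoding. Dimensions are arbitrary:

```python
import torch
import torch.nn as nn

class DualLatent(nn.Module):
    """Two latent variables: one over responses, one over the agent's persona."""
    def __init__(self, hidden: int = 256, z_dim: int = 32):
        super().__init__()
        self.resp_head = nn.Linear(hidden, 2 * z_dim)     # (mu, logvar) for response
        self.persona_head = nn.Linear(hidden, 2 * z_dim)  # (mu, logvar) for persona

    @staticmethod
    def sample(stats: torch.Tensor) -> torch.Tensor:
        mu, logvar = stats.chunk(2, dim=-1)
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize

    def forward(self, ctx: torch.Tensor) -> torch.Tensor:
        z_resp = self.sample(self.resp_head(ctx))
        z_pers = self.sample(self.persona_head(ctx))
        return torch.cat([z_resp, z_pers], dim=-1)  # conditions the decoder

z = DualLatent()(torch.randn(4, 256))  # -> shape (4, 64)
```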
arXiv Detail & Related papers (2021-11-22T17:21:21Z)
- Modeling Long Context for Task-Oriented Dialogue State Generation [51.044300192906995]
We propose a multi-task learning model with a simple yet effective utterance tagging technique and a bidirectional language model.
Our approach targets the sharp performance drop the baseline suffers when the input dialogue context sequence is long.
In our experiments, our proposed model achieves a 7.03% relative improvement over the baseline, establishing a new state-of-the-art joint goal accuracy of 52.04% on the MultiWOZ 2.0 dataset.
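Joint goal accuracy counts a turn as correct only when the entire predicted belief state matches the gold state. A minimal sketch, plus a sanity check on the quoted numbers:

```python
def joint_goal_accuracy(preds: list[dict], golds: list[dict]) -> float:
    """A turn counts only if the full predicted belief state matches exactly."""
    return sum(p == g for p, g in zip(preds, golds)) / len(golds)

preds = [{"hotel-area": "north"}, {"hotel-area": "north", "hotel-stars": "4"}]
golds = [{"hotel-area": "north"}, {"hotel-area": "north", "hotel-stars": "5"}]
print(joint_goal_accuracy(preds, golds))  # 0.5

# Sanity check on the quoted numbers: a 7.03% relative gain reaching 52.04%
# implies a baseline of about 52.04 / 1.0703 = 48.62% JGA.
```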
arXiv Detail & Related papers (2020-04-29T11:02:25Z)
- You Impress Me: Dialogue Generation via Mutual Persona Perception [62.89449096369027]
Research in cognitive science suggests that understanding is an essential signal for a high-quality chit-chat conversation.
Motivated by this, we propose P2 Bot, a transmitter-receiver based framework that explicitly models understanding.
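The transmitter-receiver intuition: a receiver scores how well a generated response reflects the speaker's persona and feeds that back as a training signal. The sketch below substitutes a simple embedding similarity for P2 Bot's trained receiver:

```python
from sentence_transformers import SentenceTransformer, util

receiver = SentenceTransformer("all-MiniLM-L6-v2")

def perception_reward(persona: str, response: str) -> float:
    """Score how strongly the response reflects the persona (cosine similarity)."""
    emb = receiver.encode([persona, response], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

reward = perception_reward("I am a vegan chef.", "I just made a tofu stir fry.")
```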
arXiv Detail & Related papers (2020-04-11T12:51:07Z)