Related papers: Dual Task Framework for Debiasing Persona-grounded Dialogue Dataset

URL: http://arxiv.org/abs/2202.05435v1
Date: Fri, 11 Feb 2022 04:08:46 GMT
Title: Dual Task Framework for Debiasing Persona-grounded Dialogue Dataset
Authors: Minju Kim, Beong-woo Kwak, Youngwook Kim, Hong-in Lee, Seung-won Hwang, Jinyoung Yeo
Abstract summary: We introduce a data-centric approach for the task of improving persona-conditioned dialogue agents. Specifically, we augment relevant personas to improve dialogue dataset/agent, by leveraging the primal-dual structure of the two tasks. Experiments on Persona-Chat show that our approach outperforms pre-trained LMs by an 11.7 point gain in terms of accuracy.
Score: 17.403065663306567
License: http://creativecommons.org/licenses/by/4.0/
Abstract: This paper introduces a simple yet effective data-centric approach for the task of improving persona-conditioned dialogue agents. Prior model-centric approaches unquestioningly depend on the raw crowdsourced benchmark datasets such as Persona-Chat. In contrast, we aim to fix annotation artifacts in benchmarking, which is orthogonally applicable to any dialogue model. Specifically, we augment relevant personas to improve dialogue dataset/agent, by leveraging the primal-dual structure of the two tasks, predicting dialogue responses and personas based on each other. Experiments on Persona-Chat show that our approach outperforms pre-trained LMs by an 11.7 point gain in terms of accuracy.

Related papers

Dialogue Language Model with Large-Scale Persona Data Engineering [10.160626284195434]
PPDS is an open-domain persona dialogue system that employs extensive generative pre-training on a persona dialogue dataset to enhance persona consistency. We present a persona extraction model designed to autonomously and precisely generate vast persona dialogue datasets. We also unveil a pioneering persona augmentation technique to address the invalid persona bias inherent in the constructed dataset.
arXiv Detail & Related papers (2024-12-12T07:49:06Z)
PersonalityChat: Conversation Distillation for Personalized Dialog Modeling with Facts and Traits [5.447308344436046]
PersonalityChat is a synthetic conversational dataset based upon the popular PersonaChat dataset. We show that the personality trait labels can be used for trait-based personalization of generative dialogue models.
arXiv Detail & Related papers (2024-01-14T20:35:33Z)
Towards Robust Personalized Dialogue Generation via Order-Insensitive Representation Regularization [20.722098595079945]
We propose a model-agnostic framework, ORder Insensitive Generation (ORIG), to mitigate the order sensitivity problem. Experiments on the Persona-Chat dataset justify the effectiveness and superiority of our method.
arXiv Detail & Related papers (2023-05-22T07:24:29Z)
Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding [103.94325597273316]
We present a novel approach that iterates on augmentation quality by applying weakly-supervised filters. We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue. For DailyDialog specifically, using 10% of the ground truth data we outperform the current state-of-the-art model which uses 100% of the data.
arXiv Detail & Related papers (2022-10-25T17:01:30Z)
DialogZoo: Large-Scale Dialog-Oriented Task Learning [52.18193690394549]
We aim to build a unified foundation model which can solve massive diverse dialogue tasks. To achieve this goal, we first collect a large-scale well-labeled dialogue dataset from 73 publicly available datasets.
arXiv Detail & Related papers (2022-05-25T11:17:16Z)
KETOD: Knowledge-Enriched Task-Oriented Dialogue [77.59814785157877]
Existing studies in dialogue system research mostly treat task-oriented dialogue and chit-chat as separate domains. We investigate how task-oriented dialogue and knowledge-grounded chit-chat can be effectively integrated into a single model.
arXiv Detail & Related papers (2022-05-11T16:01:03Z)
A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation [107.82729587882397]
It is expensive to scale up current persona-based dialogue datasets. Each data sample in this task is more complex to learn with than conventional dialogue data. We propose a data manipulation method, which is model-agnostic to be packed with any persona-based dialogue generation model.
arXiv Detail & Related papers (2022-04-21T03:49:54Z)
Learning to Predict Persona Information for Dialogue Personalization without Explicit Persona Description [10.17868476063421]
We propose a novel approach that learns to predict persona information based on the dialogue history to personalize the dialogue agent. Experimental results on the PersonaChat dataset show that the proposed method can improve the consistency of generated responses. A trained persona prediction model can be successfully transferred to other datasets and help generate more relevant responses.
arXiv Detail & Related papers (2021-11-30T03:19:24Z)
Partner Matters! An Empirical Study on Fusing Personas for Personalized Response Selection in Retrieval-Based Chatbots [51.091235903442715]
This paper makes an attempt to explore the impact of utilizing personas that describe either self or partner speakers on the task of response selection. Four persona fusion strategies are designed, which assume personas interact with contexts or responses in different ways. Empirical studies on the Persona-Chat dataset show that the partner personas can improve the accuracy of response selection.
arXiv Detail & Related papers (2021-05-19T10:32:30Z)
You Impress Me: Dialogue Generation via Mutual Persona Perception [62.89449096369027]
The research in cognitive science suggests that understanding is an essential signal for a high-quality chit-chat conversation. Motivated by this, we propose P² Bot, a transmitter-receiver based framework with the aim of explicitly modeling understanding.
arXiv Detail & Related papers (2020-04-11T12:51:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.