Related papers: MCP: Self-supervised Pre-training for Personalized Chatbots with Multi-level Contrastive Sampling

MCP: Self-supervised Pre-training for Personalized Chatbots with Multi-level Contrastive Sampling

URL: http://arxiv.org/abs/2210.08753v2
Date: Wed, 19 Oct 2022 15:34:38 GMT
Title: MCP: Self-supervised Pre-training for Personalized Chatbots with Multi-level Contrastive Sampling
Authors: Zhaoheng Huang, Zhicheng Dou, Yutao Zhu and Zhengyi Ma
Abstract summary: We propose a self-supervised learning framework for capturing better representations from users' dialogue history for personalized chatbots. Specifically, we apply contrastive sampling methods to leverage the supervised signals hidden in user dialog history. Experimental results on two real-world datasets show a significant improvement in our proposed model MCP compared with the existing methods.
Score: 18.40883902610959
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Personalized chatbots focus on endowing the chatbots with a consistent personality to behave like real users and further act as personal assistants. Previous studies have explored generating implicit user profiles from the user's dialogue history for building personalized chatbots. However, these studies only use the response generation loss to train the entire model, thus it is prone to suffer from the problem of data sparsity. Besides, they overemphasize the final generated response's quality while ignoring the correlations and fusions between the user's dialogue history, leading to rough data representations and performance degradation. To tackle these problems, we propose a self-supervised learning framework MCP for capturing better representations from users' dialogue history for personalized chatbots. Specifically, we apply contrastive sampling methods to leverage the supervised signals hidden in user dialog history, and generate the pre-training samples for enhancing the model. We design three pre-training tasks based on three types of contrastive pairs from user dialogue history, namely response pairs, sequence augmentation pairs, and user pairs. We pre-train the utterance encoder and the history encoder towards the contrastive objectives and use these pre-trained encoders for generating user profiles while personalized response generation. Experimental results on two real-world datasets show a significant improvement in our proposed model MCP compared with the existing methods.

Related papers

REALTALK: A 21-Day Real-World Dataset for Long-Term Conversation [51.97224538045096]
We introduce REALTALK, a 21-day corpus of authentic messaging app dialogues. We compare EI attributes and persona consistency to understand the challenges posed by real-world dialogues. Our findings reveal that models struggle to simulate a user solely from dialogue history, while fine-tuning on specific user chats improves persona emulation.
arXiv Detail & Related papers (2025-02-18T20:29:01Z)
Doing Personal LAPS: LLM-Augmented Dialogue Construction for Personalized Multi-Session Conversational Search [9.243535345193711]
Our method uses large language models to guide a single human worker in generating personalized dialogues. LAPS can collect large-scale, human-written, multi-session, and multi-domain conversations. Our results show that responses generated explicitly using extracted preferences better match user's actual preferences.
arXiv Detail & Related papers (2024-05-06T13:53:03Z)
RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized Dialogue Response Generation [30.245143345565758]
We propose a new retrieval-enhanced approach for personalized response generation. We design a hierarchical transformer retriever trained on dialogue domain data to perform personalized retrieval and a context-aware prefix encoder that fuses the retrieved information to the decoder more effectively. We quantitatively evaluate our model's performance under a suite of human and automatic metrics and find it to be superior compared to state-of-the-art baselines on English Reddit conversations.
arXiv Detail & Related papers (2023-06-12T16:10:21Z)
Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying. To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z)
A Model-Agnostic Data Manipulation Method for Persona-based Dialogue Generation [107.82729587882397]
It is expensive to scale up current persona-based dialogue datasets. Each data sample in this task is more complex to learn with than conventional dialogue data. We propose a data manipulation method, which is model-agnostic to be packed with any persona-based dialogue generation model.
arXiv Detail & Related papers (2022-04-21T03:49:54Z)
Less is More: Learning to Refine Dialogue History for Personalized Dialogue Generation [57.73547958927826]
We propose to refine the user dialogue history on a large scale, based on which we can handle more dialogue history and obtain more accurate persona information. Specifically, we design an MSP model which consists of three personal information refiners and a personalized response generator.
arXiv Detail & Related papers (2022-04-18T02:02:56Z)
One Chatbot Per Person: Creating Personalized Chatbots based on Implicit User Profiles [31.432585994256375]
Existing personalized approaches tried to incorporate several text descriptions as explicit user profiles. We train a personalized language model to construct a general user profile from the user's historical responses. We design a personalized decoder to fuse two decoding strategies, including generating a word from the generic vocabulary and copying one word from the user's personalized vocabulary.
arXiv Detail & Related papers (2021-08-20T20:33:12Z)
CloneBot: Personalized Dialogue-Response Predictions [0.0]
The project task was to create a model that, given a speaker ID, chat history, and an utterance query, can predict the response utterance in a conversation. The model is personalized for each speaker. This task can be a useful tool for building speech bots that talk in a human-like manner in a live conversation.
arXiv Detail & Related papers (2021-03-31T01:15:37Z)
Dialogue History Matters! Personalized Response Selectionin Multi-turn Retrieval-based Chatbots [62.295373408415365]
We propose a personalized hybrid matching network (PHMN) for context-response matching. Our contributions are two-fold: 1) our model extracts personalized wording behaviors from user-specific dialogue history as extra matching information. We evaluate our model on two large datasets with user identification, i.e., personalized dialogue Corpus Ubuntu (P- Ubuntu) and personalized Weibo dataset (P-Weibo)
arXiv Detail & Related papers (2021-03-17T09:42:11Z)
Exploiting Unsupervised Data for Emotion Recognition in Conversations [76.01690906995286]
Emotion Recognition in Conversations (ERC) aims to predict the emotional state of speakers in conversations. The available supervised data for the ERC task is limited. We propose a novel approach to leverage unsupervised conversation data.
arXiv Detail & Related papers (2020-10-02T13:28:47Z)
Pchatbot: A Large-Scale Dataset for Personalized Chatbot [49.16746174238548]
We introduce Pchatbot, a large-scale dialogue dataset that contains two subsets collected from Weibo and Judicial forums respectively. To adapt the raw dataset to dialogue systems, we elaborately normalize the raw dataset via processes such as anonymization. The scale of Pchatbot is significantly larger than existing Chinese datasets, which might benefit the data-driven models.
arXiv Detail & Related papers (2020-09-28T12:49:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.