MCP: Self-supervised Pre-training for Personalized Chatbots with
Multi-level Contrastive Sampling
- URL: http://arxiv.org/abs/2210.08753v2
- Date: Wed, 19 Oct 2022 15:34:38 GMT
- Title: MCP: Self-supervised Pre-training for Personalized Chatbots with
Multi-level Contrastive Sampling
- Authors: Zhaoheng Huang, Zhicheng Dou, Yutao Zhu and Zhengyi Ma
- Abstract summary: We propose a self-supervised learning framework for capturing better representations from users' dialogue history for personalized chatbots.
Specifically, we apply contrastive sampling methods to leverage the supervision signals hidden in the user's dialogue history.
Experimental results on two real-world datasets show that our proposed model MCP significantly outperforms existing methods.
- Score: 18.40883902610959
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Personalized chatbots focus on endowing the chatbots with a consistent
personality to behave like real users and further act as personal assistants.
Previous studies have explored generating implicit user profiles from the
user's dialogue history for building personalized chatbots. However, these
studies train the entire model with only the response generation loss, making
them prone to data sparsity. Moreover, they overemphasize the quality of the
final generated response while ignoring the correlations among the utterances
in the user's dialogue history, leading to coarse data representations and
degraded performance. To tackle these problems, we
propose a self-supervised learning framework MCP for capturing better
representations from users' dialogue history for personalized chatbots.
Specifically, we apply contrastive sampling methods to leverage the supervision
signals hidden in the user's dialogue history and to generate pre-training
samples that enhance the model. We design three pre-training tasks based on three
types of contrastive pairs from user dialogue history, namely response pairs,
sequence augmentation pairs, and user pairs. We pre-train the utterance encoder
and the history encoder with these contrastive objectives and use the
pre-trained encoders to generate user profiles during personalized response
generation. Experimental results on two real-world datasets show that our
proposed model MCP significantly outperforms existing methods.
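The contrastive pre-training idea above can be illustrated with a standard InfoNCE-style objective. The sketch below is a minimal toy of our own (function names and vectors are invented, not taken from the paper): for each anchor representation, a positive (e.g. another response by the same user) is pulled close while sampled negatives are pushed away.

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """InfoNCE-style contrastive loss for a single anchor: high cosine
    similarity to the positive and low similarity to the negatives
    gives a small loss."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # The positive pair sits at index 0; negatives follow.
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0]))

rng = np.random.default_rng(0)
user_repr = rng.normal(size=8)
negatives = [rng.normal(size=8) for _ in range(4)]

# A near-duplicate positive (same user) yields a much smaller loss
# than an unrelated vector used as the "positive".
loss_close = info_nce(user_repr, user_repr + 0.01 * rng.normal(size=8), negatives)
loss_far = info_nce(user_repr, rng.normal(size=8), negatives)
```

In the paper's framework, each of the three pair types (response pairs, sequence augmentation pairs, and user pairs) would supply its own positives and negatives to a loss of this general shape.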
Related papers
- User-Specific Dialogue Generation with User Profile-Aware Pre-Training Model and Parameter-Efficient Fine-Tuning [2.2859366462875794]
User-specific dialogue aims to reproduce real-user dialogue beyond persona-based dialogue.
Fine-tuning using the target user's dialogue history is an efficient learning method for a user-specific model.
We propose a learning method for user-specific models by combining parameter-efficient fine-tuning with a pre-trained dialogue model.
arXiv Detail & Related papers (2024-09-02T01:30:40Z) - RECAP: Retrieval-Enhanced Context-Aware Prefix Encoder for Personalized
Dialogue Response Generation [30.245143345565758]
We propose a new retrieval-enhanced approach for personalized response generation.
We design a hierarchical transformer retriever trained on dialogue-domain data to perform personalized retrieval, and a context-aware prefix encoder that fuses the retrieved information into the decoder more effectively.
We quantitatively evaluate our model with a suite of human and automatic metrics and find it superior to state-of-the-art baselines on English Reddit conversations.
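As a rough, hypothetical sketch of the retrieval step (RECAP's actual retriever is a trained hierarchical transformer; the function and vectors below are our own illustration), ranking dialogue-history turns by embedding similarity might look like:

```python
import numpy as np

def retrieve_top_k(query_vec, history_vecs, k=2):
    """Rank encoded dialogue-history turns by cosine similarity to the
    current context embedding and return the indices of the top-k turns."""
    h = np.asarray(history_vecs, dtype=float)
    q = np.asarray(query_vec, dtype=float)
    # Cosine similarity of each history turn to the query context.
    sims = h @ q / (np.linalg.norm(h, axis=1) * np.linalg.norm(q) + 1e-12)
    return [int(i) for i in np.argsort(-sims)[:k]]
```

In RECAP, the retrieved turns are then fused into the decoder by the context-aware prefix encoder rather than used directly.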
arXiv Detail & Related papers (2023-06-12T16:10:21Z) - Pre-training Multi-party Dialogue Models with Latent Discourse Inference [85.9683181507206]
We pre-train a model that understands the discourse structure of multi-party dialogues, namely, to whom each utterance is replying.
To fully utilize the unlabeled data, we propose to treat the discourse structures as latent variables, then jointly infer them and pre-train the discourse-aware model.
arXiv Detail & Related papers (2023-05-24T14:06:27Z) - A Model-Agnostic Data Manipulation Method for Persona-based Dialogue
Generation [107.82729587882397]
It is expensive to scale up current persona-based dialogue datasets.
Each data sample in this task is harder to learn from than conventional dialogue data.
We propose a data manipulation method that is model-agnostic and can be combined with any persona-based dialogue generation model.
arXiv Detail & Related papers (2022-04-21T03:49:54Z) - Less is More: Learning to Refine Dialogue History for Personalized
Dialogue Generation [57.73547958927826]
We propose to refine the user's dialogue history at scale, allowing the model to handle more dialogue history and obtain more accurate persona information.
Specifically, we design the MSP model, which consists of three personal information refiners and a personalized response generator.
arXiv Detail & Related papers (2022-04-18T02:02:56Z) - One Chatbot Per Person: Creating Personalized Chatbots based on Implicit
User Profiles [31.432585994256375]
Existing personalized approaches have tried to incorporate several text descriptions as explicit user profiles.
We train a personalized language model to construct a general user profile from the user's historical responses.
We design a personalized decoder that fuses two decoding strategies: generating a word from the generic vocabulary and copying a word from the user's personalized vocabulary.
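A hedged sketch of how such a fusion can work, in the spirit of pointer-generator networks (the gate value and toy distributions below are invented for illustration, not taken from the paper):

```python
import numpy as np

def fuse_distributions(vocab_dist, copy_dist, p_gen):
    """Mix a generic-vocabulary distribution with a copy distribution over
    the user's personalized vocabulary via a generation gate p_gen in [0, 1]."""
    return p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

vocab_dist = np.array([0.7, 0.2, 0.1, 0.0])  # softmax over the generic vocabulary
copy_dist = np.array([0.0, 0.1, 0.0, 0.9])   # mass on the user's own frequent words
mixed = fuse_distributions(vocab_dist, copy_dist, p_gen=0.6)  # still sums to 1
```

Because both inputs are probability distributions and the gate is a convex weight, the fused output remains a valid distribution over the combined vocabulary.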
arXiv Detail & Related papers (2021-08-20T20:33:12Z) - CloneBot: Personalized Dialogue-Response Predictions [0.0]
The project task was to create a model that, given a speaker ID, chat history, and an utterance query, can predict the response utterance in a conversation.
The model is personalized for each speaker. This task can be a useful tool for building speech bots that talk in a human-like manner in a live conversation.
arXiv Detail & Related papers (2021-03-31T01:15:37Z) - Dialogue History Matters! Personalized Response Selection in Multi-turn
Retrieval-based Chatbots [62.295373408415365]
We propose a personalized hybrid matching network (PHMN) for context-response matching.
Our contributions are two-fold: 1) our model extracts personalized wording behaviors from user-specific dialogue history as extra matching information.
We evaluate our model on two large datasets with user identification, i.e., the personalized Ubuntu dialogue corpus (P-Ubuntu) and the personalized Weibo dataset (P-Weibo).
arXiv Detail & Related papers (2021-03-17T09:42:11Z) - Exploiting Unsupervised Data for Emotion Recognition in Conversations [76.01690906995286]
Emotion Recognition in Conversations (ERC) aims to predict the emotional state of speakers in conversations.
The available supervised data for the ERC task is limited.
We propose a novel approach to leverage unsupervised conversation data.
arXiv Detail & Related papers (2020-10-02T13:28:47Z) - Pchatbot: A Large-Scale Dataset for Personalized Chatbot [49.16746174238548]
We introduce Pchatbot, a large-scale dialogue dataset that contains two subsets collected from Weibo and Judicial forums respectively.
To adapt the raw data to dialogue systems, we carefully normalize it via processes such as anonymization.
The scale of Pchatbot is significantly larger than that of existing Chinese datasets, which may benefit data-driven models.
arXiv Detail & Related papers (2020-09-28T12:49:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.