You Don't Know My Favorite Color: Preventing Dialogue Representations
from Revealing Speakers' Private Personas
- URL: http://arxiv.org/abs/2205.10228v1
- Date: Tue, 26 Apr 2022 09:36:18 GMT
- Title: You Don't Know My Favorite Color: Preventing Dialogue Representations
from Revealing Speakers' Private Personas
- Authors: Haoran Li, Yangqiu Song and Lixin Fan
- Abstract summary: We show that speakers' personas can be inferred through a simple neural network with high accuracy.
We conduct extensive experiments to demonstrate that our proposed defense objectives can greatly reduce the attack accuracy from 37.6% to 0.5%.
- Score: 44.82330540456883
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Social chatbots, also known as chit-chat chatbots, evolve rapidly with large
pretrained language models. Despite the huge progress, privacy concerns have
arisen recently: training data of large language models can be extracted via
model inversion attacks. On the other hand, the datasets used for training
chatbots contain many private conversations between two individuals. In this
work, we further investigate the privacy leakage of the hidden states of
chatbots trained by language modeling which has not been well studied yet. We
show that speakers' personas can be inferred through a simple neural network
with high accuracy. To this end, we propose effective defense objectives to
protect persona leakage from hidden states. We conduct extensive experiments to
demonstrate that our proposed defense objectives can greatly reduce the attack
accuracy from 37.6% to 0.5%. Meanwhile, the proposed objectives preserve
language models' powerful generation ability.
Related papers
- Remote Timing Attacks on Efficient Language Model Inference [63.79839291641793]
We show it is possible to exploit timing differences to mount a timing attack.
We show how it is possible to learn the topic of a user's conversation with 90%+ precision.
An adversary can leverage a boosting attack to recover PII placed in messages for open source systems.
arXiv Detail & Related papers (2024-10-22T16:51:36Z) - Locally Differentially Private Document Generation Using Zero Shot
Prompting [61.20953109732442]
We propose a locally differentially private mechanism called DP-Prompt to counter author de-anonymization attacks.
When DP-Prompt is used with a powerful language model like ChatGPT (gpt-3.5), we observe a notable reduction in the success rate of de-anonymization attacks.
arXiv Detail & Related papers (2023-10-24T18:25:13Z) - FedBot: Enhancing Privacy in Chatbots with Federated Learning [0.0]
Federated Learning (FL) aims to protect data privacy through distributed learning methods that keep the data in its location.
The POC combines Deep Bidirectional Transformer models and federated learning algorithms to protect customer data privacy during collaborative model training.
The system is specifically designed to improve its performance and accuracy over time by leveraging its ability to learn from previous interactions.
arXiv Detail & Related papers (2023-04-04T23:13:52Z) - Rewarding Chatbots for Real-World Engagement with Millions of Users [1.2583983802175422]
This work investigates the development of social chatbots that prioritize user engagement to enhance retention.
The proposed approach uses automatic pseudo-labels collected from user interactions to train a reward model that can be used to reject low-scoring sample responses.
A/B testing on groups of 10,000 new dailychat users on the Chai Research platform shows that this approach increases the MCL by up to 70%.
Future work aims to use the reward model to realise a data fly-wheel, where the latest user conversations can be used to alternately fine-tune the language model and the reward model.
arXiv Detail & Related papers (2023-03-10T18:53:52Z) - Training Conversational Agents with Generative Conversational Networks [74.9941330874663]
We use Generative Conversational Networks to automatically generate data and train social conversational agents.
We evaluate our approach on TopicalChat with automatic metrics and human evaluators, showing that with 10% of seed data it performs close to the baseline that uses 100% of the data.
arXiv Detail & Related papers (2021-10-15T21:46:39Z) - Sm{\aa}prat: DialoGPT for Natural Language Generation of Swedish
Dialogue by Transfer Learning [1.6111818380407035]
State-of-the-art models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English.
This work investigates, by an empirical study, the potential for transfer learning of such models to Swedish language.
arXiv Detail & Related papers (2021-10-12T18:46:43Z) - Learning Language-Conditioned Robot Behavior from Offline Data and
Crowd-Sourced Annotation [80.29069988090912]
We study the problem of learning a range of vision-based manipulation tasks from a large offline dataset of robot interaction.
We propose to leverage offline robot datasets with crowd-sourced natural language labels.
We find that our approach outperforms both goal-image specifications and language conditioned imitation techniques by more than 25%.
arXiv Detail & Related papers (2021-09-02T17:42:13Z) - A Multi-input Multi-output Transformer-based Hybrid Neural Network for
Multi-class Privacy Disclosure Detection [3.04585143845864]
In this paper, we propose a multi-input, multi-output hybrid neural network which utilizes transfer-learning, linguistics, and metadata to learn the hidden patterns.
We trained and evaluated our model on a human-annotated ground truth dataset, containing a total of 5,400 tweets.
arXiv Detail & Related papers (2021-08-19T03:58:49Z) - CloneBot: Personalized Dialogue-Response Predictions [0.0]
The project task was to create a model that, given a speaker ID, chat history, and an utterance query, can predict the response utterance in a conversation.
The model is personalized for each speaker. This task can be a useful tool for building speech bots that talk in a human-like manner in a live conversation.
arXiv Detail & Related papers (2021-03-31T01:15:37Z) - Pchatbot: A Large-Scale Dataset for Personalized Chatbot [49.16746174238548]
We introduce Pchatbot, a large-scale dialogue dataset that contains two subsets collected from Weibo and Judicial forums respectively.
To adapt the raw dataset to dialogue systems, we elaborately normalize the raw dataset via processes such as anonymization.
The scale of Pchatbot is significantly larger than existing Chinese datasets, which might benefit the data-driven models.
arXiv Detail & Related papers (2020-09-28T12:49:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.