Decepticons: Corrupted Transformers Breach Privacy in Federated Learning for Language Models
- URL: http://arxiv.org/abs/2201.12675v2
- Date: Wed, 31 May 2023 16:05:45 GMT
- Title: Decepticons: Corrupted Transformers Breach Privacy in Federated Learning for Language Models
- Authors: Liam Fowl, Jonas Geiping, Steven Reich, Yuxin Wen, Wojtek Czaja, Micah Goldblum, Tom Goldstein
- Abstract summary: We propose a novel attack that reveals private user text by deploying malicious parameter vectors.
Unlike previous attacks on FL, the attack exploits characteristics of both the Transformer architecture and the token embedding.
- Score: 58.631918656336005
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A central tenet of Federated learning (FL), which trains models without
centralizing user data, is privacy. However, previous work has shown that the
gradient updates used in FL can leak user information. While most
industrial uses of FL involve text applications (e.g., keystroke prediction),
nearly all attacks on FL privacy have focused on simple image classifiers. We
propose a novel attack that reveals private user text by deploying malicious
parameter vectors, and which succeeds even with mini-batches, multiple users,
and long sequences. Unlike previous attacks on FL, the attack exploits
characteristics of both the Transformer architecture and the token embedding,
separately extracting tokens and positional embeddings to retrieve
high-fidelity text. This work suggests that FL on text, which has historically
been resistant to privacy attacks, is far more vulnerable than previously
thought.
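The core leakage channel is easy to demonstrate in miniature: in a Transformer language model, the token-embedding matrix receives gradient only at the rows of tokens that actually occurred in the user's batch, so even an honest gradient update reveals the user's token set. The sketch below illustrates just this principle with assumed toy dimensions; it does not implement the paper's malicious-parameter attack, which additionally recovers token positions and ordering.

```python
# Minimal illustration (toy sizes assumed, not the paper's attack):
# the embedding gradient is nonzero only at rows of tokens that
# appeared in the private batch.
import torch
import torch.nn as nn

vocab_size, d_model = 100, 16
embed = nn.Embedding(vocab_size, d_model)
head = nn.Linear(d_model, vocab_size)

tokens = torch.tensor([[5, 42, 7, 42]])          # a private user sequence
logits = head(embed(tokens))
loss = nn.functional.cross_entropy(logits.view(-1, vocab_size),
                                   tokens.view(-1))
loss.backward()

# Rows of the embedding gradient with nonzero norm expose the token set.
leaked = (embed.weight.grad.norm(dim=1) > 0).nonzero().flatten()
print(sorted(leaked.tolist()))                   # -> [5, 7, 42]
```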
Related papers
- DMM: Distributed Matrix Mechanism for Differentially-Private Federated Learning using Packed Secret Sharing [51.336015600778396]
Federated Learning (FL) has recently gained considerable traction in both industry and academia.
In FL, a machine learning model is trained using data from various end-users arranged in committees across several rounds.
Since such data can often be sensitive, a primary challenge in FL is providing privacy while still retaining utility of the model.
arXiv Detail & Related papers (2024-10-21T16:25:14Z)
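The secret-sharing primitive that DMM builds on can be illustrated in its simplest additive form. The toy sketch below splits each client's quantized update into random shares so that only the aggregate is revealed; it is an assumed illustration, not DMM's packed scheme or its distributed DP mechanism.

```python
# Toy additive secret sharing over a prime field (illustrative field
# size; plain sharing only, no packing or differential privacy).
import secrets

P = 2**61 - 1  # prime modulus for the toy field

def share(x: int, n: int) -> list[int]:
    """Split x into n additive shares that sum to x modulo P."""
    parts = [secrets.randbelow(P) for _ in range(n - 1)]
    parts.append((x - sum(parts)) % P)
    return parts

# Three clients share their quantized update values across three servers.
updates = [7, 12, 5]
all_shares = [share(u, n=3) for u in updates]

# Server i sums the i-th share from every client; combining the partial
# sums reveals only the aggregate, never an individual update.
partials = [sum(col) % P for col in zip(*all_shares)]
print(sum(partials) % P)  # -> 24
```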
- Privacy Attack in Federated Learning is Not Easy: An Experimental Study [5.065947993017158]
Federated learning (FL) is an emerging distributed machine learning paradigm proposed for privacy preservation.
Recent studies have indicated that FL cannot entirely guarantee privacy protection.
It remains uncertain whether privacy attacks on FL are effective in realistic federated environments.
arXiv Detail & Related papers (2024-09-28T10:06:34Z)
- Byzantine-Resilient Secure Aggregation for Federated Learning Without Privacy Compromises [4.242342898338019]
Federated learning (FL) shows great promise in large scale machine learning, but brings new risks in terms of privacy and security.
We propose ByITFL, a novel scheme for FL that provides resilience against Byzantine users while keeping users' data private from both the federator and the other users.
arXiv Detail & Related papers (2024-05-14T15:37:56Z)
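For context on Byzantine-resilient aggregation, the sketch below shows a generic coordinate-wise trimmed mean, a standard robust-aggregation baseline. It is purely illustrative and is not ByITFL's information-theoretic construction, whose details the abstract does not specify.

```python
# Generic coordinate-wise trimmed mean: drop the extreme values in each
# coordinate before averaging, limiting the influence of outliers.
import torch

def trimmed_mean(updates: list[torch.Tensor], trim: int = 1) -> torch.Tensor:
    stacked = torch.stack(updates)                # (n_clients, dim)
    sorted_vals, _ = stacked.sort(dim=0)
    kept = sorted_vals[trim:len(updates) - trim]  # drop extremes per coordinate
    return kept.mean(dim=0)

honest = [torch.ones(5) + 0.1 * torch.randn(5) for _ in range(4)]
byzantine = [torch.full((5,), 100.0)]             # one malicious outlier
print(trimmed_mean(honest + byzantine))           # stays near 1.0
```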
- SaFL: Sybil-aware Federated Learning with Application to Face Recognition [13.914187113334222]
Federated Learning (FL) is a machine learning paradigm for collaborative learning among clients on a joint model.
On the downside, FL raises security and privacy concerns that have just started to be studied.
This paper proposes a new defense method against poisoning attacks in FL called SaFL.
arXiv Detail & Related papers (2023-11-07T21:06:06Z)
- FLTrojan: Privacy Leakage Attacks against Federated Language Models Through Selective Weight Tampering [2.2194815687410627]
We show how a malicious client can leak the privacy-sensitive data of some other users in FL even without any cooperation from the server.
Our best-performing method improves the membership inference recall by 29% and achieves up to 71% private data reconstruction.
arXiv Detail & Related papers (2023-10-24T19:50:01Z)
- FheFL: Fully Homomorphic Encryption Friendly Privacy-Preserving Federated Learning with Byzantine Users [19.209830150036254]
The federated learning (FL) technique was developed to mitigate data privacy issues in the traditional machine learning paradigm.
Next-generation FL architectures have proposed encryption and anonymization techniques to protect model updates from the server.
This paper proposes a novel FL algorithm based on a fully homomorphic encryption (FHE) scheme.
arXiv Detail & Related papers (2023-06-08T11:20:00Z)
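How a server can aggregate encrypted updates without seeing them can be illustrated with textbook Paillier encryption, which is additively homomorphic. The toy parameters below are insecure, and Paillier is only partially homomorphic, so this stands in for, rather than reproduces, the paper's FHE scheme.

```python
# Toy textbook Paillier with tiny, insecure parameters: multiplying
# ciphertexts adds the underlying plaintexts, so a server can sum
# encrypted updates blindly.
import math
import secrets

p, q = 11, 13            # toy primes: insecure, for illustration only
n, n2 = p * q, (p * q) ** 2
g = n + 1                # standard simplified-Paillier generator
lam = (p - 1) * (q - 1)  # phi(n); a valid lambda when g = n + 1
mu = pow(lam, -1, n)

def enc(m: int) -> int:
    while True:
        r = 2 + secrets.randbelow(n - 3)
        if math.gcd(r, n) == 1:          # r must be invertible mod n
            return (pow(g, m, n2) * pow(r, n, n2)) % n2

def dec(c: int) -> int:
    return ((pow(c, lam, n2) - 1) // n) * mu % n

c_sum = (enc(5) * enc(7)) % n2  # homomorphic addition of two updates
print(dec(c_sum))               # -> 12
```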
- Federated Nearest Neighbor Machine Translation [66.8765098651988]
In this paper, we propose a novel federated nearest neighbor (FedNN) machine translation framework.
FedNN leverages one-round memorization-based interaction to share knowledge across different clients.
Experiments show that FedNN significantly reduces computational and communication costs compared with FedAvg.
arXiv Detail & Related papers (2023-02-23T18:04:07Z)
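The nearest-neighbor retrieval that kNN-MT-style frameworks such as FedNN build on can be sketched as follows: decoder hidden states are matched against a shared (key, token) datastore, and the retrieved neighbors are interpolated with the model's own distribution. The shapes and the interpolation weight below are illustrative assumptions, not FedNN's one-round protocol.

```python
# Minimal kNN-MT-style retrieval step over a shared datastore.
import torch

d, vocab, k = 8, 50, 4
keys = torch.randn(100, d)                  # shared datastore keys
values = torch.randint(0, vocab, (100,))    # target token per key

def knn_probs(hidden: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    dists = ((keys - hidden) ** 2).sum(dim=1)     # L2 distance to all keys
    nearest = dists.topk(k, largest=False)
    weights = torch.softmax(-nearest.values / temperature, dim=0)
    probs = torch.zeros(vocab)
    probs.index_add_(0, values[nearest.indices], weights)
    return probs

hidden = torch.randn(d)
model_probs = torch.softmax(torch.randn(vocab), dim=0)
lam = 0.25                                   # retrieval interpolation weight
final = lam * knn_probs(hidden) + (1 - lam) * model_probs
print(final.sum())                           # ~1.0
```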
- Do Gradient Inversion Attacks Make Federated Learning Unsafe? [70.0231254112197]
Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.
Recent works on inverting deep neural networks from model gradients have raised concerns about whether FL can prevent the leakage of training data.
In this work, we show that these attacks presented in the literature are impractical in real FL use-cases and provide a new baseline attack.
arXiv Detail & Related papers (2022-02-14T18:33:12Z)
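The classic optimization-based baseline behind such gradient-inversion studies can be sketched briefly: optimize a dummy input until its gradient matches the observed one ("deep leakage from gradients" style). The toy linear model below assumes the label is known; as the entry above notes, realistic FL settings are considerably harder.

```python
# Optimize a dummy input so its gradient matches an observed gradient.
import torch

torch.manual_seed(0)
model = torch.nn.Linear(4, 2)
x_true = torch.randn(1, 4)
y_true = torch.tensor([1])                   # label assumed known

loss = torch.nn.functional.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss, model.parameters())

x_dummy = torch.randn(1, 4, requires_grad=True)
opt = torch.optim.Adam([x_dummy], lr=0.1)
for _ in range(300):
    opt.zero_grad()
    dummy_loss = torch.nn.functional.cross_entropy(model(x_dummy), y_true)
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(),
                                      create_graph=True)
    grad_diff = sum(((g - t) ** 2).sum()
                    for g, t in zip(dummy_grads, true_grads))
    grad_diff.backward()
    opt.step()

print(x_true)
print(x_dummy.detach())  # converges toward the private input
```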
- Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification [65.33308059737506]
Federated learning (FL) has rapidly risen in popularity due to its promise of privacy and efficiency.
Previous works have exposed privacy vulnerabilities in the FL pipeline by recovering user data from gradient updates.
We introduce a new strategy that dramatically elevates existing attacks to operate on batches of arbitrarily large size.
arXiv Detail & Related papers (2022-02-01T17:26:11Z)
- Understanding Clipping for Federated Learning: Convergence and Client-Level Differential Privacy [67.4471689755097]
This paper empirically demonstrates that clipped FedAvg can perform surprisingly well even with substantial data heterogeneity.
We provide a convergence analysis of a differentially private (DP) FedAvg algorithm and highlight the relationship between clipping bias and the distribution of the clients' updates.
arXiv Detail & Related papers (2021-06-25T14:47:19Z)
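The clipped DP-FedAvg step analyzed above reduces, per round, to clipping each client update to a fixed L2 norm and adding Gaussian noise calibrated to that bound. A minimal sketch, with the clip norm and noise multiplier as illustrative assumptions rather than the paper's settings:

```python
# Client-level L2 clipping followed by Gaussian noise on the average.
import torch

def clipped_dp_fedavg(client_updates, clip_norm=1.0, noise_mult=0.5):
    clipped = []
    for u in client_updates:
        scale = min(1.0, clip_norm / (u.norm().item() + 1e-12))
        clipped.append(u * scale)            # per-client L2 clipping
    avg = torch.stack(clipped).mean(dim=0)
    # Noise scaled to the clipping bound, the per-client sensitivity
    # of the average.
    noise = torch.randn_like(avg) * (noise_mult * clip_norm / len(clipped))
    return avg + noise

updates = [torch.randn(10) for _ in range(8)]
print(clipped_dp_fedavg(updates))
```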