FLAME: Differentially Private Federated Learning in the Shuffle Model
- URL: http://arxiv.org/abs/2009.08063v4
- Date: Sat, 20 Mar 2021 09:05:39 GMT
- Title: FLAME: Differentially Private Federated Learning in the Shuffle Model
- Authors: Ruixuan Liu, Yang Cao, Hong Chen, Ruoyang Guo, Masatoshi Yoshikawa
- Abstract summary: Federated Learning (FL) is a promising machine learning paradigm that enables the analyzer to train a model without collecting users' raw data.
We propose an FL framework in the shuffle model and a simple protocol (SS-Simple) extended from existing work.
We find that SS-Simple only provides an insufficient privacy amplification effect in FL since the dimension of the model parameter is quite large.
For boosting the utility when the model size is greater than the user population, we propose an advanced protocol (SS-Topk) with gradient sparsification techniques.
- Score: 25.244726600260748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is a promising machine learning paradigm that enables
the analyzer to train a model without collecting users' raw data. To ensure
users' privacy, differentially private federated learning has been intensively
studied. The existing works are mainly based on the \textit{curator model} or
\textit{local model} of differential privacy. However, both of them have pros
and cons. The curator model allows greater accuracy but requires a trusted
analyzer. In the local model where users randomize local data before sending
them to the analyzer, a trusted analyzer is not required but the accuracy is
limited. In this work, by leveraging the \textit{privacy amplification} effect
in the recently proposed shuffle model of differential privacy, we achieve the
best of both worlds, i.e., accuracy in the curator model and strong privacy
without relying on any trusted party. We first propose an FL framework in the
shuffle model and a simple protocol (SS-Simple) extended from existing work. We
find that SS-Simple only provides an insufficient privacy amplification effect
in FL since the dimension of the model parameter is quite large. To solve this
challenge, we propose an enhanced protocol (SS-Double) to increase the privacy
amplification effect by subsampling. Furthermore, for boosting the utility when
the model size is greater than the user population, we propose an advanced
protocol (SS-Topk) with gradient sparsification techniques. We also provide
theoretical analysis and numerical evaluations of the privacy amplification of
the proposed protocols. Experiments on a real-world dataset validate that SS-Topk
improves the testing accuracy by 60.7\% over local-model-based FL.
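As a rough illustration of the pipeline described above, the sketch below shows what one client-side step of an SS-Topk-style protocol could look like: top-k gradient sparsification followed by a per-coordinate one-bit local randomizer, whose (index, bit) reports a shuffler would then anonymize before the server aggregates them. The one-bit encoding, the clipping bound, and all parameter names here are assumptions made for illustration, not the exact mechanism analyzed in the paper.

```python
import numpy as np


def topk_sparsify(grad: np.ndarray, k: int) -> tuple[np.ndarray, np.ndarray]:
    """Keep only the k largest-magnitude coordinates (their indices and values)."""
    idx = np.argsort(np.abs(grad))[-k:]
    return idx, grad[idx]


def one_bit_report(value: float, clip: float, eps: float, rng: np.random.Generator) -> int:
    """Illustrative eps-LDP one-bit report of a clipped scalar (randomized response).

    The clipped value is encoded as a Bernoulli bit with mean 0.5 + v/(2*clip);
    the bit is then kept with probability e^eps / (e^eps + 1) and flipped otherwise,
    so any two inputs yield output distributions within a factor of e^eps.
    """
    v = float(np.clip(value, -clip, clip))
    p_one = 0.5 + v / (2.0 * clip)
    bit = rng.random() < p_one
    keep = rng.random() < np.e ** eps / (np.e ** eps + 1.0)
    return int(bit) if keep else int(not bit)


def client_report(grad: np.ndarray, k: int, clip: float, eps: float,
                  rng: np.random.Generator) -> list[tuple[int, int]]:
    """What a single client would hand to the shuffler: anonymizable (index, bit) pairs."""
    idx, vals = topk_sparsify(grad, k)
    return [(int(i), one_bit_report(v, clip, eps, rng)) for i, v in zip(idx, vals)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grad = rng.normal(size=1000)                      # stand-in for a local gradient
    reports = client_report(grad, k=10, clip=1.0, eps=1.0, rng=rng)
    # A shuffler would permute the reports of all clients before the server
    # debiases and averages them; that anonymity is what amplifies privacy.
    print(reports)
```

Intuitively, shuffling hides which client produced which report, which is the source of the privacy amplification the paper analyzes; reporting fewer coordinates per client (via subsampling or top-k selection) is one way to see why SS-Double and SS-Topk recover utility when the model dimension is large.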
Related papers
- DMM: Distributed Matrix Mechanism for Differentially-Private Federated Learning using Packed Secret Sharing [51.336015600778396]
Federated Learning (FL) has gained lots of traction recently, both in industry and academia.
In FL, a machine learning model is trained using data from various end-users arranged in committees across several rounds.
Since such data can often be sensitive, a primary challenge in FL is providing privacy while still retaining utility of the model.
arXiv Detail & Related papers (2024-10-21T16:25:14Z) - PriRoAgg: Achieving Robust Model Aggregation with Minimum Privacy Leakage for Federated Learning [49.916365792036636]
Federated learning (FL) has recently gained significant momentum due to its potential to leverage large-scale distributed user data.
The transmitted model updates can potentially leak sensitive user information, and the lack of central control of the local training process leaves the global model susceptible to malicious manipulations on model updates.
We develop a general framework PriRoAgg, utilizing Lagrange coded computing and distributed zero-knowledge proof, to execute a wide range of robust aggregation algorithms while satisfying aggregated privacy.
arXiv Detail & Related papers (2024-07-12T03:18:08Z) - Beyond Statistical Estimation: Differentially Private Individual Computation via Shuffling [21.031062710893867]
This paper introduces a novel paradigm termed Private Individual Computation (PIC).
PIC enables personalized outputs while preserving privacy, and enjoys privacy amplification through shuffling.
We present an optimal randomizer, the Minkowski Response, designed for the PIC model to enhance utility.
arXiv Detail & Related papers (2024-06-26T07:53:48Z) - PRIOR: Personalized Prior for Reactivating the Information Overlooked in
Federated Learning [16.344719695572586]
We propose a novel scheme to inject personalized prior knowledge into a global model in each client.
At the heart of our proposed approach is a framework, the PFL with Bregman Divergence (pFedBreD).
Our method reaches state-of-the-art performance on 5 datasets and outperforms other methods by up to 3.5% across 8 benchmarks.
arXiv Detail & Related papers (2023-10-13T15:21:25Z) - Can Public Large Language Models Help Private Cross-device Federated Learning? [58.05449579773249]
We study (differentially) private federated learning (FL) of language models.
Public data has been used to improve privacy-utility trade-offs for both large and small language models.
We propose a novel distribution matching algorithm with theoretical grounding to sample public data close to private data distribution.
arXiv Detail & Related papers (2023-05-20T07:55:58Z) - Just Fine-tune Twice: Selective Differential Privacy for Large Language
Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z) - Don't Generate Me: Training Differentially Private Generative Models
with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z) - Renyi Differential Privacy of the Subsampled Shuffle Model in
Distributed Learning [7.197592390105457]
We study privacy in a distributed learning framework, where clients collaboratively build a learning model iteratively through interactions with a server from whom we need privacy.
Motivated by optimization and the federated learning (FL) paradigm, we focus on the case where a small fraction of data samples are randomly sub-sampled in each round.
To obtain even stronger local privacy guarantees, we study this in the shuffle privacy model, where each client randomizes its response using a local differentially private (LDP) mechanism.
arXiv Detail & Related papers (2021-07-19T11:43:24Z) - Federated Learning with Sparsification-Amplified Privacy and Adaptive
Optimization [27.243322019117144]
Federated learning (FL) enables distributed agents to collaboratively learn a centralized model without sharing their raw data with each other.
We propose a new FL framework with sparsification-amplified privacy.
Our approach integrates random sparsification with gradient perturbation on each agent to amplify the privacy guarantee.
arXiv Detail & Related papers (2020-08-01T20:22:57Z) - Privacy Amplification via Random Check-Ins [38.72327434015975]
Differentially Private Gradient Descent (DP-SGD) forms a fundamental building block in many applications for learning over sensitive data.
In this paper, we focus on conducting iterative methods like DP-SGD in the setting of federated learning (FL) wherein the data is distributed among many devices (clients).
Our main contribution is the \emph{random check-in} distributed protocol, which crucially relies only on randomized participation decisions made locally and independently by each client (see the sketch after this list).
arXiv Detail & Related papers (2020-07-13T18:14:09Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)