Privacy Amplification via Random Participation in Federated Learning
- URL: http://arxiv.org/abs/2205.01556v1
- Date: Tue, 3 May 2022 15:11:34 GMT
- Title: Privacy Amplification via Random Participation in Federated Learning
- Authors: Burak Hasircioglu and Deniz Gunduz
- Abstract summary: In a federated setting, we consider random participation of the clients in addition to subsampling their local datasets.
We show that when the size of the local datasets is small, the privacy guarantees via random participation are close to those of the centralized setting.
- Score: 3.8580784887142774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Running a randomized algorithm on a subsampled dataset instead of the entire
dataset amplifies differential privacy guarantees. In this work, in a federated
setting, we consider random participation of the clients in addition to
subsampling their local datasets. Since such random participation of the
clients creates correlation among the samples of the same client in their
subsampling, we analyze the corresponding privacy amplification via non-uniform
subsampling. We show that when the size of the local datasets is small, the
privacy guarantees via random participation are close to those of the
centralized setting, in which the entire dataset is located in a single host
and subsampled. On the other hand, when the local datasets are large, observing
the output of the algorithm may disclose the identities of the sampled clients
with high confidence. Our analysis reveals that, even in this case, privacy
guarantees via random participation outperform those via only local
subsampling.
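For intuition, the classical amplification-by-subsampling result states that an eps-DP mechanism run on a Poisson subsample with rate q satisfies roughly ln(1 + q(e^eps - 1))-DP. The sketch below uses only this uniform bound with made-up rates (it is not the paper's non-uniform analysis) to illustrate why adding random participation on top of local subsampling helps:

```python
import math

def amplified_eps(eps: float, q: float) -> float:
    """Classical amplification by Poisson subsampling: an eps-DP
    mechanism run on a q-fraction subsample is ln(1 + q(e^eps - 1))-DP."""
    return math.log(1.0 + q * (math.exp(eps) - 1.0))

eps = 1.0   # base DP guarantee of the underlying mechanism
r   = 0.10  # local subsampling rate within a participating client
p   = 0.20  # probability that a client participates in a round

# Local subsampling alone: every client participates, per-sample rate r.
print(f"local only     : eps' = {amplified_eps(eps, r):.4f}")

# Best case (small local datasets): random participation composes with
# local subsampling, so the effective per-sample rate approaches p * r,
# as with uniform subsampling of the pooled data in the centralized setting.
print(f"+ participation: eps' = {amplified_eps(eps, p * r):.4f}")
```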
Related papers
- Leveraging Randomness in Model and Data Partitioning for Privacy Amplification [8.52745154080651]
We study how inherent randomness in the training process can be leveraged for privacy amplification.
This includes (1) data partitioning, where a sample participates in only a subset of training iterations, and (2) model partitioning, where a sample updates only a subset of the model parameters.
Our results demonstrate that randomness in the training process, which is structured rather than i.i.d. and interacts with data in complex ways, can be systematically leveraged for significant privacy amplification.
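As a toy illustration of the data-partitioning pattern (hypothetical sizes; not the paper's algorithm), the sketch below assigns each sample to a fixed number of training iterations and checks the per-iteration inclusion rate that drives amplification:

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_iters, k = 1_000, 100, 10  # each sample joins k of 100 iterations

# Data partitioning: assign every sample to a random subset of iterations,
# so each iteration only ever touches a fraction ~k/n_iters of the data.
participation = np.zeros((n_samples, n_iters), dtype=bool)
for i in range(n_samples):
    participation[i, rng.choice(n_iters, size=k, replace=False)] = True

# Per-iteration inclusion rate -- structured rather than i.i.d. sampling.
print(participation.mean(axis=0)[:5])  # ~0.1 per iteration
```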
arXiv Detail & Related papers (2025-03-04T22:49:59Z)
- A collaborative ensemble construction method for federated random forest [3.245822581039027]
This paper presents a federated random forest approach that employs a novel ensemble construction method aimed at improving performance under non-IID data.
To preserve the privacy of the client's data, we confine the information stored in the leaf nodes to the majority class label identified from the samples of the client's local data that reach each node.
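A minimal sketch of the leaf-summary idea described above, using a hypothetical helper (not the paper's implementation):

```python
from collections import Counter

def leaf_summary(labels):
    """Store only the majority class of the samples reaching a leaf,
    rather than counts or raw records, limiting what the node reveals
    about any single client's data."""
    if not labels:
        return None
    return Counter(labels).most_common(1)[0][0]

print(leaf_summary(["spam", "ham", "spam", "spam"]))  # -> "spam"
```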
arXiv Detail & Related papers (2024-07-27T07:21:45Z)
- Enhanced Privacy Bound for Shuffle Model with Personalized Privacy [32.08637708405314]
The shuffle model of Differential Privacy (DP) is an enhanced privacy protocol which introduces an intermediate trusted server between local users and a central data curator.
It significantly amplifies the central DP guarantee by anonymizing and shuffling the local randomized data.
This work focuses on deriving the central privacy bound for a more practical setting where personalized local privacy is required by each user.
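A minimal sketch of the shuffle model with per-user (personalized) privacy levels, using binary randomized response as the local randomizer; the parameters are illustrative, not from the paper:

```python
import math, random

def randomized_response(bit: int, eps: float) -> int:
    """eps-LDP randomized response: report truthfully w.p. e^eps/(e^eps+1)."""
    p_true = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if random.random() < p_true else 1 - bit

# Each user picks a personal eps_i; the shuffler then discards the
# user-to-report mapping before anything reaches the curator.
bits = [1, 0, 1, 1, 0]
eps_per_user = [0.5, 1.0, 2.0, 0.5, 3.0]
reports = [randomized_response(b, e) for b, e in zip(bits, eps_per_user)]
random.shuffle(reports)  # the trusted shuffler's only job
print(reports)
```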
arXiv Detail & Related papers (2024-07-25T16:11:56Z)
- Personalized Privacy Amplification via Importance Sampling [3.0636509793595548]
In this paper, we examine the privacy properties of importance sampling, focusing on an individualized privacy analysis.
We find that, in importance sampling, privacy is well aligned with utility but at odds with sample size.
We propose two approaches for constructing sampling distributions: one that optimizes the privacy-efficiency trade-off, and one based on a utility guarantee in the form of coresets.
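To make the importance-sampling mechanics concrete, here is a generic Poisson importance-sampling (Horvitz-Thompson) estimator; the inclusion probabilities are made up, whereas the paper derives them from privacy and utility considerations:

```python
import numpy as np

rng = np.random.default_rng(1)
values = rng.exponential(size=1_000)

# Poisson importance sampling: include record i with probability p_i and
# reweight by 1/p_i, keeping the sum estimator unbiased. Records with
# smaller p_i are sampled (hence exposed) less often -- the
# individualized-privacy angle studied in the paper.
p = np.clip(values / values.max(), 0.05, 1.0)   # hypothetical inclusion probs
included = rng.random(len(values)) < p
estimate = np.sum(values[included] / p[included])
print(estimate, values.sum())  # estimate is unbiased for the true sum
```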
arXiv Detail & Related papers (2023-07-05T17:09:10Z)
- FedSampling: A Better Sampling Strategy for Federated Learning [81.85411484302952]
Federated learning (FL) is an important technique for learning models from decentralized data in a privacy-preserving way.
Existing FL methods usually uniformly sample clients for local model learning in each round.
We propose a novel uniform data sampling strategy for federated learning (FedSampling).
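A rough sketch of the uniform-sampling idea, simplified: the real protocol estimates the total data size in a privacy-preserving way rather than summing it directly:

```python
import random

def client_sample(local_data, batch_target, total_size_est):
    """Each client keeps a record with probability batch_target / total_size_est,
    so the union over clients approximates a uniform sample of size
    ~batch_target from the pooled data, however unevenly it is split."""
    rate = min(1.0, batch_target / total_size_est)
    return [x for x in local_data if random.random() < rate]

clients = [list(range(10)), list(range(200)), list(range(50))]
total_est = sum(len(c) for c in clients)  # the paper estimates this privately
sample = [x for c in clients for x in client_sample(c, 64, total_est)]
print(len(sample))  # ~64 in expectation
```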
arXiv Detail & Related papers (2023-06-25T13:38:51Z)
- Efficient Distribution Similarity Identification in Clustered Federated Learning via Principal Angles Between Client Data Subspaces [59.33965805898736]
Clustered federated learning has been shown to produce promising results by grouping clients into clusters.
Existing FL algorithms essentially try to group together clients with similar data distributions.
However, prior FL algorithms attempt to identify these similarities only indirectly during training.
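The similarity measure named in the title can be computed directly from the data; below is a standard principal-angles computation via SVD (illustrative; the subspace dimension k is a made-up choice):

```python
import numpy as np

def principal_angles(X1, X2, k=3):
    """Principal angles between the spans of two clients' data matrices
    (rows = samples). The top-k right-singular vectors form orthonormal
    bases; the singular values of U1^T U2 are the cosines of the angles."""
    U1 = np.linalg.svd(X1, full_matrices=False)[2][:k].T
    U2 = np.linalg.svd(X2, full_matrices=False)[2][:k].T
    cos = np.linalg.svd(U1.T @ U2, compute_uv=False)
    return np.arccos(np.clip(cos, -1.0, 1.0))

rng = np.random.default_rng(2)
A, B = rng.normal(size=(100, 20)), rng.normal(size=(100, 20))
print(principal_angles(A, B))  # small angles => similar data subspaces
```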
arXiv Detail & Related papers (2022-09-21T17:37:54Z)
- Releasing survey microdata with exact cluster locations and additional privacy safeguards [77.34726150561087]
We propose an alternative microdata dissemination strategy that leverages the utility of the original microdata with additional privacy safeguards.
Our strategy reduces the respondents' re-identification risk for any number of disclosed attributes by 60-80% even under re-identification attempts.
arXiv Detail & Related papers (2022-05-24T19:37:11Z)
- Renyi Differential Privacy of the Subsampled Shuffle Model in Distributed Learning [7.197592390105457]
We study privacy in a distributed learning framework, where clients collaboratively build a learning model iteratively through interactions with a server from whom we need privacy.
Motivated by optimization and the federated learning (FL) paradigm, we focus on the case where a small fraction of data samples are randomly sub-sampled in each round.
To obtain even stronger local privacy guarantees, we study this in the shuffle privacy model, where each client randomizes its response using a local differentially private (LDP) mechanism.
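Putting the two ingredients together, a toy round of the subsampled shuffle model might look like the following (randomized response as the LDP mechanism; q and eps0 are illustrative):

```python
import math, random

def ldp_bit(bit: int, eps0: float) -> int:
    """eps0-LDP randomized response."""
    keep = math.exp(eps0) / (math.exp(eps0) + 1.0)
    return bit if random.random() < keep else 1 - bit

def round_reports(data_bits, q=0.1, eps0=1.0):
    """One round: subsample a q-fraction of the records, randomize each
    selected record locally, then shuffle -- the composition analyzed
    (via Renyi DP) in the paper."""
    reports = [ldp_bit(b, eps0) for b in data_bits if random.random() < q]
    random.shuffle(reports)
    return reports

reports = round_reports([random.randint(0, 1) for _ in range(1000)])
print(len(reports), sum(reports) / max(len(reports), 1))
```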
arXiv Detail & Related papers (2021-07-19T11:43:24Z)
- FedMix: Approximation of Mixup under Mean Augmented Federated Learning [60.503258658382]
Federated learning (FL) allows edge devices to collectively learn a model without directly sharing data within each device.
Current state-of-the-art algorithms suffer from performance degradation as the heterogeneity of local data across clients increases.
We propose a new augmentation algorithm, named FedMix, which is inspired by a phenomenal yet simple data augmentation method, Mixup.
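For reference, vanilla Mixup forms convex combinations of input pairs and their labels; FedMix approximates this across clients by exchanging only averaged batches rather than raw samples. A minimal sketch of plain Mixup (alpha is a made-up choice):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=np.random.default_rng(3)):
    """Vanilla Mixup: convex combinations of input pairs and their labels,
    with the mixing weight drawn from a Beta(alpha, alpha) distribution."""
    lam = rng.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x_mix, y_mix = mixup(np.ones(4), np.array([1.0, 0.0]),
                     np.zeros(4), np.array([0.0, 1.0]))
print(x_mix, y_mix)
```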
arXiv Detail & Related papers (2021-07-01T06:14:51Z)
- Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data.
Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees.
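The "clones" intuition can be made concrete for randomized response: the local randomizer is a mixture of a data-independent "blanket" draw and the true value, and after shuffling each genuine report hides among the blanket draws. A small sketch of that decomposition (illustrative, not the paper's proof):

```python
import math, random

def rr_via_clones(bit: int, eps0: float) -> int:
    """eps0-LDP randomized response, rewritten as a mixture: with
    probability gamma = 2/(e^eps0 + 1) the report is a data-independent
    coin flip (a 'clone' any user could have produced), otherwise the
    true bit. One can check P(report = bit) = e^eps0/(e^eps0 + 1),
    exactly matching standard randomized response."""
    gamma = 2.0 / (math.exp(eps0) + 1.0)
    if random.random() < gamma:
        return random.randint(0, 1)   # indistinguishable blanket draw
    return bit

reports = [rr_via_clones(b, eps0=1.0) for b in [0, 1, 1, 0, 1] * 200]
random.shuffle(reports)
print(sum(reports))
```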
arXiv Detail & Related papers (2020-12-23T17:07:26Z)
- Oblivious Sampling Algorithms for Private Data Analysis [10.990447273771592]
We study secure and privacy-preserving data analysis based on queries executed on samples from a dataset.
Trusted execution environments (TEEs) can be used to protect the content of the data during query computation.
Supporting differentially private (DP) queries in TEEs provides record privacy when the query output is revealed.
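As a reminder of what a DP query on a sample looks like, here is the standard Laplace mechanism for a counting query (the TEE and oblivious-sampling machinery of the paper are out of scope here):

```python
import numpy as np

def dp_count(records, predicate, eps=1.0, rng=np.random.default_rng(4)):
    """Laplace mechanism for a counting query: a count has sensitivity 1,
    so Laplace noise with scale 1/eps gives eps-DP on the revealed output
    (the kind of record-level guarantee the paper pairs with TEEs)."""
    true_count = sum(predicate(r) for r in records)
    return true_count + rng.laplace(scale=1.0 / eps)

ages = [23, 35, 41, 29, 52, 61]
print(dp_count(ages, lambda a: a >= 40))
```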
arXiv Detail & Related papers (2020-09-28T23:45:30Z)
- Privacy Amplification via Random Check-Ins [38.72327434015975]
Differentially Private Stochastic Gradient Descent (DP-SGD) forms a fundamental building block in many applications for learning over sensitive data.
In this paper, we focus on conducting iterative methods like DP-SGD in the setting of federated learning (FL), wherein the data is distributed among many devices (clients).
Our main contribution is the random check-in distributed protocol, which crucially relies only on randomized participation decisions made locally and independently by each client.
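A minimal sketch of the check-in idea: each client locally flips a coin and, if participating, picks one uniformly random round, so the server never chooses who participates when (parameters are illustrative):

```python
import random
from collections import defaultdict

def random_check_ins(n_clients=1000, n_rounds=50, p=0.5, seed=5):
    """Each client -- locally and independently -- decides with probability p
    whether to check in at all and, if so, picks one uniformly random
    round. The server never controls (or learns in advance) the schedule."""
    random.seed(seed)
    schedule = defaultdict(list)
    for client in range(n_clients):
        if random.random() < p:
            schedule[random.randrange(n_rounds)].append(client)
    return schedule

sched = random_check_ins()
print(len(sched[0]), "clients checked into round 0")  # ~ p*n_clients/n_rounds
```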
arXiv Detail & Related papers (2020-07-13T18:14:09Z)
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection.
However, when GANs are applied to sensitive or private training examples, such as medical or financial records, they may still divulge individuals' sensitive and private information.
We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
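A sketch of the noise-on-loss idea with a hypothetical helper; clipping to bound sensitivity is my addition for the sketch, and the stated Rényi-DP guarantee is the standard one for the Gaussian mechanism, not a claim about the paper's exact accounting:

```python
import numpy as np

def privatize_loss(loss_value, sigma, clip=1.0, rng=np.random.default_rng(6)):
    """Clip the per-step loss to bound its sensitivity, then add Gaussian
    noise. For the Gaussian mechanism with sensitivity `clip`, the RDP
    guarantee at order alpha is alpha * clip**2 / (2 * sigma**2)."""
    clipped = np.clip(loss_value, -clip, clip)
    return clipped + rng.normal(scale=sigma)

print(privatize_loss(0.73, sigma=0.5))
```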
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.