Learning from End User Data with Shuffled Differential Privacy over Kernel Densities
- URL: http://arxiv.org/abs/2502.14087v1
- Date: Wed, 19 Feb 2025 20:27:01 GMT
- Title: Learning from End User Data with Shuffled Differential Privacy over Kernel Densities
- Authors: Tal Wagner
- Abstract summary: We study a setting of collecting and learning from private data distributed across end users.
In the shuffled model of differential privacy, the end users partially protect their data locally before sharing it, and their data is also anonymized during its collection to enhance privacy.
Our main technical result is a shuffled DP protocol for privately estimating the kernel density function of a distributed dataset.
- Score: 10.797515094684318
- License:
- Abstract: We study a setting of collecting and learning from private data distributed across end users. In the shuffled model of differential privacy, the end users partially protect their data locally before sharing it, and their data is also anonymized during its collection to enhance privacy. This model has recently become a prominent alternative to central DP, which requires full trust in a central data curator, and local DP, where fully local data protection takes a steep toll on downstream accuracy. Our main technical result is a shuffled DP protocol for privately estimating the kernel density function of a distributed dataset, with accuracy essentially matching central DP. We use it to privately learn a classifier from the end user data, by learning a private density function per class. Moreover, we show that the density function itself can recover the semantic content of its class, despite having been learned in the absence of any unprotected data. Our experiments show the favorable downstream performance of our approach, and highlight key downstream considerations and trade-offs in a practical ML deployment of shuffled DP.
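To make the classification idea concrete, here is a minimal sketch of a per-class kernel density classifier with noise added to each density evaluation. It is not the paper's shuffled-DP protocol (which protects user data during collection); the class name `PrivateKDEClassifier`, the Gaussian kernel, the Laplace noise calibration, and all parameters are illustrative assumptions.

```python
# Minimal sketch (not the paper's protocol): one kernel density per class,
# classification by the largest privately perturbed density. Laplace noise on
# the density values stands in for the actual shuffled-DP mechanism; all names
# and parameters here are illustrative assumptions.
import numpy as np


class PrivateKDEClassifier:
    def __init__(self, bandwidth=1.0, epsilon=1.0, seed=0):
        self.bandwidth = bandwidth
        self.epsilon = epsilon                      # illustrative privacy budget
        self.rng = np.random.default_rng(seed)
        self.class_data = {}

    def fit(self, X, y):
        # In the shuffled model the raw points would never be pooled like this;
        # grouping by class here only keeps the sketch short.
        for label in np.unique(y):
            self.class_data[label] = X[y == label]
        return self

    def _noisy_density(self, x, pts):
        d = pts.shape[1]
        norm = (2 * np.pi * self.bandwidth ** 2) ** (d / 2)
        kernel_vals = np.exp(-np.sum((pts - x) ** 2, axis=1) / (2 * self.bandwidth ** 2))
        density = kernel_vals.sum() / (len(pts) * norm)
        # Each user contributes at most 1/(n * norm) to the density value, so
        # Laplace noise at that scale over epsilon is a rough, illustrative calibration.
        sensitivity = 1.0 / (len(pts) * norm)
        return density + self.rng.laplace(scale=sensitivity / self.epsilon)

    def predict(self, X):
        labels = list(self.class_data)
        scores = np.array([[self._noisy_density(x, self.class_data[c]) for c in labels]
                           for x in X])
        return np.array(labels)[scores.argmax(axis=1)]


# Tiny usage example on synthetic 2-D data.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
clf = PrivateKDEClassifier(bandwidth=0.5, epsilon=2.0).fit(X, y)
print((clf.predict(X) == y).mean())                 # in-sample accuracy
```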
Related papers
- Segmented Private Data Aggregation in the Multi-message Shuffle Model [9.298982907061099]
We pioneer the study of segmented private data aggregation within the multi-message shuffle model of differential privacy.
Our framework introduces flexible privacy protection for users and enhanced utility for the aggregation server.
Our framework achieves a reduction of about 50% in estimation error compared to existing approaches.
arXiv Detail & Related papers (2024-07-29T01:46:44Z)
- Enhanced Privacy Bound for Shuffle Model with Personalized Privacy [32.08637708405314]
The shuffle model of Differential Privacy (DP) is an enhanced privacy protocol that introduces an intermediate trusted server between local users and a central data curator.
It significantly amplifies the central DP guarantee by anonymizing and shuffling the locally randomized data.
This work focuses on deriving the central privacy bound for a more practical setting where personalized local privacy is required by each user.
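For intuition on the shuffle-model pipeline itself, the following is a toy sketch of local randomized response followed by shuffling; it illustrates the general setting described above, not this paper's personalized-privacy amplification bound. The function names, the k-ary randomized response mechanism, and the debiasing step are assumptions for illustration.

```python
# Toy sketch of the shuffle-model pipeline (not this paper's personalized
# amplification analysis): each user applies local k-ary randomized response,
# a trusted shuffler permutes the reports, and the analyzer sees only the
# anonymous multiset, which it debiases into a histogram estimate.
import numpy as np


def randomized_response(value, k, eps_local, rng):
    # Keep the true value with probability e^eps / (e^eps + k - 1),
    # otherwise report one of the other k - 1 values uniformly at random.
    p_true = np.exp(eps_local) / (np.exp(eps_local) + k - 1)
    if rng.random() < p_true:
        return value
    other = rng.integers(k - 1)
    return other if other < value else other + 1


def shuffle_and_estimate(values, k, eps_local, seed=0):
    rng = np.random.default_rng(seed)
    reports = np.array([randomized_response(v, k, eps_local, rng) for v in values])
    rng.shuffle(reports)                      # the shuffler breaks the link to users
    counts = np.bincount(reports, minlength=k)
    # Debias the randomized-response counts into an unbiased histogram estimate.
    n, p = len(values), np.exp(eps_local) / (np.exp(eps_local) + k - 1)
    q = (1 - p) / (k - 1)
    return (counts - n * q) / (p - q)


true_values = np.random.default_rng(1).integers(4, size=10_000)
print(shuffle_and_estimate(true_values, k=4, eps_local=1.0))
```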
arXiv Detail & Related papers (2024-07-25T16:11:56Z)
- LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification [67.92145284679623]
We propose a DP-based tutor that models the noised private distribution and guides sample generation at a low privacy cost.
We theoretically analyze our model's privacy protection and verify it empirically.
arXiv Detail & Related papers (2024-02-26T11:52:55Z)
- Preserving Privacy in Federated Learning with Ensemble Cross-Domain Knowledge Distillation [22.151404603413752]
Federated Learning (FL) is a machine learning paradigm where local nodes collaboratively train a central model.
Existing FL methods typically share model parameters or employ co-distillation to address the issue of unbalanced data distribution.
We develop a privacy-preserving and communication-efficient method in an FL framework with one-shot offline knowledge distillation.
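As a reference point for the distillation building block mentioned here, below is a generic softened-KL knowledge-distillation loss; it is a textbook formulation, not the paper's ensemble cross-domain scheme, and the temperature and mixing weight are arbitrary illustrative choices.

```python
# Generic knowledge-distillation loss (textbook form, not the paper's ensemble
# cross-domain method): KL divergence between temperature-softened teacher and
# student predictions, mixed with the ordinary supervised cross-entropy.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.7):
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)        # frozen teacher
    log_student = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_student, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)                          # hard labels
    return alpha * kd + (1 - alpha) * ce
```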
arXiv Detail & Related papers (2022-09-10T05:20:31Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
Because an example's training loss and its privacy parameter are well-correlated, groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Mixed Differential Privacy in Computer Vision [133.68363478737058]
AdaMix is an adaptive differentially private algorithm for training deep neural network classifiers using both private and public image data.
A few-shot or even zero-shot learning baseline that ignores private data can outperform fine-tuning on a large private dataset.
arXiv Detail & Related papers (2022-03-22T06:15:43Z)
- Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization [57.98426940386627]
We show that coordinating local learning with private centralized learning yields a generically useful and improved tradeoff between accuracy and privacy.
We illustrate our theoretical results with experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-02-10T20:44:44Z)
- TOHAN: A One-step Approach towards Few-shot Hypothesis Adaptation [73.75784418508033]
In few-shot domain adaptation (FDA), classifiers for the target domain are trained with labeled data from the source domain (SD) and only a few labeled examples from the target domain (TD).
In the current era, data usually contain private information, e.g., data distributed on personal phones.
We propose a target-oriented hypothesis adaptation network (TOHAN) to solve this problem.
arXiv Detail & Related papers (2021-06-11T11:46:20Z)
- Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
In decentralized learning, the local exchange of estimates can allow private data to be inferred.
Standard remedies rely on perturbations chosen independently at every agent, resulting in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible in the aggregate.
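A toy version of this idea, under assumed details rather than the paper's exact graph-homomorphic construction: agents add noise vectors constrained to sum to zero across the network, so each individual share is masked while the network-wide average is untouched.

```python
# Toy sketch (not the paper's exact construction): perturbations drawn in the
# zero-sum nullspace, so they mask individual estimates but cancel exactly in
# the network-wide average.
import numpy as np


def zero_sum_perturbations(num_agents, dim, scale=1.0, seed=0):
    rng = np.random.default_rng(seed)
    noise = rng.normal(scale=scale, size=(num_agents, dim))
    noise -= noise.mean(axis=0, keepdims=True)    # project onto the zero-sum subspace
    return noise


estimates = np.random.default_rng(1).normal(size=(8, 3))   # each agent's local estimate
perturbed = estimates + zero_sum_perturbations(8, 3, scale=5.0)
# Individual rows are heavily perturbed, yet the average is preserved exactly.
print(np.allclose(perturbed.mean(axis=0), estimates.mean(axis=0)))   # True
```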
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
- Deconvoluting Kernel Density Estimation and Regression for Locally Differentially Private Data [14.095523601311374]
Local differential privacy has become the gold standard in the privacy literature for gathering or releasing sensitive individual data points.
However, locally differentially private data can distort the probability density of the data because of the additive noise used to ensure privacy.
We develop density estimation methods using smoothing kernels to remove the effect of privacy-preserving noise.
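For illustration, here is a generic Fourier-inversion deconvoluting kernel density estimator for Laplace-noised samples, one standard way to implement the idea described above; the kernel choice, noise scale, and integration grid are assumptions, and this is not presented as the paper's exact estimator.

```python
# Generic deconvoluting KDE sketch (standard Fourier-inversion form, not
# necessarily the paper's estimator): divide the empirical characteristic
# function of the noisy data by the Laplace noise characteristic function,
# damp high frequencies with a Gaussian kernel, and invert.
import numpy as np


def deconvoluting_kde(noisy_samples, x_grid, bandwidth, laplace_scale,
                      t_max=40.0, t_points=801):
    t = np.linspace(-t_max, t_max, t_points)
    # Empirical characteristic function of the noisy observations.
    ecf = np.mean(np.exp(1j * np.outer(t, noisy_samples)), axis=1)
    # Laplace(0, b) noise has characteristic function 1 / (1 + (b t)^2); dividing
    # by it multiplies by (1 + (b t)^2). The Gaussian kernel damps high frequencies.
    integrand = np.exp(-(bandwidth * t) ** 2 / 2) * (1 + (laplace_scale * t) ** 2) * ecf
    # Fourier inversion on the evaluation grid (simple Riemann sum).
    phase = np.exp(-1j * np.outer(x_grid, t))
    density = np.real(phase @ integrand) * (t[1] - t[0]) / (2 * np.pi)
    return np.clip(density, 0.0, None)


# Example: recover a standard normal density from Laplace-noised (LDP-style) samples.
rng = np.random.default_rng(0)
clean = rng.normal(size=2000)
noisy = clean + rng.laplace(scale=0.5, size=2000)
grid = np.linspace(-4, 4, 81)
f_hat = deconvoluting_kde(noisy, grid, bandwidth=0.35, laplace_scale=0.5)
```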
arXiv Detail & Related papers (2020-08-28T03:39:17Z)
- LDP-FL: Practical Private Aggregation in Federated Learning with Local Differential Privacy [20.95527613004989]
Federated learning is a popular approach to privacy protection that collects local gradient information instead of the raw data.
Previous works do not give a practical solution due to three issues; among them, the privacy budget explodes due to the high dimensionality of the weights in deep learning models.
arXiv Detail & Related papers (2020-07-31T01:08:57Z)