Federated Experiment Design under Distributed Differential Privacy
- URL: http://arxiv.org/abs/2311.04375v1
- Date: Tue, 7 Nov 2023 22:38:56 GMT
- Title: Federated Experiment Design under Distributed Differential Privacy
- Authors: Wei-Ning Chen, Graham Cormode, Akash Bharadwaj, Peter Romov, Ayfer Özgür,
- Abstract summary: We focus on the rigorous protection of users' privacy while minimizing the trust toward service providers.
Although a vital component in modern A/B testing, private distributed experimentation has not previously been studied.
We show how these mechanisms can be scaled up to handle the very large number of participants commonly found in practice.
- Score: 31.06808163362162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Experiment design has a rich history dating back over a century and has found many critical applications across various fields since then. The use and collection of users' data in experiments often involve sensitive personal information, so additional measures to protect individual privacy are required during data collection, storage, and usage. In this work, we focus on the rigorous protection of users' privacy (under the notion of differential privacy (DP)) while minimizing the trust toward service providers. Specifically, we consider the estimation of the average treatment effect (ATE) under DP, while only allowing the analyst to collect population-level statistics via secure aggregation, a distributed protocol enabling a service provider to aggregate information without accessing individual data. Although a vital component in modern A/B testing workflows, private distributed experimentation has not previously been studied. To achieve DP, we design local privatization mechanisms that are compatible with secure aggregation and analyze the utility, in terms of the width of confidence intervals, both asymptotically and non-asymptotically. We show how these mechanisms can be scaled up to handle the very large number of participants commonly found in practice. In addition, when introducing DP noise, it is imperative to cleverly split privacy budgets to estimate both the mean and variance of the outcomes and carefully calibrate the confidence intervals according to the DP noise. Last, we present comprehensive experimental evaluations of our proposed schemes and show the privacy-utility trade-offs in experiment design.
Related papers
- Segmented Private Data Aggregation in the Multi-message Shuffle Model [6.436165623346879]
We pioneer the study of segmented private data aggregation within the multi-message shuffle model of differential privacy.
Our framework introduces flexible privacy protection for users and enhanced utility for the aggregation server.
Our framework achieves a reduction of about 50% in estimation error compared to existing approaches.
arXiv Detail & Related papers (2024-07-29T01:46:44Z) - Provable Privacy with Non-Private Pre-Processing [56.770023668379615]
We propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms.
Our framework establishes upper bounds on the overall privacy guarantees by utilising two new technical notions.
arXiv Detail & Related papers (2024-03-19T17:54:49Z) - A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns comes with stringent regulations that frequently prohibited data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z) - Causal Inference with Differentially Private (Clustered) Outcomes [16.166525280886578]
Estimating causal effects from randomized experiments is only feasible if participants agree to reveal their responses.
We suggest a new differential privacy mechanism, Cluster-DP, which leverages any given cluster structure.
We show that, depending on an intuitive measure of cluster quality, we can improve the variance loss while maintaining our privacy guarantees.
arXiv Detail & Related papers (2023-08-02T05:51:57Z) - Breaking the Communication-Privacy-Accuracy Tradeoff with
$f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability.
We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP)
More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
arXiv Detail & Related papers (2023-02-19T16:58:53Z) - Differentially Private Confidence Intervals for Proportions under Stratified Random Sampling [14.066813980992132]
With the increase of data privacy awareness, developing a private version of confidence intervals has gained growing attention.
Recent work has been done around differentially private confidence intervals, yet rigorous methodologies on differentially private confidence intervals have not been studied.
We propose three differentially private algorithms for constructing confidence intervals for proportions under stratified random sampling.
arXiv Detail & Related papers (2023-01-19T21:25:41Z) - How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z) - DP2-Pub: Differentially Private High-Dimensional Data Publication with
Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases.
splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget.
We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z) - Production of Categorical Data Verifying Differential Privacy:
Conception and Applications to Machine Learning [0.0]
Differential privacy is a formal definition that allows quantifying the privacy-utility trade-off.
With the local DP (LDP) model, users can sanitize their data locally before transmitting it to the server.
In all cases, we concluded that differentially private ML models achieve nearly the same utility metrics as non-private ones.
arXiv Detail & Related papers (2022-04-02T12:50:14Z) - Combining Public and Private Data [7.975795748574989]
We introduce a mixed estimator of the mean optimized to minimize variance.
We argue that our mechanism is preferable to techniques that preserve the privacy of individuals by subsampling data proportionally to the privacy needs of users.
arXiv Detail & Related papers (2021-10-29T23:25:49Z) - Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates allows inference of data based on private data.
perturbations chosen independently at every agent, resulting in a significant performance loss.
We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.