Related papers: Combining Public and Private Data

URL: http://arxiv.org/abs/2111.00115v1
Date: Fri, 29 Oct 2021 23:25:49 GMT
Title: Combining Public and Private Data
Authors: Cecilia Ferrando, Jennifer Gillenwater, Alex Kulesza
Abstract summary: We introduce a mixed estimator of the mean optimized to minimize variance. We argue that our mechanism is preferable to techniques that preserve the privacy of individuals by subsampling data proportionally to the privacy needs of users.
Score: 7.975795748574989
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Differential privacy is widely adopted to provide provable privacy guarantees in data analysis. We consider the problem of combining public and private data (and, more generally, data with heterogeneous privacy needs) for estimating aggregate statistics. We introduce a mixed estimator of the mean optimized to minimize the variance. We argue that our mechanism is preferable to techniques that preserve the privacy of individuals by subsampling data proportionally to the privacy needs of users. Similarly, we present a mixed median estimator based on the exponential mechanism. We compare our mechanisms to the methods proposed in Jorgensen et al. [2015]. Our experiments provide empirical evidence that our mechanisms often outperform the baseline methods.
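As a rough illustration of what a variance-minimizing combination of public and private data can look like, the sketch below merges a public sample mean with a Laplace-noised private sample mean using inverse-variance weights. This is a generic textbook construction, not the estimator from the paper. The function names, the clipping bounds `lower` and `upper`, and the assumption that a population variance bound `sigma2` is known are all hypothetical choices made for the example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. Exp(1) draws is Laplace(0, 1);
    # multiplying by `scale` gives Laplace(0, scale).
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def mixed_mean(public, private, epsilon, lower, upper, sigma2, rng):
    """Inverse-variance combination of a public and a private mean.

    Assumptions (illustrative only): private values are clipped to
    [lower, upper], both samples share a variance of at most `sigma2`,
    and the private sample size is not itself sensitive.
    """
    n_pub, n_priv = len(public), len(private)

    # Public data needs no noise.
    pub_mean = sum(public) / n_pub

    # Private mean released with the Laplace mechanism: replacing one
    # clipped record changes the mean by at most (upper - lower) / n_priv.
    clipped = [min(max(x, lower), upper) for x in private]
    scale = (upper - lower) / (n_priv * epsilon)
    priv_mean = sum(clipped) / n_priv + laplace_noise(scale, rng)

    # Variance of each estimator: sampling variance, plus the Laplace
    # noise variance 2 * scale**2 for the private one.
    var_pub = sigma2 / n_pub
    var_priv = sigma2 / n_priv + 2 * scale**2

    # Weight on the public mean that minimizes the combined variance.
    w = var_priv / (var_pub + var_priv)
    return w * pub_mean + (1 - w) * priv_mean, w


rng = random.Random(0)
estimate, weight = mixed_mean(
    public=[0.2, 0.4, 0.6],
    private=[0.5] * 100,
    epsilon=1.0,
    lower=0.0,
    upper=1.0,
    sigma2=0.25,
    rng=rng,
)
print(estimate, weight)
```

As `epsilon` grows, the noise term vanishes and the weight on the public mean tends to `n_pub / (n_pub + n_priv)`, which is the weight each public point would receive in a pooled sample mean.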

Related papers

On the MIA Vulnerability Gap Between Private GANs and Diffusion Models [51.53790101362898]
Generative Adversarial Networks (GANs) and diffusion models have emerged as leading approaches for high-quality image synthesis. We present the first unified theoretical and empirical analysis of the privacy risks faced by differentially private generative models.
arXiv Detail & Related papers (2025-09-03T14:18:22Z)
Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines. We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z)
Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner. Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
Private Estimation when Data and Privacy Demands are Correlated [5.755004576310333]
Differential privacy is the current gold standard for ensuring privacy for statistical queries. We consider the problems of empirical mean estimation for univariate data and frequency estimation for categorical data. We establish theoretical performance guarantees for our proposed algorithms, under both PAC error and mean-squared error.
arXiv Detail & Related papers (2024-07-15T22:46:02Z)
A Simple and Practical Method for Reducing the Disparate Impact of Differential Privacy [21.098175634158043]
Differentially private (DP) mechanisms have been deployed in a variety of high-impact social settings. The impact of DP on utility can vary significantly among different sub-populations. A simple way to reduce this disparity is with stratification.
arXiv Detail & Related papers (2023-12-18T21:19:35Z)
Federated Experiment Design under Distributed Differential Privacy [31.06808163362162]
We focus on the rigorous protection of users' privacy while minimizing the trust toward service providers. Although a vital component in modern A/B testing, private distributed experimentation has not previously been studied. We show how these mechanisms can be scaled up to handle the very large number of participants commonly found in practice.
arXiv Detail & Related papers (2023-11-07T22:38:56Z)
Conditional Density Estimations from Privacy-Protected Data [0.0]
We propose simulation-based inference methods from privacy-protected datasets. We illustrate our methods on discrete time-series data under an infectious disease model and with ordinary linear regression models.
arXiv Detail & Related papers (2023-10-19T14:34:17Z)
Causal Inference with Differentially Private (Clustered) Outcomes [16.166525280886578]
Estimating causal effects from randomized experiments is only feasible if participants agree to reveal their responses. We suggest a new differential privacy mechanism, Cluster-DP, which leverages any given cluster structure. We show that, depending on an intuitive measure of cluster quality, we can improve the variance loss while maintaining our privacy guarantees.
arXiv Detail & Related papers (2023-08-02T05:51:57Z)
Data Analytics with Differential Privacy [0.0]
We develop differentially private algorithms to analyze distributed and streaming data. In the distributed model, we consider the particular problem of learning -- in a distributed fashion -- a global model of the data. We offer one of the strongest privacy guarantees for the streaming model, user-level pan-privacy.
arXiv Detail & Related papers (2023-07-20T17:43:29Z)
Breaking the Communication-Privacy-Accuracy Tradeoff with $f$-Differential Privacy [51.11280118806893]
We consider a federated data analytics problem in which a server coordinates the collaborative data analysis of multiple users with privacy concerns and limited communication capability. We study the local differential privacy guarantees of discrete-valued mechanisms with finite output space through the lens of $f$-differential privacy (DP). More specifically, we advance the existing literature by deriving tight $f$-DP guarantees for a variety of discrete-valued mechanisms.
arXiv Detail & Related papers (2023-02-19T16:58:53Z)
Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge. Existing private generative models are struggling with the utility of synthetic samples. We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z)
DP2-Pub: Differentially Private High-Dimensional Data Publication with Invariant Post Randomization [58.155151571362914]
We propose a differentially private high-dimensional data publication mechanism (DP2-Pub) that runs in two phases. Splitting attributes into several low-dimensional clusters with high intra-cluster cohesion and low inter-cluster coupling helps obtain a reasonable privacy budget. We also extend our DP2-Pub mechanism to the scenario with a semi-honest server which satisfies local differential privacy.
arXiv Detail & Related papers (2022-08-24T17:52:43Z)
Private Prediction Sets [72.75711776601973]
Machine learning systems need reliable uncertainty quantification and protection of individuals' privacy. We present a framework that treats these two desiderata jointly. We evaluate the method on large-scale computer vision datasets.
arXiv Detail & Related papers (2021-02-11T18:59:11Z)
Graph-Homomorphic Perturbations for Private Decentralized Learning [64.26238893241322]
Local exchange of estimates allows inference of data based on private data. Existing schemes use perturbations chosen independently at every agent, resulting in a significant performance loss. We propose an alternative scheme, which constructs perturbations according to a particular nullspace condition, allowing them to be invisible.
arXiv Detail & Related papers (2020-10-23T10:35:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.