Conditional Density Estimations from Privacy-Protected Data
- URL: http://arxiv.org/abs/2310.12781v3
- Date: Sat, 30 Dec 2023 15:13:46 GMT
- Title: Conditional Density Estimations from Privacy-Protected Data
- Authors: Yifei Xiong, Nianqiao P. Ju, Sanguo Zhang
- Abstract summary: We propose simulation-based inference methods from privacy-protected datasets.
We illustrate our methods on discrete time-series data under an infectious disease model and with ordinary linear regression models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Many modern statistical analysis and machine learning applications require
training models on sensitive user data. Differential privacy provides a formal
guarantee that individual-level information about users does not leak. In this
framework, randomized algorithms inject calibrated noise into the confidential
data, resulting in privacy-protected datasets or queries. However, restricting
access to only privatized data during statistical analysis makes it
computationally challenging to make valid inferences on the parameters
underlying the confidential data. In this work, we propose simulation-based
inference methods from privacy-protected datasets. In addition to sequential
Monte Carlo approximate Bayesian computation, we use neural conditional density
estimators as a flexible family of distributions to approximate the posterior
distribution of model parameters given the observed private query results. We
illustrate our methods on discrete time-series data under an infectious disease
model and with ordinary linear regression models. Illustrating the
privacy-utility trade-off, our experiments and analysis demonstrate the
necessity and feasibility of designing valid statistical inference procedures
to correct for biases introduced by the privacy-protection mechanisms.
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose PseudoProbability Unlearning (PPU), a novel method that enables models to forget data to adhere to privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Differentially Private Covariate Balancing Causal Inference [8.133739801185271]
Differential privacy is the leading mathematical framework for privacy protection.
Our algorithm produces both point and interval estimators with statistical guarantees, such as consistency and rate optimality, under a given privacy budget.
arXiv Detail & Related papers (2024-10-18T18:02:13Z) - Initialization Matters: Privacy-Utility Analysis of Overparameterized
Neural Networks [72.51255282371805]
We prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets.
We find that this KL privacy bound is largely determined by the expected squared gradient norm relative to model parameters during training.
arXiv Detail & Related papers (2023-10-31T16:13:22Z) - Differentially Private Linear Regression with Linked Data [3.9325957466009203]
Differential privacy, a mathematical notion from computer science, is a rising tool offering robust privacy guarantees.
Recent work focuses on developing differentially private versions of individual statistical and machine learning tasks.
We present two differentially private algorithms for linear regression with linked data.
arXiv Detail & Related papers (2023-08-01T21:00:19Z) - Epsilon*: Privacy Metric for Machine Learning Models [7.461284823977013]
Epsilon* is a new metric for measuring the privacy risk of a single model instance prior to, during, or after deployment of privacy mitigation strategies.
It requires only black-box access to model predictions, does not require training data re-sampling or model re-training, and can be used to measure the privacy risk of models not trained with differential privacy.
arXiv Detail & Related papers (2023-07-21T00:49:07Z) - Probing the Transition to Dataset-Level Privacy in ML Models Using an
Output-Specific and Data-Resolved Privacy Profile [23.05994842923702]
We study a privacy metric that quantifies the extent to which a model trained on a dataset using a Differential Privacy mechanism is covered" by each of the distributions resulting from training on neighboring datasets.
We show that the privacy profile can be used to probe an observed transition to indistinguishability that takes place in the neighboring distributions as $epsilon$ decreases.
arXiv Detail & Related papers (2023-06-27T20:39:07Z) - Simulation-based, Finite-sample Inference for Privatized Data [14.218697973204065]
We propose a simulation-based "repro sample" approach to produce statistically valid confidence intervals and hypothesis tests.
We show that this methodology is applicable to a wide variety of private inference problems.
arXiv Detail & Related papers (2023-03-09T15:19:31Z) - Membership Inference Attacks against Synthetic Data through Overfitting
Detection [84.02632160692995]
We argue for a realistic MIA setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
arXiv Detail & Related papers (2023-02-24T11:27:39Z) - Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models are struggling with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z) - Private Prediction Sets [72.75711776601973]
Machine learning systems need reliable uncertainty quantification and protection of individuals' privacy.
We present a framework that treats these two desiderata jointly.
We evaluate the method on large-scale computer vision datasets.
arXiv Detail & Related papers (2021-02-11T18:59:11Z) - Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.