Statistic Selection and MCMC for Differentially Private Bayesian
Estimation
- URL: http://arxiv.org/abs/2203.13377v2
- Date: Mon, 28 Mar 2022 14:32:37 GMT
- Title: Statistic Selection and MCMC for Differentially Private Bayesian
Estimation
- Authors: Baris Alparslan and Sinan Yildirim
- Abstract summary: This paper concerns differentially private Bayesian estimation of the parameters of a population distribution.
We find that the statistic that is most informative in a non-private setting may not be the optimal choice under privacy restrictions.
We propose several Monte Carlo-based numerical estimation methods for calculating the Fisher information for those settings.
- Score: 1.14219428942199
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper concerns differentially private Bayesian estimation of the
parameters of a population distribution, when a statistic of a sample from that
population is shared with added noise to provide differential privacy.
This work mainly addresses two problems. (1) What statistic of the sample
should be shared privately? For this first question, on statistic selection,
we advocate using the Fisher information. We find that the statistic that is
most informative in a non-private setting may not be the optimal choice under
privacy restrictions. We provide several examples to
support that point. We consider several types of data sharing settings and
propose several Monte Carlo-based numerical estimation methods for calculating
the Fisher information for those settings. The second question concerns
inference: (2) Based on the shared statistics, how could we perform effective
Bayesian inference? We propose several Markov chain Monte Carlo (MCMC)
algorithms for sampling from the posterior distribution of the parameter given
the noisy statistic. The proposed MCMC algorithms can be preferred over one
another depending on the problem. For example, when the shared statistic is
additive and Gaussian noise is added, a simple Metropolis-Hastings algorithm that
utilizes the central limit theorem is a decent choice. We propose more advanced
MCMC algorithms for several other cases of practical relevance.
Our numerical examples involve comparing several candidate statistics to be
shared privately. For each statistic, we perform Bayesian estimation based on
the posterior distribution conditional on the privatized version of that
statistic. We demonstrate that the relative performance of a statistic, in
terms of the mean squared error of the Bayesian estimator based on the
corresponding privatized statistic, is adequately predicted by the Fisher
information of the privatized statistic.
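The abstract's remark about additive statistics with Gaussian noise can be illustrated with a small sketch. This is not the authors' implementation; it is a minimal random-walk Metropolis-Hastings example under assumptions chosen here for illustration: Bernoulli(theta) data, the sum of the sample as the shared statistic, a Uniform(0, 1) prior, and a hypothetical noise level `noise_std` that is not calibrated to any particular privacy budget. By the central limit theorem, the noisy sum is approximately N(n*theta, n*theta*(1 - theta) + noise_std^2), which gives a tractable approximate likelihood.

```python
import numpy as np


def log_approx_likelihood(theta, y, n, noise_std):
    """CLT approximation: y ~ N(n*theta, n*theta*(1-theta) + noise_std**2)."""
    mean = n * theta
    var = n * theta * (1.0 - theta) + noise_std**2
    return -0.5 * np.log(2.0 * np.pi * var) - 0.5 * (y - mean) ** 2 / var


def mh_sampler(y, n, noise_std, n_iter=5000, step=0.05, seed=0):
    """Random-walk Metropolis-Hastings for theta under a Uniform(0, 1) prior."""
    rng = np.random.default_rng(seed)
    theta = 0.5  # arbitrary starting point (assumption)
    log_p = log_approx_likelihood(theta, y, n, noise_std)
    samples = np.empty(n_iter)
    for t in range(n_iter):
        proposal = theta + step * rng.standard_normal()
        # The uniform prior has zero density outside (0, 1): reject such proposals.
        if 0.0 < proposal < 1.0:
            log_p_new = log_approx_likelihood(proposal, y, n, noise_std)
            # Symmetric proposal, so the acceptance ratio is the posterior ratio.
            if np.log(rng.uniform()) < log_p_new - log_p:
                theta, log_p = proposal, log_p_new
        samples[t] = theta
    return samples


# Simulated data; all values below are illustrative assumptions.
data_rng = np.random.default_rng(1)
n, true_theta, noise_std = 1000, 0.3, 10.0
x = data_rng.binomial(1, true_theta, size=n)
y = x.sum() + noise_std * data_rng.standard_normal()  # noisy shared statistic

samples = mh_sampler(y, n, noise_std)
posterior_mean = samples[1000:].mean()  # discard the first 1000 draws as burn-in
```

The sampler never touches the individual observations, only the noisy sum `y`, which is the point of the CLT-based approach: the sum of the hidden sample is integrated out analytically by the Gaussian approximation instead of being sampled.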
Related papers
- Exact and Efficient Bayesian Inference for Privacy Risk Quantification (Extended Version) [0.0]
Privug is a method to quantify privacy risks of data analytics programs by analyzing their source code.
The inference engine is implemented for a subset of Python programs.
We evaluate the method by analyzing privacy risks in programs to release public statistics.
arXiv Detail & Related papers (2023-08-31T13:04:04Z)
- Beyond Normal: On the Evaluation of Mutual Information Estimators [52.85079110699378]
We show how to construct a diverse family of distributions with known ground-truth mutual information.
We provide guidelines for practitioners on how to select an appropriate estimator adapted to the difficulty of the problem considered.
arXiv Detail & Related papers (2023-06-19T17:26:34Z)
- Private Statistical Estimation of Many Quantiles [0.41232474244672235]
Given a distribution and access to i.i.d. samples, we study the estimation of the inverse of its cumulative distribution function (the quantile function) at specific points.
This work studies the estimation of many statistical quantiles under differential privacy.
arXiv Detail & Related papers (2023-02-14T09:59:56Z)
- Differentially Private Distributed Bayesian Linear Regression with MCMC [0.966840768820136]
We consider a distributed setting where multiple parties hold parts of the data and share certain summary statistics of their portions in privacy-preserving noise.
We develop a novel generative statistical model for privately shared statistics, which exploits a useful distributional relation between the summary statistics of linear regression.
We provide numerical results on both real and simulated data, which demonstrate that the proposed algorithms provide well-rounded estimation and prediction.
arXiv Detail & Related papers (2023-01-31T17:27:05Z)
- A Bias-Accuracy-Privacy Trilemma for Statistical Estimation [16.365507345447803]
We show that no algorithm can simultaneously have low bias, low error, and low privacy loss for arbitrary distributions.
We show that unbiased mean estimation is possible under a more permissive notion of differential privacy.
arXiv Detail & Related papers (2023-01-30T23:40:20Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Algorithms for Adaptive Experiments that Trade-off Statistical Analysis with Reward: Combining Uniform Random Assignment and Reward Maximization [50.725191156128645]
Multi-armed bandit algorithms like Thompson Sampling can be used to conduct adaptive experiments.
We present simulations for 2-arm experiments that explore two algorithms that combine the benefits of uniform randomization for statistical analysis.
arXiv Detail & Related papers (2021-12-15T22:11:58Z)
- Universal Off-Policy Evaluation [64.02853483874334]
We take the first steps towards a universal off-policy estimator (UnO).
We use UnO for estimating and simultaneously bounding the mean, variance, quantiles/median, inter-quantile range, CVaR, and the entire cumulative distribution of returns.
arXiv Detail & Related papers (2021-04-26T18:54:31Z)
- Statistical Efficiency of Thompson Sampling for Combinatorial Semi-Bandits [56.31950477139053]
We investigate multi-armed bandit with semi-bandit feedback (CMAB).
We analyze variants of the Combinatorial Thompson Sampling policy (CTS).
This last result gives us an alternative to the Efficient Sampling for Combinatorial Bandit policy (ESCB).
arXiv Detail & Related papers (2020-06-11T17:12:11Z)
- Nonparametric Estimation of the Fisher Information and Its Applications [82.00720226775964]
This paper considers the problem of estimation of the Fisher information for location from a random sample of size $n$.
An estimator proposed by Bhattacharya is revisited and improved convergence rates are derived.
A new estimator, termed a clipped estimator, is proposed.
arXiv Detail & Related papers (2020-05-07T17:21:56Z)
- Propose, Test, Release: Differentially private estimation with high probability [9.25177374431812]
We introduce a new general version of the PTR mechanism that allows us to derive high probability error bounds for differentially private estimators.
Our algorithms provide the first statistical guarantees for differentially private estimation of the median and mean without any boundedness assumptions on the data.
arXiv Detail & Related papers (2020-02-19T01:29:05Z)