Best Linear Unbiased Estimate from Privatized Histograms
- URL: http://arxiv.org/abs/2409.04387v3
- Date: Tue, 29 Oct 2024 15:49:09 GMT
- Title: Best Linear Unbiased Estimate from Privatized Histograms
- Authors: Jordan Awan, Adam Edwards, Paul Bartholomew, Andrew Sillers
- Abstract summary: In differential privacy (DP) mechanisms, it can be beneficial to release "redundant" outputs.
We show that the minimum variance processing is a linear projection.
We propose the Scalable Efficient Algorithm for Best Linear Unbiased Estimate (SEA BLUE).
- Score: 6.17477133700348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In differential privacy (DP) mechanisms, it can be beneficial to release "redundant" outputs, in the sense that a quantity can be estimated by combining different combinations of privatized values. Indeed, this structure is present in the DP 2020 Decennial Census products published by the U.S. Census Bureau. With this structure, the DP output can be improved by enforcing self-consistency (i.e., estimators obtained by combining different values result in the same estimate) and we show that the minimum variance processing is a linear projection. However, standard projection algorithms are too computationally expensive in terms of both memory and execution time for applications such as the Decennial Census. We propose the Scalable Efficient Algorithm for Best Linear Unbiased Estimate (SEA BLUE), based on a two step process of aggregation and differencing that 1) enforces self-consistency through a linear and unbiased procedure, 2) is computationally and memory efficient, 3) achieves the minimum variance solution under certain structural assumptions, and 4) is empirically shown to be robust to violations of these structural assumptions. We propose three methods of calculating confidence intervals from our estimates, under various assumptions. We apply SEA BLUE to two 2010 Census demonstration products, illustrating its scalability and validity.
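The projection view of self-consistency can be illustrated on a toy privatized histogram. The sketch below is an illustration of the minimum-variance linear projection, not the paper's SEA BLUE implementation: it takes independent, equal-variance noisy releases of two cells and their redundant total, and recovers the self-consistent best linear unbiased estimates by least squares (the numbers are made up).

```python
import numpy as np

# Noisy DP releases: cell a, cell b, and the redundant total a + b,
# each perturbed with independent, equal-variance noise.
y = np.array([3.0, 5.0, 10.0])

# Design matrix mapping the true cells (a, b) to the three releases.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# With equal variances, the BLUE is the least-squares projection
# x_hat = (A^T A)^{-1} A^T y; unequal variances would use GLS weights.
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# Self-consistency: the estimated total now equals the sum of the
# estimated cells, unlike the raw releases (3 + 5 != 10).
print(x_hat)        # estimated cells
print(x_hat.sum())  # estimated total
```

At Census scale this projection matrix is far too large to form explicitly, which is what motivates the paper's aggregation-and-differencing procedure.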
Related papers
- Full-Information Estimation For Hierarchical Data [0.43512163406552007]
The U.S. Census Bureau's 2020 Disclosure Avoidance System (DAS) bases its output on noisy measurements.
These noisy measurements are observed in a set of hierarchical geographic units, e.g., the U.S. as a whole, states, counties, census tracts, and census blocks.
This paper describes a method to leverage the hierarchical structure within these noisy measurements to compute confidence intervals for arbitrary tabulations.
arXiv Detail & Related papers (2024-04-19T20:18:16Z)
- Online Estimation via Offline Estimation: An Information-Theoretic Framework [75.80823630681323]
Motivated by connections between estimation and interactive decision making, we ask: is it possible to convert offline estimation algorithms into online estimation algorithms?
We introduce a new framework, Oracle-Efficient Online Estimation (OEOE), where the learner can only interact with the data stream indirectly through a sequence of offline estimators produced by a black-box algorithm operating on the stream.
arXiv Detail & Related papers (2024-04-15T20:19:18Z)
- Likelihood Ratio Confidence Sets for Sequential Decision Making [51.66638486226482]
We revisit the likelihood-based inference principle and propose to use likelihood ratios to construct valid confidence sequences.
Our method is especially suitable for problems with well-specified likelihoods.
We show how to provably choose the best sequence of estimators and shed light on connections to online convex optimization.
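To make the likelihood-ratio construction concrete, here is a fixed-sample toy analogue (not the paper's anytime-valid confidence sequence): a 95% likelihood-ratio confidence set for a Gaussian mean with known standard deviation, which reduces to an interval around the sample mean. The function name and data are illustrative.

```python
import math

def lr_confidence_interval(xs, sigma, chi2_q=3.8415):
    """95% likelihood-ratio confidence set for a Gaussian mean with
    known sigma: {mu : 2*(l(mu_hat) - l(mu)) <= chi2_q}.  For the
    Gaussian log-likelihood, 2*(l(mu_hat) - l(mu)) simplifies to
    n*(mu - mu_hat)^2 / sigma^2, so the set is an interval.
    chi2_q = 3.8415 is the 0.95 quantile of chi-squared with 1 dof."""
    n = len(xs)
    mu_hat = sum(xs) / n
    half_width = sigma * math.sqrt(chi2_q / n)
    return mu_hat - half_width, mu_hat + half_width

lo, hi = lr_confidence_interval([1.2, 0.8, 1.1, 0.9], sigma=1.0)
print(lo, hi)
```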
arXiv Detail & Related papers (2023-11-08T00:10:21Z)
- Variational Linearized Laplace Approximation for Bayesian Deep Learning [11.22428369342346]
We propose a new method for approximating Linearized Laplace Approximation (LLA) using a variational sparse Gaussian Process (GP).
Our method is based on the dual RKHS formulation of GPs and retains, as the predictive mean, the output of the original DNN.
It allows for efficient optimization, which results in sub-linear training time in the size of the training dataset.
arXiv Detail & Related papers (2023-02-24T10:32:30Z)
- Differentially Private Estimation via Statistical Depth [0.0]
Two notions of statistical depth are used to motivate new approximate DP location and regression estimators.
To avoid requiring that users specify a priori bounds on the estimates and/or the observations, variants of these DP mechanisms are described.
arXiv Detail & Related papers (2022-07-26T01:59:07Z)
- Scaling Structured Inference with Randomization [64.18063627155128]
We propose a family of randomized dynamic programming (RDP) algorithms for scaling structured models to tens of thousands of latent states.
Our method is widely applicable to classical DP-based inference.
It is also compatible with automatic differentiation, so it can be integrated seamlessly with neural networks.
arXiv Detail & Related papers (2021-12-07T11:26:41Z)
- Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is the linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
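The Gauss-Markov baseline mentioned above admits a small numerical check (illustrative, not from the paper): the closed-form weighted least squares estimator equals ordinary least squares on the rows rescaled by their inverse noise standard deviations. All data below are made up.

```python
import numpy as np

# Heteroscedastic linear model: y = X @ beta + noise, Var(noise_i) = s_i^2.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
beta_true = np.array([2.0, -1.0])
s = np.array([0.5, 1.0, 2.0, 1.5])                 # per-observation noise std
y = X @ beta_true + s * np.array([0.3, -0.2, 0.1, 0.4])

# Closed-form WLS: beta = (X^T W X)^{-1} X^T W y with W = diag(1/s_i^2).
W = np.diag(1.0 / s**2)
beta_wls = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# Equivalent formulation: OLS after scaling each row by 1/s_i.
Xw, yw = X / s[:, None], y / s
beta_ols_scaled, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

print(beta_wls)
assert np.allclose(beta_wls, beta_ols_scaled)
```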
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
- Adaptive Estimation and Uniform Confidence Bands for Nonparametric Structural Functions and Elasticities [2.07706336594149]
We introduce two data-driven procedures for optimal estimation and inference in nonparametric models.
We estimate the elasticity of the intensive margin of firm exports in a monopolistic competition model of international trade.
arXiv Detail & Related papers (2021-07-25T18:46:33Z)
- Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate, on two popular deep learning tasks, the empirical advantages of our methods over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)
- Neural Methods for Point-wise Dependency Estimation [129.93860669802046]
We focus on estimating point-wise dependency (PD), which quantitatively measures how likely two outcomes co-occur.
We demonstrate the effectiveness of our approaches in 1) MI estimation, 2) self-supervised representation learning, and 3) cross-modal retrieval task.
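For intuition, point-wise dependency of an outcome pair (x, y) is the ratio p(x, y) / (p(x) p(y)), and mutual information is its expected log. A direct computation on a small discrete joint distribution (illustrative numbers; the paper estimates PD neurally from samples):

```python
import numpy as np

# Toy joint distribution over two binary variables (made-up numbers).
p_xy = np.array([[0.2, 0.1],
                 [0.3, 0.4]])
p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X
p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y

# Point-wise dependency: how much more (or less) likely each pair is
# than under independence.  PD = 1 everywhere iff X and Y are independent.
pd = p_xy / (p_x * p_y)

# Mutual information is the expected log point-wise dependency.
mi = float((p_xy * np.log(pd)).sum())
print(pd)
print(mi)
```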
arXiv Detail & Related papers (2020-06-09T23:26:15Z)
- Online Covariance Matrix Estimation in Stochastic Gradient Descent [10.153224593032677]
Stochastic gradient descent (SGD) is widely used for parameter estimation, especially for huge data sets and online learning.
This paper aims at quantifying statistical inference of SGD-based estimates in an online setting.
arXiv Detail & Related papers (2020-02-10T17:46:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.