High-Dimensional Differentially-Private EM Algorithm: Methods and
Near-Optimal Statistical Guarantees
- URL: http://arxiv.org/abs/2104.00245v1
- Date: Thu, 1 Apr 2021 04:08:34 GMT
- Title: High-Dimensional Differentially-Private EM Algorithm: Methods and
Near-Optimal Statistical Guarantees
- Authors: Zhe Zhang and Linjun Zhang
- Abstract summary: We develop a general framework to design differentially private expectation-maximization (EM) algorithms in high-dimensional latent variable models.
In each model, we establish the near-optimal rate of convergence with differential privacy constraints.
We also propose a near rate-optimal EM algorithm with differential privacy guarantees in the classic low-dimensional setting.
- Score: 8.089708900273804
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop a general framework to design differentially
private expectation-maximization (EM) algorithms in high-dimensional latent
variable models, based on noisy iterative hard-thresholding. We derive the
statistical guarantees of the proposed framework and apply it to three specific
models: Gaussian mixture, mixture of regression, and regression with missing
covariates. In each model, we establish the near-optimal rate of convergence
with differential privacy constraints, and show that the proposed algorithm is
minimax rate-optimal up to logarithmic factors. The technical tools developed for
the high-dimensional setting are then extended to the classic low-dimensional
latent variable models, and we propose a near rate-optimal EM algorithm with
differential privacy guarantees in this setting. Simulation studies and real
data analysis are conducted to support our results.
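To make the framework concrete, below is a minimal Python sketch of the kind of iteration the abstract describes: an ordinary EM update, perturbed with Gaussian noise and then hard-thresholded to enforce sparsity. The function names, the noise scale sigma, and the sparsity level s are illustrative placeholders, and the calibration of sigma to the update's sensitivity and the privacy budget (which the formal guarantees depend on) is omitted, so this is a sketch of the idea rather than the authors' exact algorithm.

    import numpy as np

    def hard_threshold(v, s):
        # Keep the s largest-magnitude coordinates of v, zero out the rest.
        out = np.zeros_like(v)
        keep = np.argsort(np.abs(v))[-s:]
        out[keep] = v[keep]
        return out

    def dp_em_step(beta, em_update, s, sigma, rng):
        # One noisy iterative hard-thresholding step (illustrative):
        # run a standard EM update, add Gaussian noise, truncate to s-sparsity.
        proposal = em_update(beta)  # E-step followed by M-step
        noisy = proposal + rng.normal(0.0, sigma, size=beta.shape)
        return hard_threshold(noisy, s)

Intuitively, the noise keeps each iterate private while the thresholding confines the update to s coordinates, which is how methods of this kind avoid error that scales with the ambient dimension.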
Related papers
- Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
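As a point of reference for this entry, the naive baseline is the Gaussian mechanism applied to clipped per-example gradients, sketched below with hypothetical names; exploiting sparsity to improve on this dense-noise baseline is precisely the paper's subject.

    import numpy as np

    def dp_sparse_grad_sum(grads, clip, sigma, dim, rng):
        # Release a privatized sum of sparse per-example gradients:
        # clip each gradient to L2 norm `clip`, sum, add Gaussian noise.
        # `grads` is a list of (indices, values) pairs.
        total = np.zeros(dim)
        for idx, val in grads:
            norm = np.linalg.norm(val)
            if norm > clip:
                val = val * (clip / norm)
            total[idx] += val
        # Dense noise gives the worst-case guarantee; the paper's rates
        # show how gradient sparsity permits doing better than this.
        return total + rng.normal(0.0, sigma * clip, size=dim)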
arXiv Detail & Related papers (2024-04-16T20:01:10Z) - Quantized Hierarchical Federated Learning: A Robust Approach to
Statistical Heterogeneity [3.8798345704175534]
We present a novel hierarchical federated learning algorithm that incorporates quantization for communication efficiency.
We offer a comprehensive analytical framework to evaluate its optimality gap and convergence rate.
Our findings reveal that our algorithm consistently achieves high learning accuracy over a range of parameters.
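As a rough illustration of the quantization ingredient, the snippet below implements a standard unbiased stochastic uniform quantizer, a common building block in communication-efficient federated learning; it is not necessarily the scheme used in this paper.

    import numpy as np

    def stochastic_quantize(x, levels, rng):
        # Map each entry of x onto a uniform grid of `levels` points over
        # [min(x), max(x)], rounding up or down at random so E[Q(x)] = x.
        lo, hi = float(x.min()), float(x.max())
        if hi == lo:
            return x.copy()
        step = (hi - lo) / (levels - 1)
        pos = (x - lo) / step            # fractional grid position
        low = np.floor(pos)
        prob_up = pos - low              # round up with this probability
        q = low + (rng.random(x.shape) < prob_up)
        return lo + q * step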
arXiv Detail & Related papers (2024-03-03T15:40:24Z) - kNN Algorithm for Conditional Mean and Variance Estimation with
Automated Uncertainty Quantification and Variable Selection [8.429136647141487]
We introduce a kNN-based regression method that combines the scalability and adaptability of traditional non-parametric kNN models with automated uncertainty quantification and variable selection.
This method focuses on accurately estimating the conditional mean and variance of random response variables.
It is particularly notable in biomedical applications as demonstrated in two case studies.
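The plain estimator underneath such a method fits in a few lines: average the responses of the k nearest neighbours for the conditional mean, and take their sample variance for the conditional variance. The sketch below is this baseline only; the paper's automated uncertainty quantification and variable selection sit on top of it.

    import numpy as np

    def knn_mean_var(x0, X, y, k):
        # Estimate E[Y | X = x0] and Var[Y | X = x0] from the k nearest
        # neighbours of x0 under Euclidean distance (requires k >= 2).
        dist = np.linalg.norm(X - x0, axis=1)
        nbrs = np.argsort(dist)[:k]
        return y[nbrs].mean(), y[nbrs].var(ddof=1)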
arXiv Detail & Related papers (2024-02-02T18:54:18Z) - On the Computational Complexity of Private High-dimensional Model Selection [18.964255744068122]
We consider the problem of model selection in a high-dimensional sparse linear regression model under privacy constraints.
We propose an efficient Metropolis-Hastings algorithm and, under certain regularity conditions, establish that it enjoys a polynomial mixing time to its stationary distribution.
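For reference, the accept/reject mechanics of a generic Metropolis-Hastings transition with a symmetric proposal are sketched below; the paper's substantive content, a chain over sparse models and its mixing-time analysis, is not captured by this generic step.

    import numpy as np

    def mh_step(state, log_target, propose, rng):
        # One Metropolis-Hastings transition with a symmetric proposal:
        # accept with probability min(1, target(proposal) / target(state)).
        proposal = propose(state, rng)
        log_alpha = log_target(proposal) - log_target(state)
        if np.log(rng.random()) < log_alpha:
            return proposal
        return state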
arXiv Detail & Related papers (2023-10-11T19:53:15Z) - Differentiating Metropolis-Hastings to Optimize Intractable Densities [51.16801956665228]
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers.
We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
arXiv Detail & Related papers (2023-06-13T17:56:02Z) - When to Update Your Model: Constrained Model-based Reinforcement
Learning [50.74369835934703]
We propose a novel and general theoretical scheme for a non-decreasing performance guarantee of model-based RL (MBRL).
Our follow-up derived bounds reveal the relationship between model shifts and performance improvement.
A further example demonstrates that learning models from a dynamically-varying number of explorations benefits the eventual returns.
arXiv Detail & Related papers (2022-10-15T17:57:43Z) - Optimization of Annealed Importance Sampling Hyperparameters [77.34726150561087]
Annealed Importance Sampling (AIS) is a popular algorithm used to estimate the intractable marginal likelihood of deep generative models.
We present a parametric AIS process with flexible intermediary distributions and optimize the bridging distributions to use fewer steps for sampling.
We assess the performance of our optimized AIS for marginal likelihood estimation of deep generative models and compare it to other estimators.
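The AIS recursion being tuned in this paper can be stated compactly: anneal from a tractable base distribution to the target through geometric bridges, accumulating a log importance weight along the way. The sketch below shows the standard recursion with illustrative names; the paper's parametric, optimized bridging distributions are not reproduced here.

    import numpy as np

    def ais_log_weight(x0, log_p0, log_p1, transition, betas, rng):
        # One AIS chain along geometric bridges
        # log_pb = (1 - b) * log_p0 + b * log_p1 for b in betas.
        # Averaging exp(log_w) over independent chains estimates the
        # ratio of normalizing constants (the marginal likelihood).
        x, log_w = x0, 0.0
        for b_prev, b in zip(betas[:-1], betas[1:]):
            log_w += (b - b_prev) * (log_p1(x) - log_p0(x))
            x = transition(x, b, rng)  # MCMC step leaving bridge b invariant
        return log_w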
arXiv Detail & Related papers (2022-09-27T07:58:25Z) - Heterogeneous Tensor Mixture Models in High Dimensions [5.656785831541303]
We consider the problem of jointly modeling and clustering populations of tensors, introducing a flexible high-dimensional tensor mixture model with heterogeneous covariances.
We show that our method converges geometrically to a neighborhood that is within statistical precision of the true parameter.
Our analysis identifies important brain regions for the diagnosis of autism spectrum disorder.
arXiv Detail & Related papers (2021-04-15T21:06:16Z) - Autoregressive Score Matching [113.4502004812927]
We propose autoregressive conditional score models (AR-CSM), where we parameterize the joint distribution in terms of the derivatives of univariate log-conditionals (scores).
For AR-CSM models, the divergence between data and model distributions can be computed and optimized efficiently, requiring no expensive sampling or adversarial training.
We show with extensive experimental results that it can be applied to density estimation on synthetic data, image generation, image denoising, and training latent variable models with implicit encoders.
arXiv Detail & Related papers (2020-10-24T07:01:24Z) - Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy guarantees based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z) - Uncertainty Modelling in Risk-averse Supply Chain Systems Using
Multi-objective Pareto Optimization [0.0]
One of the arduous tasks in supply chain modelling is to build models that are robust to irregular variations.
We introduce a novel methodology, namely Pareto optimization, to handle uncertainties and bound their entropy by explicitly modelling them under some a priori assumptions.
arXiv Detail & Related papers (2020-04-24T21:04:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.