Exactly Minimax-Optimal Locally Differentially Private Sampling
- URL: http://arxiv.org/abs/2410.22699v1
- Date: Wed, 30 Oct 2024 05:13:18 GMT
- Title: Exactly Minimax-Optimal Locally Differentially Private Sampling
- Authors: Hyun-Young Park, Shahab Asoodeh, Si-Hyeon Lee,
- Abstract summary: We define the fundamental PUT of private sampling in the minimax sense, using the f-divergence between original and sampling distributions as the utility measure.
We characterize the exact PUT for both finite and continuous data spaces under mild conditions on the data distributions, and propose sampling mechanisms that are universally optimal for all f-divergences.
- Score: 12.587817635325266
- License:
- Abstract: The sampling problem under local differential privacy has recently been studied with potential applications to generative models, but a fundamental analysis of its privacy-utility trade-off (PUT) remains incomplete. In this work, we define the fundamental PUT of private sampling in the minimax sense, using the f-divergence between original and sampling distributions as the utility measure. We characterize the exact PUT for both finite and continuous data spaces under some mild conditions on the data distributions, and propose sampling mechanisms that are universally optimal for all f-divergences. Our numerical experiments demonstrate the superiority of our mechanisms over baselines, in terms of theoretical utilities for finite data space and of empirical utilities for continuous data space.
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - A Likelihood Based Approach to Distribution Regression Using Conditional Deep Generative Models [6.647819824559201]
We study the large-sample properties of a likelihood-based approach for estimating conditional deep generative models.
Our results lead to the convergence rate of a sieve maximum likelihood estimator for estimating the conditional distribution.
arXiv Detail & Related papers (2024-10-02T20:46:21Z) - Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Optimal Locally Private Nonparametric Classification with Public Data [2.631955426232593]
We investigate the problem of public data assisted non-interactive Local Differentially Private (LDP) learning with a focus on non-parametric classification.
Under the posterior drift assumption, we derive the mini-max optimal convergence rate with LDP constraint.
We present a novel approach, the locally differentially private classification tree, which attains the mini-max optimal convergence rate.
arXiv Detail & Related papers (2023-11-19T16:35:01Z) - On the Inherent Privacy Properties of Discrete Denoising Diffusion Models [17.773335593043004]
We present the pioneering theoretical exploration of the privacy preservation inherent in discrete diffusion models.
Our framework elucidates the potential privacy leakage for each data point in a given training dataset.
Our bounds also show that training with $s$-sized data points leads to a surge in privacy leakage.
arXiv Detail & Related papers (2023-10-24T05:07:31Z) - Private Set Generation with Discriminative Information [63.851085173614]
Differentially private data generation is a promising solution to the data privacy challenge.
Existing private generative models are struggling with the utility of synthetic samples.
We introduce a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-07T10:02:55Z) - Differentially private multivariate medians [4.588028371034407]
We develop novel finite-sample performance guarantees for differentially private depth-based medians.
We show that under Cauchy marginals, the cost of heavy-tailed location estimation outweighs the cost of privacy.
arXiv Detail & Related papers (2022-10-12T17:56:04Z) - ManiFlow: Implicitly Representing Manifolds with Normalizing Flows [145.9820993054072]
Normalizing Flows (NFs) are flexible explicit generative models that have been shown to accurately model complex real-world data distributions.
We propose an optimization objective that recovers the most likely point on the manifold given a sample from the perturbed distribution.
Finally, we focus on 3D point clouds for which we utilize the explicit nature of NFs, i.e. surface normals extracted from the gradient of the log-likelihood and the log-likelihood itself.
arXiv Detail & Related papers (2022-08-18T16:07:59Z) - Compressive Privatization: Sparse Distribution Estimation under Locally
Differentially Privacy [18.43218511751587]
We show that as long as the target distribution is sparse or approximately sparse, the number of samples needed could be significantly reduced.
Our mechanism does privatization and dimensionality reduction simultaneously, and the sample complexity will only depend on the reduced dimensionality.
arXiv Detail & Related papers (2020-12-03T17:14:23Z) - A Discriminative Technique for Multiple-Source Adaptation [55.5865665284915]
We present a new discriminative technique for the multiple-source adaptation, MSA, problem.
Our solution only requires conditional probabilities that can easily be accurately estimated from unlabeled data from the source domains.
Our experiments with real-world applications further demonstrate that our new discriminative MSA algorithm outperforms the previous generative solution.
arXiv Detail & Related papers (2020-08-25T14:06:15Z) - GANs with Conditional Independence Graphs: On Subadditivity of
Probability Divergences [70.30467057209405]
Generative Adversarial Networks (GANs) are modern methods to learn the underlying distribution of a data set.
GANs are designed in a model-free fashion where no additional information about the underlying distribution is available.
We propose a principled design of a model-based GAN that uses a set of simple discriminators on the neighborhoods of the Bayes-net/MRF.
arXiv Detail & Related papers (2020-03-02T04:31:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.