Differentially Private Assouad, Fano, and Le Cam
- URL: http://arxiv.org/abs/2004.06830v3
- Date: Mon, 2 Nov 2020 03:03:57 GMT
- Title: Differentially Private Assouad, Fano, and Le Cam
- Authors: Jayadev Acharya, Ziteng Sun, Huanyu Zhang
- Abstract summary: Le Cam's method, Fano's inequality, and Assouad's lemma are three techniques to prove lower bounds for statistical estimation tasks.
We propose their analogues under central differential privacy. Our results are simple and easy to apply, and we use them to establish sample complexity bounds in several estimation tasks.
- Score: 32.509176784239216
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Le Cam's method, Fano's inequality, and Assouad's lemma are three widely used
techniques to prove lower bounds for statistical estimation tasks. We propose
their analogues under central differential privacy. Our results are simple and
easy to apply, and we use them to establish sample complexity bounds in several
estimation tasks. We establish the optimal sample complexity of discrete
distribution estimation under total variation distance and $\ell_2$ distance.
We also provide lower bounds for several other distribution classes, including
product distributions and Gaussian mixtures that are tight up to logarithmic
factors. The technical component of our paper relates coupling between
distributions to the sample complexity of estimation under differential
privacy.
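For context, these are the standard non-private statements of the three techniques, written up to constants that vary across references; the paper's DP analogues augment such bounds with coupling-based terms, as the abstract's last sentence indicates. A minimal LaTeX summary in our own notation, not the paper's theorem statements:

```latex
% Le Cam (two-point method): for any estimator \hat\theta from n i.i.d. samples,
\inf_{\hat\theta} \max_{i\in\{1,2\}} \mathbb{E}_{P_i^n}\bigl[\ell(\hat\theta,\theta_i)\bigr]
  \ge \frac{\ell(\theta_1,\theta_2)}{2}\,\bigl(1-\mathrm{TV}(P_1^n,P_2^n)\bigr)

% Fano: V uniform over a finite, \alpha-separated family \{\theta_v\}_{v\in\mathcal{V}},
\inf_{\hat\theta} \max_{v\in\mathcal{V}} \mathbb{E}\bigl[\ell(\hat\theta,\theta_v)\bigr]
  \ge \frac{\alpha}{2}\Bigl(1-\frac{I(V;X^n)+\log 2}{\log|\mathcal{V}|}\Bigr)

% Assouad: a hypercube \{P_z\}_{z\in\{\pm 1\}^k} whose loss dominates 2\tau times
% the Hamming distance between sign vectors,
\inf_{\hat\theta} \max_{z\in\{\pm 1\}^k} \mathbb{E}\bigl[\ell(\hat\theta,\theta_z)\bigr]
  \ge k\tau \min_{d_H(z,z')=1}\bigl(1-\mathrm{TV}(P_z^n,P_{z'}^n)\bigr)
```

To make the flagship application concrete: for discrete distribution estimation under total variation distance, the standard Laplace-mechanism histogram (a folklore upper-bound construction, not code from this paper) achieves TV error on the order of $\sqrt{k/n} + k/(\epsilon n)$, which matches the optimal sample complexity the paper establishes via its lower bounds. A minimal NumPy sketch; all names here are our own:

```python
import numpy as np

def private_histogram(samples, k, epsilon, rng=None):
    """eps-DP estimate of a distribution over {0, ..., k-1}.

    Standard Laplace-mechanism sketch: swapping one sample changes the
    count vector by at most 2 in l1, so adding Laplace(2/eps) noise to
    each bin satisfies eps-DP; clipping/renormalizing is post-processing.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.bincount(samples, minlength=k).astype(float)
    noisy = np.clip(counts + rng.laplace(scale=2.0 / epsilon, size=k), 0.0, None)
    total = noisy.sum()
    return noisy / total if total > 0 else np.full(k, 1.0 / k)

# Toy check: TV error should scale roughly like sqrt(k/n) + k/(eps * n).
rng = np.random.default_rng(0)
k, n, eps = 10, 100_000, 1.0
p = rng.dirichlet(np.ones(k))
p_hat = private_histogram(rng.choice(k, size=n, p=p), k, eps, rng)
print("TV error:", 0.5 * np.abs(p_hat - p).sum())
```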
Related papers
- Restricted Spectral Gap Decomposition for Simulated Tempering Targeting Mixture Distributions [3.7577421880330535]
We consider simulated tempering combined with an arbitrary local Markov chain Monte Carlo sampler.
We present a new decomposition theorem that provides a lower bound on the restricted spectral gap of the algorithm for sampling from mixture distributions.
arXiv Detail & Related papers (2025-05-21T03:28:55Z) - Sublinear Algorithms for Wasserstein and Total Variation Distances: Applications to Fairness and Privacy Auditing [7.81603404636933]
We propose a generic framework to learn the probability and cumulative distribution functions (PDFs and CDFs) of a sub-Weibull distribution.
We compute mergeable summaries of distributions from the stream of samples while requiring only sublinear space.
Our algorithms significantly improve on existing methods for distance estimation, which incur super-linear time and linear space complexities.
arXiv Detail & Related papers (2025-03-10T18:57:48Z) - Differentially Private Multi-Sampling from Distributions [4.292685318253575]
We study the sample complexity of DP single-sampling, i.e., the minimum number of samples needed to perform this task.
We define two variants of multi-sampling, where the goal is to privately approximate $m>1$ samples.
arXiv Detail & Related papers (2024-12-13T19:14:05Z) - Theory on Score-Mismatched Diffusion Models and Zero-Shot Conditional Samplers [49.97755400231656]
We present the first performance guarantee with explicit dimensional dependencies for general score-mismatched diffusion samplers.
We show that score mismatches result in a distributional bias between the target and sampling distributions, proportional to the accumulated mismatch between the target and training distributions.
This result can be directly applied to zero-shot conditional samplers for any conditional model, irrespective of measurement noise.
arXiv Detail & Related papers (2024-10-17T16:42:12Z) - Deep Generative Sampling in the Dual Divergence Space: A Data-efficient & Interpretative Approach for Generative AI [29.13807697733638]
We build on the remarkable achievements in generative sampling of natural images.
We propose an innovative challenge, potentially overly ambitious, which involves generating samples that resemble images.
The statistical challenge lies in the small sample size, sometimes consisting of a few hundred subjects.
arXiv Detail & Related papers (2024-04-10T22:35:06Z) - Sampling Multimodal Distributions with the Vanilla Score: Benefits of Data-Based Initialization [19.19974210314107]
Hyv"arinen proposed vanilla score matching as a way to learn distributions from data.
We prove that the Langevin diffusion with early stopping, at the empirical distribution, and run on a score function estimated from data successfully generates natural multimodal distributions.
arXiv Detail & Related papers (2023-10-03T03:06:59Z) - Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed Sampling [17.415391025051434]
We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Itô diffusions associated with weighted Poincaré inequalities.
Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is $\epsilon$-close to the target distribution in the Wasserstein-2 metric.
arXiv Detail & Related papers (2023-03-01T15:16:03Z) - On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z) - A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z) - From Optimality to Robustness: Dirichlet Sampling Strategies in Stochastic Bandits [0.0]
We study a generic Dirichlet Sampling (DS) algorithm, based on pairwise comparisons of empirical indices computed with re-sampling of the arms' observations.
We show that different variants of this strategy achieve provably optimal regret guarantees when the distributions are bounded and logarithmic regret for semi-bounded distributions with a mild quantile condition.
arXiv Detail & Related papers (2021-11-18T14:34:21Z) - Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Łojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z) - Partition-Guided GANs [63.980473635585234]
We design a partitioner that breaks the space into smaller regions, each having a simpler distribution, and train a different generator for each partition.
This is done in an unsupervised manner without requiring any labels.
Experimental results on various standard benchmarks show that the proposed unsupervised model outperforms several recent methods.
arXiv Detail & Related papers (2021-04-02T00:06:53Z) - The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling called the Simulator.
We prove the first instance-based lower bounds for the top-k problem which incorporate the appropriate log-factors.
Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first practical algorithm of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.