The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation
- URL: http://arxiv.org/abs/2009.08000v3
- Date: Thu, 3 Dec 2020 19:05:20 GMT
- Title: The Limits of Pan Privacy and Shuffle Privacy for Learning and Estimation
- Authors: Albert Cheu and Jonathan Ullman
- Abstract summary: We show that for a variety of high-dimensional learning and estimation problems, the shuffle model and the pan-private model incur an exponential price in sample complexity relative to the central model.
Our work gives the first non-trivial lower bounds for these problems for both the pan-private model and the general multi-message shuffle model.
- Score: 3.2942333712377083
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There has been a recent wave of interest in intermediate trust models for
differential privacy that eliminate the need for a fully trusted central data
collector, but overcome the limitations of local differential privacy. This
interest has led to the introduction of the shuffle model (Cheu et al.,
EUROCRYPT 2019; Erlingsson et al., SODA 2019) and revisiting the pan-private
model (Dwork et al., ITCS 2010). The message of this line of work is that, for
a variety of low-dimensional problems -- such as counts, means, and histograms
-- these intermediate models offer nearly as much power as central differential
privacy. However, there has been considerably less success using these models
for high-dimensional learning and estimation problems. In this work, we show
that, for a variety of high-dimensional learning and estimation problems, both
the shuffle model and the pan-private model inherently incur an exponential
price in sample complexity relative to the central model. For example, we show
that private agnostic learning of parity functions over $d$ bits requires
$\Omega(2^{d/2})$ samples in these models, and privately selecting the most
common attribute from a set of $d$ choices requires $\Omega(d^{1/2})$ samples,
both of which are exponential separations from the central model. Our work
gives the first non-trivial lower bounds for these problems for both the
pan-private model and the general multi-message shuffle model.
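For intuition, the shuffle model interposes an anonymizing shuffler between users' local randomizers and the analyzer. The sketch below is a minimal illustration of that pipeline, not the paper's construction: a shuffled binary randomized-response count, where `local_randomizer`, `shuffled_count`, and the local parameter `eps0` are illustrative names (the overall shuffle-DP guarantee would follow from privacy amplification by shuffling, which is not computed here).

```python
import math
import random

def local_randomizer(bit: int, eps0: float) -> int:
    # Binary randomized response: report truthfully w.p. e^eps0 / (e^eps0 + 1).
    p = math.exp(eps0) / (math.exp(eps0) + 1.0)
    return bit if random.random() < p else 1 - bit

def shuffled_count(bits, eps0: float) -> float:
    # Each user randomizes locally; a trusted shuffler permutes the messages,
    # so the analyzer sees only an anonymized multiset of reports.
    messages = [local_randomizer(b, eps0) for b in bits]
    random.shuffle(messages)
    p = math.exp(eps0) / (math.exp(eps0) + 1.0)
    n = len(bits)
    # Debias: E[sum(messages)] = (1 - p) * n + (2p - 1) * true_count.
    return (sum(messages) - (1 - p) * n) / (2 * p - 1)

data = [int(random.random() < 0.3) for _ in range(10_000)]
print(f"estimated count: {shuffled_count(data, eps0=1.0):.0f}")
```

Low-dimensional statistics like this count are exactly where the shuffle model performs well; the paper's lower bounds show that high-dimensional tasks such as parity learning and selection are where it provably cannot match the central model.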
Related papers
- Beyond Closure Models: Learning Chaotic-Systems via Physics-Informed Neural Operators [78.64101336150419]
Predicting the long-term behavior of chaotic systems is crucial for various applications such as climate modeling.
An alternative to such a fully resolved simulation is to use a coarse grid and then correct its errors with a temporal model.
We propose an alternative end-to-end learning approach using a physics-informed neural operator (PINO) that overcomes this limitation.
arXiv Detail & Related papers (2024-08-09T17:05:45Z)
- Efficient Verification-Based Face Identification [50.616875565173274]
We study the problem of performing face verification with an efficient neural model $f$.
Our model yields a substantially smaller $f$, requiring only 23k parameters and 5M floating-point operations (FLOPs).
We use six face verification datasets to demonstrate that our method is on par or better than state-of-the-art models.
arXiv Detail & Related papers (2023-12-20T18:08:02Z)
- Differentially Private Reward Estimation with Preference Feedback [15.943664678210146]
Learning from preference-based feedback has recently gained considerable traction as a promising approach to align generative models with human interests.
An adversarial attack at any step of this feedback pipeline might reveal private and sensitive information about human labelers.
We focus on the problem of reward estimation from preference-based feedback while protecting the privacy of each individual labeler.
arXiv Detail & Related papers (2023-10-30T16:58:30Z)
- Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
Since the training data behind each model is typically unavailable, this creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
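As a concrete (if naive) illustration of merging in parameter space, one can element-wise average the checkpoints of identically shaped models; the paper's own merging method is more refined than this simple average, so treat `merge_state_dicts` as a hypothetical baseline.

```python
import torch

def merge_state_dicts(state_dicts, weights=None):
    # Naive dataless fusion: a (weighted) element-wise average of the
    # parameter tensors. Assumes all models share one architecture.
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    return {
        name: sum(w * sd[name].float() for w, sd in zip(weights, state_dicts))
        for name in state_dicts[0]
    }

# Usage: fuse two fine-tuned checkpoints without touching any training data.
# merged = merge_state_dicts([model_a.state_dict(), model_b.state_dict()])
# model.load_state_dict(merged)
```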
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
- Pre-trained Perceptual Features Improve Differentially Private Image Generation [8.659595986100738]
Training even moderately-sized generative models with differentially private stochastic gradient descent (DP-SGD) is difficult.
We advocate building off a good, relevant representation on an informative public dataset, then learning to model the private data with that representation.
Our work introduces simple yet powerful foundations for reducing the gap between private and non-private deep generative models.
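For reference, DP-SGD works by clipping each per-example gradient to a norm bound and adding Gaussian noise calibrated to that bound before averaging. The sketch below shows one such update under assumed names: `grad(params, x, y)` is a hypothetical per-example gradient function, and the noise multiplier `sigma` must be calibrated to the desired privacy budget.

```python
import numpy as np

def dp_sgd_step(params, batch, grad, lr=0.1, clip=1.0, sigma=1.0, rng=None):
    # One DP-SGD update: clip each per-example gradient to L2 norm <= clip,
    # add Gaussian noise scaled by sigma * clip, then average and step.
    rng = rng or np.random.default_rng()
    total = np.zeros_like(params)
    for x, y in batch:
        g = grad(params, x, y)
        total += g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
    noisy_mean = (total + rng.normal(0.0, sigma * clip, size=params.shape)) / len(batch)
    return params - lr * noisy_mean
```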
arXiv Detail & Related papers (2022-05-25T16:46:01Z)
- Tight Differential Privacy Guarantees for the Shuffle Model with $k$-Randomized Response [6.260747047974035]
Most differentially private (DP) algorithms assume either a trusted third party that adds noise to queries over the dataset, or a local model in which users perturb their own data.
The recently proposed shuffle model is an intermediate framework between the central and the local paradigms.
We perform experiments on both synthetic and real data to compare the privacy-utility trade-off of the shuffle model with that of the central model.
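For concreteness, $k$-randomized response reports the true value with probability $e^{\epsilon_0}/(e^{\epsilon_0}+k-1)$ and otherwise one of the remaining $k-1$ values uniformly at random; in the shuffle model these reports are then permuted. A minimal sketch (illustrative names, not the paper's code):

```python
import math
import random

def k_randomized_response(value: int, k: int, eps0: float) -> int:
    # Report the true value w.p. e^eps0 / (e^eps0 + k - 1); otherwise
    # report one of the other k - 1 values uniformly at random.
    p_truth = math.exp(eps0) / (math.exp(eps0) + k - 1)
    if random.random() < p_truth:
        return value
    other = random.randrange(k - 1)
    return other if other < value else other + 1  # skip the true value

# Shuffle-model pipeline: randomize locally, then permute away identities.
reports = [k_randomized_response(v, k=10, eps0=1.0) for v in [3, 3, 7, 3, 9]]
random.shuffle(reports)
```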
arXiv Detail & Related papers (2022-05-18T10:44:28Z)
- Shuffle Private Linear Contextual Bandits [9.51828574518325]
We propose a general framework for linear contextual bandits under the shuffle algorithmic trust model.
We prove that both instantiations lead to regret guarantees that significantly improve on those of the local model.
We also verify this regret behavior with simulations on synthetic data.
arXiv Detail & Related papers (2022-02-11T11:53:22Z)
- On the Intrinsic Differential Privacy of Bagging [69.70602220716718]
Our experimental results demonstrate that Bagging achieves significantly higher accuracies than state-of-the-art differentially private machine learning methods with the same privacy budgets.
arXiv Detail & Related papers (2020-08-22T14:17:55Z)
- Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
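The "plain model-based planning" recipe can be sketched directly: draw $N$ samples from the generative model for every state-action pair, form the empirical MDP, and run value iteration on it. The code below is an illustrative rendering (names like `sample` and the reward matrix `R` are assumptions), not the paper's implementation:

```python
import numpy as np

def plan_from_generative_model(sample, S, A, R, gamma=0.9, N=1000, iters=500):
    # Empirical model: N generative-model draws per (s, a) give P_hat(s'|s, a).
    P_hat = np.zeros((S, A, S))
    for s in range(S):
        for a in range(A):
            for _ in range(N):
                P_hat[s, a, sample(s, a)] += 1.0 / N
    # Value iteration on the empirical MDP (R has shape (S, A)).
    V = np.zeros(S)
    for _ in range(iters):
        V = np.max(R + gamma * P_hat @ V, axis=1)  # Bellman optimality backup
    return np.argmax(R + gamma * P_hat @ V, axis=1)  # greedy policy from V
```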
arXiv Detail & Related papers (2020-05-26T17:53:18Z)
- Input Perturbation: A New Paradigm between Central and Local Differential Privacy [15.943736378291154]
We study the input perturbation method in differentially private empirical risk minimization (DP-ERM).
We achieve $(\epsilon,\delta)$-differential privacy on the final model, along with some kind of privacy on the original data.
Our method achieves almost the same (or even better) performance as some of the best previous central methods, with stronger protections on privacy.
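Input perturbation, as the name suggests, injects noise into the training data itself and then runs ordinary (non-private) ERM on the perturbed data. The sketch below uses plain least squares and a placeholder noise scale `sigma`; the paper's actual noise calibration to achieve $(\epsilon,\delta)$-DP depends on the loss and is not reproduced here.

```python
import numpy as np

def input_perturbation_erm(X, y, sigma, rng=None):
    # Perturb inputs and labels once, then solve ordinary least squares on
    # the noisy data; the fitted model is released without further noise.
    rng = rng or np.random.default_rng()
    X_noisy = X + rng.normal(0.0, sigma, size=X.shape)
    y_noisy = y + rng.normal(0.0, sigma, size=y.shape)
    theta, *_ = np.linalg.lstsq(X_noisy, y_noisy, rcond=None)
    return theta
```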
arXiv Detail & Related papers (2020-02-20T05:20:02Z)
- On the Difference Between the Information Bottleneck and the Deep Information Bottleneck [81.89141311906552]
We revisit the Deep Variational Information Bottleneck and the assumptions needed for its derivation, which requires two Markov chain conditions to hold.
We show how to circumvent this limitation by optimising a lower bound for $I(T;Y)$ for which only the latter Markov chain has to be satisfied.
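For context, the information bottleneck trades compression of $X$ against prediction of $Y$, and the deep variational version optimizes a tractable bound on the prediction term; in standard notation (our rendering, not quoted from the paper):

```latex
% Information bottleneck Lagrangian over the encoder p(t|x):
\max_{p(t \mid x)} \; I(T;Y) - \beta \, I(X;T)
% Variational lower bound on the prediction term, valid for any decoder q(y|t):
I(T;Y) = H(Y) - H(Y \mid T)
  \;\ge\; H(Y) + \mathbb{E}_{p(x,y)\,p(t \mid x)}\!\left[\log q(y \mid t)\right]
```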
arXiv Detail & Related papers (2019-12-31T18:31:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.