Differentially-Private Bayes Consistency
- URL: http://arxiv.org/abs/2212.04216v1
- Date: Thu, 8 Dec 2022 11:57:30 GMT
- Title: Differentially-Private Bayes Consistency
- Authors: Olivier Bousquet, Haim Kaplan, Aryeh Kontorovich, Yishay Mansour, Shay
Moran, Menachem Sadigurschi, Uri Stemmer
- Abstract summary: We construct a Bayes consistent learning rule that satisfies differential privacy (DP).
We prove that any VC class can be privately learned in a semi-supervised setting with a near-optimal sample complexity.
- Score: 70.92545332158217
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We construct a universally Bayes consistent learning rule that satisfies
differential privacy (DP). We first handle the setting of binary classification
and then extend our rule to the more general setting of density estimation
(with respect to the total variation metric). The existence of a universally
consistent DP learner reveals a stark difference with the distribution-free PAC
model. Indeed, in the latter, DP learning is extremely limited: even
one-dimensional linear classifiers are not privately learnable in this
stringent model. Our result thus demonstrates that by allowing the learning
rate to depend on the target distribution, one can circumvent the
above-mentioned impossibility result and in fact, learn \emph{arbitrary}
distributions by a single DP algorithm. As an application, we prove that any VC
class can be privately learned in a semi-supervised setting with a near-optimal
\emph{labeled} sample complexity of $\tilde{O}(d/\varepsilon)$ labeled examples
(and with an unlabeled sample complexity that can depend on the target
distribution).
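For readers unfamiliar with the DP constraint the abstract refers to, here is a minimal sketch of the classic Laplace mechanism in Python. It illustrates the kind of privacy guarantee involved; it is not the paper's learning rule, and all names are illustrative:

```python
import random

def laplace_noise(scale, rng):
    # Laplace(0, b) equals the difference of two independent Exp(1/b) draws.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def laplace_mechanism(true_value, sensitivity, epsilon, rng):
    """Release a real-valued statistic with epsilon-DP by adding
    Laplace(0, sensitivity / epsilon) noise."""
    return true_value + laplace_noise(sensitivity / epsilon, rng)

rng = random.Random(0)
# Privately release a count over n records: changing one record
# shifts the count by at most 1, so sensitivity = 1.
private_count = laplace_mechanism(100.0, sensitivity=1.0, epsilon=0.5, rng=rng)
```

The noise scale grows as epsilon shrinks, which is the usual price of a stronger privacy guarantee.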
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Collaborative Learning with Different Labeling Functions [7.228285747845779]
We study a variant of Collaborative PAC Learning, in which we aim to learn an accurate classifier for each of the $n$ data distributions.
We show that, when the data distributions satisfy a weaker realizability assumption, sample-efficient learning is still feasible.
arXiv Detail & Related papers (2024-02-16T04:32:22Z)
- Out-Of-Domain Unlabeled Data Improves Generalization [0.7589678255312519]
We propose a novel framework for incorporating unlabeled data into semi-supervised classification problems.
We show that unlabeled samples can be harnessed to narrow the generalization gap.
We validate our claims through experiments conducted on a variety of synthetic and real-world datasets.
arXiv Detail & Related papers (2023-09-29T02:00:03Z)
- Learning versus Refutation in Noninteractive Local Differential Privacy [133.80204506727526]
We study two basic statistical tasks in non-interactive local differential privacy (LDP): learning and refutation.
Our main result is a complete characterization of the sample complexity of PAC learning for non-interactive LDP protocols.
arXiv Detail & Related papers (2022-10-26T03:19:24Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
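The summary above does not spell out the update rule; as a rough sketch, a DP-SGD-style step typically clips each per-example gradient and adds calibrated Gaussian noise. The function name and constants below are illustrative, not taken from the paper:

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One DP-SGD-style update: clip each per-example gradient to clip_norm,
    average, then add Gaussian noise scaled to the clip bound."""
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Scale the gradient down only if its norm exceeds the clip bound.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    mean_grad = np.mean(clipped, axis=0)
    sigma = noise_multiplier * clip_norm / len(per_example_grads)
    noise = rng.normal(0.0, sigma, size=mean_grad.shape)
    return params - lr * (mean_grad + noise)
```

Normalization-based variants (DP-NSGD) replace the clip with a rescaling of every gradient, which is part of why they can be easier to tune.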
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
- What You See is What You Get: Distributional Generalization for Algorithm Design in Deep Learning [12.215964287323876]
We investigate and leverage a connection between Differential Privacy (DP) and the notion of Distributional Generalization (DG).
We introduce new conceptual tools for designing deep-learning methods that bypass "pathologies" of standard stochastic gradient descent (SGD).
arXiv Detail & Related papers (2022-04-07T05:41:40Z)
- Self-Supervised Learning by Estimating Twin Class Distributions [26.7828253129684]
We present TWIST, a novel self-supervised representation learning method by classifying large-scale unlabeled datasets in an end-to-end way.
We employ a siamese network terminated by a softmax operation to produce twin class distributions of two augmented images.
Specifically, we minimize the entropy of each sample's distribution to sharpen its class prediction, and maximize the entropy of the mean distribution to keep the predictions of different samples diverse.
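A hedged sketch of that entropy trade-off is below; the real TWIST objective differs in its exact terms, and the agreement term here is a simple squared distance chosen purely for illustration:

```python
import numpy as np

def twist_style_objective(p1, p2, eps=1e-12):
    """Illustrative loss over two batches of twin class distributions
    (rows sum to 1): encourage the two views to agree, make each row
    confident (low per-sample entropy), and keep the batch-mean
    distribution spread out (high entropy)."""
    def entropy(p):
        return -np.sum(p * np.log(p + eps), axis=-1)
    consistency = np.mean(np.sum((p1 - p2) ** 2, axis=-1))   # minimize: views agree
    sharpness = np.mean(entropy(p1)) + np.mean(entropy(p2))  # minimize: confident rows
    diversity = entropy(p1.mean(axis=0)) + entropy(p2.mean(axis=0))  # maximize
    return consistency + sharpness - diversity
```

Lowering this objective pushes toward confident, diverse predictions: collapsed solutions (all samples assigned to one class) score worse because the mean-distribution entropy term vanishes.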
arXiv Detail & Related papers (2021-10-14T14:39:39Z)
- On the Practicality of Differential Privacy in Federated Learning by Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for its privacy protection when training machine learning models among distributed clients collaboratively.
Recent studies have pointed out that the naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
arXiv Detail & Related papers (2021-01-11T19:43:12Z)
- Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.