No Free Lunch in "Privacy for Free: How does Dataset Condensation Help
Privacy"
- URL: http://arxiv.org/abs/2209.14987v1
- Date: Thu, 29 Sep 2022 17:50:23 GMT
- Title: No Free Lunch in "Privacy for Free: How does Dataset Condensation Help
Privacy"
- Authors: Nicholas Carlini and Vitaly Feldman and Milad Nasr
- Abstract summary: New methods designed to preserve data privacy require careful scrutiny.
Failure to preserve privacy is hard to detect, and yet can lead to catastrophic results when a system implementing a ``privacy-preserving'' method is attacked.
- Score: 75.98836424725437
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: New methods designed to preserve data privacy require careful scrutiny.
Failure to preserve privacy is hard to detect, and yet can lead to catastrophic
results when a system implementing a ``privacy-preserving'' method is attacked.
A recent work selected for an Outstanding Paper Award at ICML 2022 (Dong et
al., 2022) claims that dataset condensation (DC) significantly improves data
privacy when training machine learning models. This claim is supported by
theoretical analysis of a specific dataset condensation technique and an
empirical evaluation of resistance to some existing membership inference
attacks.
In this note we examine the claims in the work of Dong et al. (2022) and
describe major flaws in the empirical evaluation of the method and its
theoretical analysis. These flaws imply that their work does not provide
statistically significant evidence that DC improves the privacy of training ML
models over a naive baseline. Moreover, previously published results show that
DP-SGD, the standard approach to privacy-preserving ML, simultaneously gives
better accuracy and achieves a (provably) lower membership attack success rate.
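For context, the membership inference attacks discussed above, in their simplest loss-based form, predict that a sample was a training member when the model's loss on it falls below a threshold. The sketch below is a generic illustration of that attack and its success-rate metric, not code from the papers above; the loss distributions and the threshold are invented for the example.

```python
import numpy as np

def loss_based_mia(per_example_losses: np.ndarray, threshold: float) -> np.ndarray:
    """Predict membership: 1 if the model's loss on the example is below the
    threshold (low loss suggests the example was trained on)."""
    return (per_example_losses < threshold).astype(int)

def attack_success_rate(train_losses, heldout_losses, threshold):
    """Balanced accuracy of the attack: members (train) should score 1,
    non-members (held-out) should score 0."""
    tpr = loss_based_mia(train_losses, threshold).mean()
    tnr = 1.0 - loss_based_mia(heldout_losses, threshold).mean()
    return 0.5 * (tpr + tnr)

# Illustrative numbers only: losses for 1000 members and 1000 non-members.
rng = np.random.default_rng(0)
train_losses = rng.exponential(scale=0.2, size=1000)    # members: lower loss
heldout_losses = rng.exponential(scale=0.6, size=1000)  # non-members: higher loss
print(attack_success_rate(train_losses, heldout_losses, threshold=0.3))
```

Differential privacy upper-bounds the advantage of any such attack, which is why the note above can describe DP-SGD's membership attack success rate as provably lower.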
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over a 20% improvement in forgetting error compared to the state of the art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - The Data Minimization Principle in Machine Learning [61.17813282782266]
Data minimization aims to reduce the amount of data collected, processed or retained.
It has been endorsed by various global data protection regulations.
However, its practical implementation remains a challenge due to the lack of a rigorous formulation.
arXiv Detail & Related papers (2024-05-29T19:40:27Z) - Conditional Density Estimations from Privacy-Protected Data [0.0]
We propose simulation-based inference methods for privacy-protected datasets.
We illustrate our methods on discrete time-series data under an infectious disease model and with ordinary linear regression models.
arXiv Detail & Related papers (2023-10-19T14:34:17Z) - A Cautionary Tale: On the Role of Reference Data in Empirical Privacy
Defenses [12.34501903200183]
We propose a baseline defense that enables the utility-privacy tradeoff with respect to both training and reference data to be easily understood.
Our experiments show that, surprisingly, it outperforms the most well-studied current state-of-the-art empirical privacy defenses.
arXiv Detail & Related papers (2023-10-18T17:07:07Z) - Re-thinking Data Availablity Attacks Against Deep Neural Networks [53.64624167867274]
In this paper, we re-examine the concept of unlearnable examples and discern that the existing robust error-minimizing noise presents an inaccurate optimization objective.
We introduce a novel optimization paradigm that yields improved protection results with reduced computational time requirements.
arXiv Detail & Related papers (2023-05-18T04:03:51Z) - A Randomized Approach for Tight Privacy Accounting [63.67296945525791]
We propose a new differential privacy paradigm called estimate-verify-release (EVR).
The EVR paradigm first estimates the privacy parameter of a mechanism, then verifies that the mechanism meets this guarantee, and finally releases the query output.
Our empirical evaluation shows that the newly proposed EVR paradigm improves the utility-privacy tradeoff for privacy-preserving machine learning.
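The three-stage flow can be sketched as follows; the estimator and verifier below are trivial placeholders standing in for the paper's actual privacy-parameter estimator and probabilistic verification procedure.

```python
import numpy as np

def estimate_verify_release(mechanism, query, estimate_epsilon, verify):
    """Sketch of an estimate-verify-release (EVR) pipeline: guess the
    mechanism's privacy parameter, check the guess, and only release the
    output if the check passes."""
    eps_hat = estimate_epsilon(mechanism)      # 1. estimate privacy parameter
    if not verify(mechanism, eps_hat):         # 2. verify the estimate holds
        raise RuntimeError("privacy estimate failed verification; not releasing")
    return mechanism(query)                    # 3. release the query output

# Toy instantiation: a Laplace mechanism whose epsilon we "estimate" from its
# known noise scale, with a trivially passing placeholder verifier.
rng = np.random.default_rng(0)
laplace = lambda q: q + rng.laplace(scale=1.0)   # sensitivity-1 query
print(estimate_verify_release(
    mechanism=laplace,
    query=42.0,
    estimate_epsilon=lambda m: 1.0,              # eps = sensitivity / scale
    verify=lambda m, eps: eps <= 1.0,            # placeholder verifier
))
```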
arXiv Detail & Related papers (2023-04-17T00:38:01Z) - Membership Inference Attacks against Synthetic Data through Overfitting
Detection [84.02632160692995]
We argue for a realistic membership inference attack (MIA) setting that assumes the attacker has some knowledge of the underlying data distribution.
We propose DOMIAS, a density-based MIA model that aims to infer membership by targeting local overfitting of the generative model.
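A density-ratio membership score of this general shape can be sketched as below; the kernel density estimators and the threshold are illustrative stand-ins for DOMIAS's actual density models, not the paper's implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde

def density_ratio_mia(synth_data, reference_data, query_points, threshold=1.0):
    """Score each query point by p_synthetic(x) / p_reference(x). A high ratio
    means the generative model is locally overfit around x, which is taken as
    evidence that x was a training member."""
    p_synth = gaussian_kde(synth_data.T)    # density of the synthetic data
    p_ref = gaussian_kde(reference_data.T)  # density of the real distribution
    ratio = p_synth(query_points.T) / p_ref(query_points.T)
    return ratio > threshold

# Illustrative 2-D data only.
rng = np.random.default_rng(1)
synth = rng.normal(0.0, 1.0, size=(500, 2))
reference = rng.normal(0.0, 1.2, size=(500, 2))
queries = rng.normal(0.0, 1.0, size=(10, 2))
print(density_ratio_mia(synth, reference, queries))
```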
arXiv Detail & Related papers (2023-02-24T11:27:39Z) - Privacy in Practice: Private COVID-19 Detection in X-Ray Images
(Extended Version) [3.750713193320627]
We create machine learning models that satisfy Differential Privacy (DP).
We evaluate the utility-privacy trade-off more extensively and over stricter privacy budgets.
Our results indicate that needed privacy levels might differ based on the task-dependent practical threat from MIAs.
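Models satisfying DP, as in this paper, are typically trained with DP-SGD (per-example gradient clipping plus calibrated Gaussian noise), the same baseline the main note above cites. A minimal sketch using the Opacus library, with a toy model, toy data, and arbitrary hyperparameters in place of the paper's X-ray setup:

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Toy classifier and data standing in for the paper's X-ray models.
model = nn.Sequential(nn.Flatten(), nn.Linear(64, 2))
dataset = TensorDataset(torch.randn(256, 64), torch.randint(0, 2, (256,)))
loader = DataLoader(dataset, batch_size=32)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# DP-SGD: Opacus wraps the model, optimizer, and loader so each step clips
# per-example gradients and adds Gaussian noise.
privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    noise_multiplier=1.0,  # noise scale: higher = more privacy, less utility
    max_grad_norm=1.0,     # per-example gradient clipping bound
)

criterion = nn.CrossEntropyLoss()
for x, y in loader:  # one epoch of private training
    optimizer.zero_grad()
    criterion(model(x), y).backward()
    optimizer.step()

# Privacy budget spent so far, for a chosen delta.
print(f"epsilon spent: {privacy_engine.get_epsilon(delta=1e-5):.2f}")
```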
arXiv Detail & Related papers (2022-11-21T13:22:29Z) - Privacy for Free: How does Dataset Condensation Help Privacy? [21.418263507735684]
We identify dataset condensation (DC) as a better solution than traditional data generators for private data generation.
We empirically validate the visual privacy and membership privacy of DC-synthesized data by launching both the loss-based and the state-of-the-art likelihood-based membership inference attacks.
arXiv Detail & Related papers (2022-06-01T05:39:57Z) - Stratified cross-validation for unbiased and privacy-preserving
federated learning [0.0]
We focus on the recurrent problem of duplicated records that, if not handled properly, may cause over-optimistic estimates of a model's performance.
We introduce and discuss stratified cross-validation, a validation methodology that leverages stratification techniques to prevent data leakage in federated learning settings.
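One minimal way to keep duplicated records from leaking across folds is to group exact duplicates and split by group. The sketch below uses scikit-learn's GroupKFold to illustrate that idea only; it is not the stratified procedure the paper introduces.

```python
import numpy as np
from sklearn.model_selection import GroupKFold

def dedup_groups(records: np.ndarray) -> np.ndarray:
    """Map each record to a group id so that exact duplicates share a group
    and therefore always land in the same cross-validation fold."""
    _, group_ids = np.unique(records, axis=0, return_inverse=True)
    return group_ids

# Small synthetic data with a limited value range, so duplicates are likely.
rng = np.random.default_rng(2)
X = rng.integers(0, 3, size=(20, 4)).astype(float)
y = rng.integers(0, 2, size=20)
groups = dedup_groups(X)

# GroupKFold guarantees no group (hence no duplicate) spans both splits.
for train_idx, val_idx in GroupKFold(n_splits=3).split(X, y, groups=groups):
    assert not set(groups[train_idx]) & set(groups[val_idx])
```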
arXiv Detail & Related papers (2020-01-22T15:49:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality or accuracy of this information and is not responsible for any consequences of its use.