A Statistical Viewpoint on Differential Privacy: Hypothesis Testing, Representation and Blackwell's Theorem
- URL: http://arxiv.org/abs/2409.09558v2
- Date: Tue, 29 Oct 2024 00:30:58 GMT
- Title: A Statistical Viewpoint on Differential Privacy: Hypothesis Testing, Representation and Blackwell's Theorem
- Authors: Weijie J. Su
- Abstract summary: We argue that differential privacy can be considered a \textit{pure} statistical concept.
$f$-differential privacy is a unified framework for analyzing privacy bounds in data analysis and machine learning.
- Score: 30.365274034429508
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differential privacy is widely considered the formal privacy for privacy-preserving data analysis due to its robust and rigorous guarantees, with increasingly broad adoption in public services, academia, and industry. Despite originating in the cryptographic context, in this review paper we argue that, fundamentally, differential privacy can be considered a \textit{pure} statistical concept. By leveraging David Blackwell's informativeness theorem, our focus is to demonstrate based on prior work that all definitions of differential privacy can be formally motivated from a hypothesis testing perspective, thereby showing that hypothesis testing is not merely convenient but also the right language for reasoning about differential privacy. This insight leads to the definition of $f$-differential privacy, which extends other differential privacy definitions through a representation theorem. We review techniques that render $f$-differential privacy a unified framework for analyzing privacy bounds in data analysis and machine learning. Applications of this differential privacy definition to private deep learning, private convex optimization, shuffled mechanisms, and U.S.\ Census data are discussed to highlight the benefits of analyzing privacy bounds under this framework compared to existing alternatives.
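To make the hypothesis testing formulation concrete, the sketch below evaluates two standard trade-off functions from the $f$-DP literature: the Gaussian DP curve $G_\mu(\alpha) = \Phi(\Phi^{-1}(1-\alpha) - \mu)$ and the curve $f_{\epsilon,\delta}(\alpha) = \max\{0,\ 1-\delta-e^{\epsilon}\alpha,\ e^{-\epsilon}(1-\delta-\alpha)\}$ that characterizes classical $(\epsilon,\delta)$-DP. A mechanism is $f$-DP when, for every pair of neighboring datasets, any test that tries to tell the two output distributions apart with type I error $\alpha$ must incur type II error at least $f(\alpha)$. This is an illustrative sketch, not code from the paper; it assumes only NumPy and SciPy, and the function names are made up for this example.

```python
# Minimal sketch of trade-off functions in the f-DP framework.
# Assumptions: NumPy and SciPy only; function names are illustrative.
import numpy as np
from scipy.stats import norm


def gaussian_dp_tradeoff(alpha, mu):
    """G_mu(alpha): smallest type II error of any test of N(0,1) vs N(mu,1)
    at type I error level alpha (the mu-Gaussian DP trade-off curve)."""
    return norm.cdf(norm.ppf(1.0 - alpha) - mu)


def eps_delta_tradeoff(alpha, eps, delta):
    """f_{eps,delta}(alpha): the trade-off curve whose lower bound is
    equivalent to classical (eps, delta)-differential privacy."""
    return np.maximum.reduce([
        np.zeros_like(alpha),
        1.0 - delta - np.exp(eps) * alpha,
        np.exp(-eps) * (1.0 - delta - alpha),
    ])


if __name__ == "__main__":
    alpha = np.linspace(0.0, 1.0, 6)
    # Larger curves mean the adversary's hypothesis test is less powerful,
    # i.e. stronger privacy; the diagonal 1 - alpha is perfect privacy.
    print("alpha:             ", alpha)
    print("1-GDP trade-off:   ", gaussian_dp_tradeoff(alpha, mu=1.0).round(3))
    print("(1,1e-5)-DP curve: ", eps_delta_tradeoff(alpha, eps=1.0, delta=1e-5).round(3))
```

Comparing such curves against one another (and against the diagonal $1-\alpha$) is how the framework puts $(\epsilon,\delta)$-DP, Gaussian DP, and their compositions under a single hypothesis testing lens.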
Related papers
- Differential Privacy Overview and Fundamental Techniques [63.0409690498569]
This chapter is meant to be part of the book "Differential Privacy in Artificial Intelligence: From Theory to Practice".
It starts by illustrating various attempts to protect data privacy, emphasizing where and why they failed.
It then defines the key actors, tasks, and scopes that make up the domain of privacy-preserving data analysis.
arXiv Detail & Related papers (2024-11-07T13:52:11Z)
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on data and allows for defining non-sensitive spatio-temporal regions without DP application or combining differential privacy with other privacy techniques within data samples.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
- Models Matter: Setting Accurate Privacy Expectations for Local and Central Differential Privacy [14.40391109414476]
We design and evaluate new explanations of differential privacy for the local and central models.
We find that consequences-focused explanations in the style of privacy nutrition labels are a promising approach for setting accurate privacy expectations.
arXiv Detail & Related papers (2024-08-16T01:21:57Z)
- Privacy Against Hypothesis-Testing Adversaries for Quantum Computing [14.095523601311374]
This paper presents a novel definition for data privacy in quantum computing based on quantum hypothesis testing.
The relationship between privacy against hypothesis-testing adversaries, defined in this paper, and quantum differential privacy is then examined.
arXiv Detail & Related papers (2023-02-24T02:10:27Z)
- Privacy and Bias Analysis of Disclosure Avoidance Systems [45.645473465606564]
Disclosure avoidance (DA) systems are used to safeguard the confidentiality of data while allowing it to be analyzed and disseminated for analytic purposes.
The paper proposes differentially private versions of widely used DA mechanisms and derives their privacy bounds.
The results show that, contrary to popular belief, traditional differential privacy techniques may be superior in terms of accuracy and fairness to the differentially private counterparts of widely used DA mechanisms.
arXiv Detail & Related papers (2023-01-28T13:58:25Z)
- Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano [83.5933307263932]
We study data reconstruction attacks for discrete data and analyze them under the framework of hypothesis testing.
We show that if the underlying private data takes values from a set of size $M$, then the target privacy parameter $\epsilon$ can be $O(\log M)$ before the adversary gains significant inferential power.
arXiv Detail & Related papers (2022-10-24T23:50:12Z)
- Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z)
- Debugging Differential Privacy: A Case Study for Privacy Auditing [60.87570714269048]
We show that auditing can also be used to find flaws in (purportedly) differentially private schemes.
In this case study, we audit a recent open source implementation of a differentially private deep learning algorithm and find, with 99.99999999% confidence, that the implementation does not satisfy the claimed differential privacy guarantee.
arXiv Detail & Related papers (2022-02-24T17:31:08Z)
- Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks trained with differential privacy can, in some settings, be even more vulnerable than their non-private counterparts.
We study how the main ingredients of differentially private neural networks training, such as gradient clipping and noise addition, affect the robustness of the model.
arXiv Detail & Related papers (2020-12-14T18:59:24Z)
- Auditing Differentially Private Machine Learning: How Private is Private SGD? [16.812900569416062]
We investigate whether Differentially Private SGD offers better privacy in practice than what is guaranteed by its state-of-the-art analysis.
We do so via novel data poisoning attacks, which we show correspond to realistic privacy attacks.
arXiv Detail & Related papers (2020-06-13T20:00:18Z)