Conciliating Privacy and Utility in Data Releases via Individual Differential Privacy and Microaggregation
- URL: http://arxiv.org/abs/2312.13712v1
- Date: Thu, 21 Dec 2023 10:23:18 GMT
- Title: Conciliating Privacy and Utility in Data Releases via Individual Differential Privacy and Microaggregation
- Authors: Jordi Soria-Comas, David Sánchez, Josep Domingo-Ferrer, Sergio Martínez, Luis Del Vasto-Terrientes,
- Abstract summary: $epsilon$-Differential privacy (DP) is a well-known privacy model that offers strong privacy guarantees.
We propose $epsilon$-individual differential privacy (iDP), which causes less data distortion while providing the same protection as DP to subjects.
We report on experiments that show how our approach can provide strong privacy (small $epsilon$) while yielding protected data that do not significantly degrade the accuracy of secondary data analysis.
- Score: 4.287502453001108
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: $\epsilon$-Differential privacy (DP) is a well-known privacy model that offers strong privacy guarantees. However, when applied to data releases, DP significantly deteriorates the analytical utility of the protected outcomes. To keep data utility at reasonable levels, practical applications of DP to data releases have used weak privacy parameters (large $\epsilon$), which dilute the privacy guarantees of DP. In this work, we tackle this issue by using an alternative formulation of the DP privacy guarantees, named $\epsilon$-individual differential privacy (iDP), which causes less data distortion while providing the same protection as DP to subjects. We enforce iDP in data releases by relying on attribute masking plus a pre-processing step based on data microaggregation. The goal of this step is to reduce the sensitivity to record changes, which determines the amount of noise required to enforce iDP (and DP). Specifically, we propose data microaggregation strategies designed for iDP whose sensitivities are significantly lower than those used in DP. As a result, we obtain iDP-protected data with significantly better utility than with DP. We report on experiments that show how our approach can provide strong privacy (small $\epsilon$) while yielding protected data that do not significantly degrade the accuracy of secondary data analysis.
Related papers
- Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy [55.357715095623554]
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties.
We propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification.
arXiv Detail & Related papers (2024-10-24T03:39:55Z) - Noise Variance Optimization in Differential Privacy: A Game-Theoretic Approach Through Per-Instance Differential Privacy [7.264378254137811]
Differential privacy (DP) can measure privacy loss by observing the changes in the distribution caused by the inclusion of individuals in the target dataset.
DP has been prominent in safeguarding datasets in machine learning in industry giants like Apple and Google.
We propose per-instance DP (pDP) as a constraint, measuring privacy loss for each data instance and optimizing noise tailored to individual instances.
arXiv Detail & Related papers (2024-04-24T06:51:16Z) - Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z) - Personalized Differential Privacy for Ridge Regression [3.4751583941317166]
We introduce our novel Personalized-DP Output Perturbation method ( PDP-OP) that enables to train Ridge regression models with individual per data point privacy levels.
We provide rigorous privacy proofs for our PDP-OP as well as accuracy guarantees for the resulting model.
We show that PDP-OP outperforms the personalized privacy techniques of Jorgensen et al.
arXiv Detail & Related papers (2024-01-30T16:00:14Z) - Gradients Look Alike: Sensitivity is Often Overestimated in DP-SGD [44.11069254181353]
We show that DP-SGD leaks significantly less privacy for many datapoints when trained on common benchmarks.
This implies privacy attacks will necessarily fail against many datapoints if the adversary does not have sufficient control over the possible training datasets.
arXiv Detail & Related papers (2023-07-01T11:51:56Z) - Personalized DP-SGD using Sampling Mechanisms [5.50042037663784]
We extend Differentially Private Gradient Descent (DP-SGD) to support a recent privacy notion called ($Phi$,$Delta$)- Personalized Differential Privacy (($Phi$,$Delta$)- PDP.
Our algorithm uses a multi-round personalized sampling mechanism and embeds it within the DP-SGD iteration.
Experiments on real datasets show that our algorithm outperforms DP-SGD and simple combinations of DP-SGD with existing PDP mechanisms.
arXiv Detail & Related papers (2023-05-24T13:56:57Z) - A Randomized Approach for Tight Privacy Accounting [63.67296945525791]
We propose a new differential privacy paradigm called estimate-verify-release (EVR)
EVR paradigm first estimates the privacy parameter of a mechanism, then verifies whether it meets this guarantee, and finally releases the query output.
Our empirical evaluation shows the newly proposed EVR paradigm improves the utility-privacy tradeoff for privacy-preserving machine learning.
arXiv Detail & Related papers (2023-04-17T00:38:01Z) - How Do Input Attributes Impact the Privacy Loss in Differential Privacy? [55.492422758737575]
We study the connection between the per-subject norm in DP neural networks and individual privacy loss.
We introduce a novel metric termed the Privacy Loss-Input Susceptibility (PLIS) which allows one to apportion the subject's privacy loss to their input attributes.
arXiv Detail & Related papers (2022-11-18T11:39:03Z) - Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z) - DTGAN: Differential Private Training for Tabular GANs [6.174448419090292]
We propose DTGAN, a novel conditional Wasserstein GAN that comes in two variants DTGAN_G and DTGAN_D.
We rigorously evaluate the theoretical privacy guarantees offered by DP empirically against membership and attribute inference attacks.
Our results on 3 datasets show that the DP-SGD framework is superior to PATE and that a DP discriminator is more optimal for training convergence.
arXiv Detail & Related papers (2021-07-06T10:28:05Z) - Smoothed Differential Privacy [55.415581832037084]
Differential privacy (DP) is a widely-accepted and widely-applied notion of privacy based on worst-case analysis.
In this paper, we propose a natural extension of DP following the worst average-case idea behind the celebrated smoothed analysis.
We prove that any discrete mechanism with sampling procedures is more private than what DP predicts, while many continuous mechanisms with sampling procedures are still non-private under smoothed DP.
arXiv Detail & Related papers (2021-07-04T06:55:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.