Calibrating Noise for Group Privacy in Subsampled Mechanisms
- URL: http://arxiv.org/abs/2408.09943v2
- Date: Sat, 24 Aug 2024 13:00:41 GMT
- Title: Calibrating Noise for Group Privacy in Subsampled Mechanisms
- Authors: Yangfan Jiang, Xinjian Luo, Yin Yang, Xiaokui Xiao
- Abstract summary: Group privacy (GP) can protect the sensitive aggregate information of a group of up to m individuals.
GP is often treated as an afterthought: most approaches first build a DP mechanism and then convert it to GP, treating the DP solution as a black box.
We propose a novel analysis framework that provides tight privacy accounting for subsampled GP mechanisms.
- Score: 24.518597984169734
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Given a group size m and a sensitive dataset D, group privacy (GP) releases information about D with the guarantee that the adversary cannot infer with high confidence whether the underlying data is D or a neighboring dataset D' that differs from D by m records. GP generalizes the well-established notion of differential privacy (DP) for protecting individuals' privacy; in particular, when m=1, GP reduces to DP. Compared to DP, GP is capable of protecting the sensitive aggregate information of a group of up to m individuals, e.g., the average annual income among members of a yacht club. Despite its longstanding presence in the research literature and its promising applications, GP is often treated as an afterthought, with most approaches first developing a DP mechanism and then using a generic conversion to adapt it for GP, treating the DP solution as a black box. As we point out in the paper, this methodology is suboptimal when the underlying DP solution involves subsampling, e.g., in the classic DP-SGD method for training deep learning models. In this case, the DP-to-GP conversion is overly pessimistic in its analysis, leading to low utility in the published results under GP. Motivated by this, we propose a novel analysis framework that provides tight privacy accounting for subsampled GP mechanisms. Instead of converting a black-box DP mechanism to GP, our solution carefully analyzes and utilizes the inherent randomness in subsampled mechanisms, leading to a substantially improved bound on the privacy loss with respect to GP. The proposed solution applies to a wide variety of foundational mechanisms with subsampling. Extensive experiments with real datasets demonstrate that compared to the baseline convert-from-blackbox-DP approach, our GP mechanisms achieve noise reductions of over an order of magnitude in several practical settings, including deep neural network training.
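To make the "generic conversion" baseline mentioned in the abstract concrete, here is a minimal sketch of the standard textbook group-privacy bound: applying the (ε, δ)-DP guarantee m times along a chain of neighboring datasets gives (mε, δ · Σ_{i=0}^{m-1} e^{iε}) for groups of size m. This is not the paper's analysis framework, which is not reproduced on this page; the function name `generic_dp_to_gp` and its parameters are assumptions made for illustration.

```python
import math


def generic_dp_to_gp(eps: float, delta: float, m: int) -> tuple[float, float]:
    """Black-box DP-to-GP conversion (hypothetical helper, textbook bound).

    An (eps, delta)-DP mechanism is applied m times along a chain of
    neighboring datasets, so for groups of size m it satisfies
    (m * eps, delta * sum_{i=0}^{m-1} exp(i * eps)).
    """
    if m < 1:
        raise ValueError("group size m must be at least 1")
    group_eps = m * eps
    # Geometric sum 1 + e^eps + ... + e^((m-1) * eps), scaled by delta.
    group_delta = delta * sum(math.exp(i * eps) for i in range(m))
    return group_eps, group_delta
```

With m = 1 the function returns the original (ε, δ), matching the statement that GP reduces to DP. For ε = 1 and m = 8, ε grows to 8 and δ is inflated by a factor of roughly 1,700, which illustrates why a black-box conversion forces much larger noise than an analysis that looks inside the subsampled mechanism.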
Related papers
- Noise-Aware Differentially Private Regression via Meta-Learning [25.14514068630219]
Differential Privacy (DP) is the gold standard for protecting user privacy, but standard DP mechanisms significantly impair performance.
One approach to mitigating this issue is pre-training models on simulated data before DP learning on the private data.
In this work we go a step further, using simulated data to train a meta-learning model that combines the Convolutional Conditional Neural Process (ConvCNP) with an improved functional DP mechanism.
arXiv Detail & Related papers (2024-06-12T18:11:24Z) - How Private are DP-SGD Implementations? [61.19794019914523]
We show that there can be a substantial gap between the privacy analysis when using the two types of batch sampling.
arXiv Detail & Related papers (2024-03-26T13:02:43Z) - Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
arXiv Detail & Related papers (2024-03-07T21:22:07Z) - Domain Invariant Learning for Gaussian Processes and Bayesian
Exploration [39.83530605880014]
We propose a domain invariant learning algorithm for Gaussian processes (DIL-GP) with a min-max optimization on the likelihood.
Numerical experiments demonstrate the superiority of DIL-GP for predictions on several synthetic and real-world datasets.
arXiv Detail & Related papers (2023-12-18T16:13:34Z) - Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z) - DPIS: An Enhanced Mechanism for Differentially Private SGD with Importance Sampling [23.8561225168394]
Differential privacy (DP) has become a well-accepted standard for privacy protection, and deep neural networks (DNN) have been immensely successful in machine learning.
A classic mechanism for this purpose is DP-SGD, which is a differentially private version of the stochastic gradient descent (SGD) optimizer commonly used for training.
We propose DPIS, a novel mechanism for differentially private SGD training that can be used as a drop-in replacement of the core of DP-SGD.
arXiv Detail & Related papers (2022-10-18T07:03:14Z) - Differentially Private SGDA for Minimax Problems [83.57322009102973]
We prove that stochastic gradient descent ascent (SGDA) can achieve optimal utility in terms of weak primal-dual population risk.
This is the first known result for the non-smooth, strongly-concave setting.
arXiv Detail & Related papers (2022-01-22T13:05:39Z) - Non-Gaussian Gaussian Processes for Few-Shot Regression [71.33730039795921]
We propose an invertible ODE-based mapping that operates on each component of the random variable vectors and shares the parameters across all of them.
NGGPs outperform the competing state-of-the-art approaches on a diversified set of benchmarks and applications.
arXiv Detail & Related papers (2021-10-26T10:45:25Z) - Incremental Ensemble Gaussian Processes [53.3291389385672]
We propose an incremental ensemble (IE-) GP framework, where an EGP meta-learner employs an ensemble of GP learners, each having a unique kernel belonging to a prescribed kernel dictionary.
With each GP expert leveraging the random feature-based approximation to perform online prediction and model update with scalability, the EGP meta-learner capitalizes on data-adaptive weights to synthesize the per-expert predictions.
The novel IE-GP is generalized to accommodate time-varying functions by modeling structured dynamics at the EGP meta-learner and within each GP learner.
arXiv Detail & Related papers (2021-10-13T15:11:25Z) - Gaussian Processes with Differential Privacy [3.934224774675743]
We add strong privacy protection to Gaussian processes (GPs) via differential privacy (DP).
We achieve this by using sparse GP methodology and publishing a private variational approximation on known inducing points.
Our experiments demonstrate that, given a sufficient amount of data, the method can produce accurate models under strong privacy protection.
arXiv Detail & Related papers (2021-06-01T13:23:16Z) - Sparse Gaussian Process Variational Autoencoders [24.86751422740643]
Existing approaches for performing inference in GP-DGMs do not support sparse GP approximations based on inducing points.
We develop the sparse Gaussian process variational autoencoder (GP-VAE), characterised by the use of partial inference networks for parameterising sparse GP approximations.
arXiv Detail & Related papers (2020-10-20T10:19:56Z)