DP-UTIL: Comprehensive Utility Analysis of Differential Privacy in
Machine Learning
- URL: http://arxiv.org/abs/2112.12998v1
- Date: Fri, 24 Dec 2021 08:40:28 GMT
- Title: DP-UTIL: Comprehensive Utility Analysis of Differential Privacy in
Machine Learning
- Authors: Ismat Jarin and Birhanu Eshete
- Abstract summary: Differential Privacy (DP) has emerged as a rigorous formalism to reason about privacy leakage.
In machine learning (ML), DP has been employed to limit inference/disclosure of training examples.
For deep neural networks, gradient perturbation results in the lowest privacy leakage.
- Score: 3.822543555265593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differential Privacy (DP) has emerged as a rigorous formalism to reason about
quantifiable privacy leakage. In machine learning (ML), DP has been employed to
limit inference/disclosure of training examples. Prior work leveraged DP across
the ML pipeline, albeit in isolation, often focusing on mechanisms such as
gradient perturbation. In this paper, we present DP-UTIL, a holistic utility
analysis framework of DP across the ML pipeline, with a focus on input
perturbation, objective perturbation, gradient perturbation, output
perturbation, and prediction perturbation. Given an ML task on
privacy-sensitive data, DP-UTIL enables an ML privacy practitioner to perform
a holistic comparative analysis of the impact of DP at these five perturbation
spots, measured in terms of model utility loss, privacy leakage, and the number
of truly revealed training samples. We evaluate DP-UTIL over classification
tasks on vision, medical, and financial datasets, using two representative
learning algorithms (logistic regression and deep neural network) against
membership inference attack as a case study attack. One of the highlights of
our results is that prediction perturbation consistently achieves the lowest
utility loss on all models across all datasets. In logistic regression models,
objective perturbation results in the lowest privacy leakage compared to other
perturbation techniques. For deep neural networks, gradient perturbation
results in the lowest privacy leakage. Moreover, our results on truly revealed
records suggest that as privacy leakage increases, a differentially private
model reveals a larger number of member samples. Overall, our findings suggest
that to make informed decisions about which perturbation mechanism to use, an
ML privacy practitioner needs to examine the dynamics between optimization
techniques (convex vs. non-convex), perturbation mechanisms, number of classes,
and privacy budget.
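To make the five perturbation spots concrete, below is a minimal, self-contained sketch (not the DP-UTIL implementation) of two of them, gradient perturbation and output perturbation, applied to a logistic regression model, together with a simple loss-threshold membership inference attack of the kind used to gauge privacy leakage. The synthetic data, hyperparameters, and noise calibration are illustrative assumptions only.
```python
# Illustrative sketch only: two of the five DP perturbation spots (gradient and
# output perturbation) on logistic regression, plus a loss-threshold membership
# inference attack. All names and noise calibrations here are assumptions.
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_logreg(X, y, epsilon=None, mechanism=None, epochs=50, lr=0.1, clip=1.0):
    """Train logistic regression; optionally apply a DP mechanism.

    mechanism: None, "gradient" (clip + Gaussian noise per step, a simplified
    DP-SGD-style update), or "output" (Laplace noise added to final weights).
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / n
        if mechanism == "gradient" and epsilon is not None:
            # Clip the average gradient and add Gaussian noise (a simplified,
            # illustrative stand-in for per-example clipping in DP-SGD).
            grad = grad / max(1.0, np.linalg.norm(grad) / clip)
            sigma = clip * np.sqrt(2 * np.log(1.25 / 1e-5)) / epsilon
            grad = grad + rng.normal(0, sigma / n, size=d)
        w -= lr * grad
    if mechanism == "output" and epsilon is not None:
        # Output perturbation: add Laplace noise to the learned weights
        # (scale chosen for illustration).
        w = w + rng.laplace(0, clip / (n * epsilon), size=d)
    return w


def membership_inference(w, X_mem, X_non, y_mem, y_non):
    """Loss-threshold MIA: samples with below-median loss are called members."""
    def loss(X, y):
        p = np.clip(sigmoid(X @ w), 1e-9, 1 - 1e-9)
        return -(y * np.log(p) + (1 - y) * np.log(1 - p))

    lm, ln = loss(X_mem, y_mem), loss(X_non, y_non)
    thresh = np.median(np.concatenate([lm, ln]))
    # Balanced attack accuracy: 0.5 means no leakage; higher means more leakage.
    return 0.5 * ((lm < thresh).mean() + (ln >= thresh).mean())


# Synthetic binary classification data (assumption, for illustration).
d = 10
w_true = rng.normal(size=d)
X_train, X_test = rng.normal(size=(500, d)), rng.normal(size=(500, d))
y_train = (X_train @ w_true + rng.normal(0, 0.5, 500) > 0).astype(float)
y_test = (X_test @ w_true + rng.normal(0, 0.5, 500) > 0).astype(float)

for mech in [None, "gradient", "output"]:
    w = train_logreg(X_train, y_train, epsilon=1.0, mechanism=mech)
    acc = ((sigmoid(X_test @ w) > 0.5) == y_test).mean()
    leak = membership_inference(w, X_train, X_test, y_train, y_test)
    print(f"{str(mech):>8}: test acc={acc:.3f}  MIA acc={leak:.3f}")
```
Input, objective, and prediction perturbation would hook into the same pipeline at other points: noising the features before training, adding a random term to the objective, or noising the model's prediction scores at inference time, with the noise scale in each case calibrated to the privacy budget epsilon.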
Related papers
- Rethinking Improved Privacy-Utility Trade-off with Pre-existing Knowledge for DP Training [31.559864332056648]
We propose a generic differential privacy framework with heterogeneous noise (DP-Hero).
Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, where the noise injected into gradient updates is heterogeneous and guided by prior-established model parameters.
We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works.
arXiv Detail & Related papers (2024-09-05T08:40:54Z)
- Explainable Differential Privacy-Hyperdimensional Computing for Balancing Privacy and Transparency in Additive Manufacturing Monitoring [5.282482641822561]
Differential Privacy (DP) adds mathematically controlled noise to Machine Learning (ML) models.
This study presents the Differential Privacy-Hyperdimensional Computing (DP-HD) framework to quantify noise effects on accuracy.
Experimental results show DP-HD achieves superior operational efficiency, prediction accuracy, and privacy protection.
arXiv Detail & Related papers (2024-07-09T17:42:26Z)
- Approximating Two-Layer ReLU Networks for Hidden State Analysis in Differential Privacy [3.8254443661593633]
We show that it is possible to privately train convex problems with privacy-utility trade-offs comparable to those of one-hidden-layer ReLU networks trained with DP-SGD.
Our experiments on benchmark classification tasks show that NoisyCGD can achieve privacy-utility trade-offs comparable to DP-SGD applied to one-hidden-layer ReLU networks.
arXiv Detail & Related papers (2024-07-05T22:43:32Z)
- Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks [72.51255282371805]
We prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets.
We find that this KL privacy bound is largely determined by the expected squared gradient norm relative to model parameters during training.
arXiv Detail & Related papers (2023-10-31T16:13:22Z)
- A Differentially Private Weighted Empirical Risk Minimization Procedure and its Application to Outcome Weighted Learning [4.322221694511603]
Differential privacy (DP) is an appealing framework for addressing data privacy issues.
DP provides mathematically provable bounds on the privacy loss incurred when releasing information from sensitive data.
We propose the first differentially private algorithm for general wERM, with theoretical DP guarantees.
arXiv Detail & Related papers (2023-07-24T21:03:25Z)
- Amplitude-Varying Perturbation for Balancing Privacy and Utility in Federated Learning [86.08285033925597]
This paper presents a new DP perturbation mechanism with a time-varying noise amplitude to protect the privacy of federated learning.
We derive an online refinement of the series to prevent FL from premature convergence resulting from excessive perturbation noise.
The contribution of the new DP mechanism to the convergence and accuracy of privacy-preserving FL is corroborated, compared to the state-of-the-art Gaussian noise mechanism with a persistent noise amplitude.
arXiv Detail & Related papers (2023-03-07T22:52:40Z)
- A Differentially Private Framework for Deep Learning with Convexified Loss Functions [4.059849656394191]
Differential privacy (DP) has been applied in deep learning for preserving privacy of the underlying training sets.
Existing DP practice falls into three categories - objective perturbation, gradient perturbation and output perturbation.
We propose a novel output perturbation framework by injecting DP noise into a randomly sampled neuron.
arXiv Detail & Related papers (2022-04-03T11:10:05Z)
- Sensitivity analysis in differentially private machine learning using hybrid automatic differentiation [54.88777449903538]
We introduce a novel hybrid automatic differentiation (AD) system for sensitivity analysis.
This enables modelling the sensitivity of arbitrary differentiable function compositions, such as the training of neural networks on private data.
Our approach enables principled reasoning about privacy loss in the data processing setting.
arXiv Detail & Related papers (2021-07-09T07:19:23Z)
- Smoothed Differential Privacy [55.415581832037084]
Differential privacy (DP) is a widely-accepted and widely-applied notion of privacy based on worst-case analysis.
In this paper, we propose a natural extension of DP following the worst average-case idea behind the celebrated smoothed analysis.
We prove that any discrete mechanism with sampling procedures is more private than what DP predicts, while many continuous mechanisms with sampling procedures are still non-private under smoothed DP.
arXiv Detail & Related papers (2021-07-04T06:55:45Z)
- On the Practicality of Differential Privacy in Federated Learning by Tuning Iteration Times [51.61278695776151]
Federated Learning (FL) is well known for its privacy protection when training machine learning models among distributed clients collaboratively.
Recent studies have pointed out that the naive FL is susceptible to gradient leakage attacks.
Differential Privacy (DP) emerges as a promising countermeasure to defend against gradient leakage attacks.
arXiv Detail & Related papers (2021-01-11T19:43:12Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.