Related papers: Certificates of Differential Privacy and Unlearning for Gradient-Based Training

Certificates of Differential Privacy and Unlearning for Gradient-Based Training

URL: http://arxiv.org/abs/2406.13433v1
Date: Wed, 19 Jun 2024 10:47:00 GMT
Title: Certificates of Differential Privacy and Unlearning for Gradient-Based Training
Authors: Matthew Wicker, Philip Sosnin, Adrianna Janik, Mark N. Müller, Adrian Weller, Calvin Tsay,
Abstract summary: We propose a new verification-centric approach to privacy and unlearning guarantees. Our framework offers a new verification-centric approach to privacy and unlearning guarantees. We validate the effectiveness of our approach on tasks from financial services, medical imaging, and natural language processing.
Score: 35.18220120875603
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Proper data stewardship requires that model owners protect the privacy of individuals' data used during training. Whether through anonymization with differential privacy or the use of unlearning in non-anonymized settings, the gold-standard techniques for providing privacy guarantees can come with significant performance penalties or be too weak to provide practical assurances. In part, this is due to the fact that the guarantee provided by differential privacy represents the worst-case privacy leakage for any individual, while the true privacy leakage of releasing the prediction for a given individual might be substantially smaller or even, as we show, non-existent. This work provides a novel framework based on convex relaxations and bounds propagation that can compute formal guarantees (certificates) that releasing specific predictions satisfies $\epsilon=0$ privacy guarantees or do not depend on data that is subject to an unlearning request. Our framework offers a new verification-centric approach to privacy and unlearning guarantees, that can be used to further engender user trust with tighter privacy guarantees, provide formal proofs of robustness to certain membership inference attacks, identify potentially vulnerable records, and enhance current unlearning approaches. We validate the effectiveness of our approach on tasks from financial services, medical imaging, and natural language processing.

Related papers

Federated Learning with Differential Privacy: An Utility-Enhanced Approach [12.614480013684759]
Federated learning has emerged as an attractive approach to protect data privacy by eliminating the need for sharing clients' data.<n>Recent studies have shown that federated learning alone does not guarantee privacy, as private data may still be inferred from the uploaded parameters to the central server.<n>We present a modification to these vanilla differentially private algorithms based on a Haar wavelet transformation step and a novel noise injection scheme that significantly lowers the bound of the noise variance.
arXiv Detail & Related papers (2025-03-27T04:48:29Z)
Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.<n>Current methods, such as those based on differentially private gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.<n>We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
Estimating the Conformal Prediction Threshold from Noisy Labels [22.841631892273547]
We show how we can estimate the noise-free conformal threshold based on the noisy labeled data. We dub our approach Noise-Aware Conformal Prediction (NACP) and show on several natural and medical image classification datasets.
arXiv Detail & Related papers (2025-01-22T09:35:58Z)
$(ε, δ)$-Differentially Private Partial Least Squares Regression [1.8666451604540077]
We propose an $(epsilon, delta)$-differentially private PLS (edPLS) algorithm to ensure the privacy of the data underlying the model. Experimental results demonstrate that edPLS effectively renders privacy attacks, aimed at recovering unique sources of variability in the training data.
arXiv Detail & Related papers (2024-12-12T10:49:55Z)
Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines.<n>We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z)
Privacy-Preserving Set-Based Estimation Using Differential Privacy and Zonotopes [2.206168301581203]
For large-scale cyber-physical systems, the collaboration of spatially distributed sensors is often needed to perform the state estimation process. Privacy concerns arise from disclosing sensitive measurements to a cloud estimator. We propose a differentially private set-based estimation protocol that guarantees true state containment in the estimated set and differential privacy for the sensitive measurements.
arXiv Detail & Related papers (2024-08-30T13:05:38Z)
Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets. We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
Adaptive Differential Privacy in Federated Learning: A Priority-Based Approach [0.0]
Federated learning (FL) develops global models without direct access to local datasets. DP offers a framework that gives a privacy guarantee by adding certain amounts of noise to parameters. We propose adaptive noise addition in FL which decides the value of injected noise based on features' relative importance.
arXiv Detail & Related papers (2024-01-04T03:01:15Z)
Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks [72.51255282371805]
We prove a privacy bound for the KL divergence between model distributions on worst-case neighboring datasets. We find that this KL privacy bound is largely determined by the expected squared gradient norm relative to model parameters during training.
arXiv Detail & Related papers (2023-10-31T16:13:22Z)
DP-SGD with weight clipping [0.9208007322096533]
We present a novel approach that mitigates the bias arising from traditional gradient clipping.<n>By leveraging a public upper bound of the Lipschitz value of the current model and its current location within the search domain, we can achieve refined noise level adjustments.
arXiv Detail & Related papers (2023-10-27T09:17:15Z)
Conditional Density Estimations from Privacy-Protected Data [0.0]
We propose simulation-based inference methods from privacy-protected datasets. We illustrate our methods on discrete time-series data under an infectious disease model and with ordinary linear regression models.
arXiv Detail & Related papers (2023-10-19T14:34:17Z)
Improving the Variance of Differentially Private Randomized Experiments through Clustering [16.166525280886578]
We propose a new differentially private mechanism, "Cluster-DP"<n>We demonstrate that selecting higher-quality clusters, according to a quality metric we introduce, can decrease the variance penalty without compromising privacy guarantees.
arXiv Detail & Related papers (2023-08-02T05:51:57Z)
Propagating Variational Model Uncertainty for Bioacoustic Call Label Smoothing [15.929064190849665]
We focus on using the predictive uncertainty signal calculated by Bayesian neural networks to guide learning in the self-same task the model is being trained on. Not opting for costly Monte Carlo sampling of weights, we propagate the approximate hidden variance in an end-to-end manner. We show that, through the explicit usage of the uncertainty in the loss calculation, the variational model is led to improved predictive and calibration performance.
arXiv Detail & Related papers (2022-10-19T13:04:26Z)
Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection. This paper addresses the practical and theoretical importance of developing privacy-preserving machine learning algorithms that ensure good performance while preserving privacy.
arXiv Detail & Related papers (2022-09-09T08:54:13Z)
Brownian Noise Reduction: Maximizing Privacy Subject to Accuracy Constraints [53.01656650117495]
There is a disconnect between how researchers and practitioners handle privacy-utility tradeoffs. Brownian mechanism works by first adding Gaussian noise of high variance corresponding to the final point of a simulated Brownian motion. We complement our Brownian mechanism with ReducedAboveThreshold, a generalization of the classical AboveThreshold algorithm.
arXiv Detail & Related papers (2022-06-15T01:43:37Z)
Private Prediction Sets [72.75711776601973]
Machine learning systems need reliable uncertainty quantification and protection of individuals' privacy. We present a framework that treats these two desiderata jointly. We evaluate the method on large-scale computer vision datasets.
arXiv Detail & Related papers (2021-02-11T18:59:11Z)
Robustness Threats of Differential Privacy [70.818129585404]
We experimentally demonstrate that networks, trained with differential privacy, in some settings might be even more vulnerable in comparison to non-private versions. We study how the main ingredients of differentially private neural networks training, such as gradient clipping and noise addition, affect the robustness of the model.
arXiv Detail & Related papers (2020-12-14T18:59:24Z)
Privacy Preserving Recalibration under Domain Shift [119.21243107946555]
We introduce a framework that abstracts out the properties of recalibration problems under differential privacy constraints. We also design a novel recalibration algorithm, accuracy temperature scaling, that outperforms prior work on private datasets.
arXiv Detail & Related papers (2020-08-21T18:43:37Z)
RDP-GAN: A R\'enyi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial network (GAN) has attracted increasing attention recently owing to its impressive ability to generate realistic samples with high privacy protection. However, when GANs are applied on sensitive or private training examples, such as medical or financial records, it is still probable to divulge individuals' sensitive and private information. We propose a R'enyi-differentially private-GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noises on the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.