Bayesian Pseudo Posterior Mechanism for Differentially Private Machine Learning
- URL: http://arxiv.org/abs/2503.21528v1
- Date: Thu, 27 Mar 2025 14:17:05 GMT
- Title: Bayesian Pseudo Posterior Mechanism for Differentially Private Machine Learning
- Authors: Robert Chew, Matthew R. Williams, Elan A. Segarra, Alexander J. Preiss, Amanda Konet, Terrance D. Savitsky
- Abstract summary: Differential privacy (DP) is becoming increasingly important for deployed machine learning applications. We propose a new scalable DP mechanism for deep learning models, SWAG-PPM, by using a pseudo posterior distribution. We find that SWAG-PPM exhibits only modest utility degradation against a non-private comparator while greatly outperforming the industry standard DP-SGD for a similar privacy budget.
- Score: 38.068924764486255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differential privacy (DP) is becoming increasingly important for deployed machine learning applications because it provides strong guarantees for protecting the privacy of individuals whose data is used to train models. However, DP mechanisms commonly used in machine learning tend to struggle on many real-world distributions, including highly imbalanced or small labeled training sets. In this work, we propose a new scalable DP mechanism for deep learning models, SWAG-PPM, which uses as its randomized mechanism a pseudo posterior distribution that downweights each record's likelihood contribution in proportion to its disclosure risk. As a motivating example from official statistics, we demonstrate SWAG-PPM on a workplace injury text classification task using a highly imbalanced public dataset published by the U.S. Occupational Safety and Health Administration (OSHA). We find that SWAG-PPM exhibits only modest utility degradation against a non-private comparator while greatly outperforming the industry-standard DP-SGD for a similar privacy budget.
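The abstract does not spell out SWAG-PPM's implementation, but the pseudo posterior mechanism it builds on exponentiates each record's likelihood contribution by a weight alpha_i in [0, 1] that shrinks as the record's disclosure risk grows, and SWAG (Stochastic Weight Averaging-Gaussian) supplies approximate posterior sampling from the SGD iterates. Below is a minimal sketch of that combination under those assumptions; the risk-to-weight rule, function names, and hyperparameters are illustrative, not the authors' code.

```python
# Hedged sketch of a pseudo posterior mechanism with SWAG sampling, based
# only on the abstract of SWAG-PPM. The linear risk-to-weight rule and all
# names/hyperparameters below are illustrative assumptions.

import torch
import torch.nn.functional as F


def risk_weights(per_record_risk: torch.Tensor) -> torch.Tensor:
    # Map disclosure risks in [0, 1] to likelihood weights alpha_i in [0, 1];
    # higher risk -> smaller weight. This linear rule is a placeholder.
    return (1.0 - per_record_risk).clamp(0.0, 1.0)


def weighted_nll(logits, labels, alpha):
    # Pseudo likelihood prod_i p(y_i | x_i, theta)^{alpha_i}: each record's
    # negative log likelihood is scaled by its weight alpha_i.
    per_record = F.cross_entropy(logits, labels, reduction="none")
    return (alpha * per_record).mean()


def train_swag_ppm(model, loader, risks, epochs=10, swag_start=5, lr=1e-3):
    # SGD on the weighted loss; after swag_start epochs, keep running first
    # and second moments of the iterates for a diagonal Gaussian posterior.
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    mean = {k: torch.zeros_like(p) for k, p in model.named_parameters()}
    sq_mean = {k: torch.zeros_like(p) for k, p in model.named_parameters()}
    n = 0
    for epoch in range(epochs):
        for xb, yb, idx in loader:  # loader also yields record indices
            loss = weighted_nll(model(xb), yb, risk_weights(risks[idx]))
            opt.zero_grad()
            loss.backward()
            opt.step()
        if epoch >= swag_start:
            n += 1
            for k, p in model.named_parameters():
                mean[k] += (p.detach() - mean[k]) / n
                sq_mean[k] += (p.detach() ** 2 - sq_mean[k]) / n
    # Sample model weights via theta ~ N(mean, sq_mean - mean^2).
    var = {k: (sq_mean[k] - mean[k] ** 2).clamp(min=1e-8) for k in mean}
    return mean, var
```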
Related papers
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves an over 20% improvement in forgetting error compared to the state of the art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- DMM: Distributed Matrix Mechanism for Differentially-Private Federated Learning using Packed Secret Sharing [51.336015600778396]
Federated Learning (FL) has recently gained significant traction in both industry and academia.
In FL, a machine learning model is trained using data from various end-users arranged in committees across several rounds.
Since such data can often be sensitive, a primary challenge in FL is providing privacy while still retaining utility of the model.
arXiv Detail & Related papers (2024-10-21T16:25:14Z)
- Privacy Amplification for the Gaussian Mechanism via Bounded Support [64.86780616066575]
Data-dependent privacy accounting frameworks such as per-instance differential privacy (pDP) and Fisher information loss (FIL) confer fine-grained privacy guarantees for individuals in a fixed training dataset.
We propose simple modifications of the Gaussian mechanism with bounded support, showing that they amplify privacy guarantees under data-dependent accounting.
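The summary does not specify the modifications; two standard ways to bound the support of Gaussian noise are rectification (clamping the release) and truncation (resampling until the release falls in range). The sketch below illustrates both under that assumption; whether these match the paper's exact mechanisms, and its data-dependent accounting argument, is not taken from the source.

```python
# Hedged sketch of two bounded-support variants of the Gaussian mechanism;
# matching the paper's exact modifications is an assumption.
import numpy as np

rng = np.random.default_rng(0)


def rectified_gaussian(value, sigma, lo, hi):
    # Add Gaussian noise, then clamp the release to [lo, hi]; probability
    # mass piles up at the boundaries, bounding the output support.
    return float(np.clip(value + rng.normal(0.0, sigma), lo, hi))


def truncated_gaussian(value, sigma, lo, hi):
    # Resample the noisy release until it lands in [lo, hi]
    # (truncation rather than clamping).
    while True:
        out = value + rng.normal(0.0, sigma)
        if lo <= out <= hi:
            return float(out)
```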
arXiv Detail & Related papers (2024-03-07T21:22:07Z)
- Unlocking Accuracy and Fairness in Differentially Private Image Classification [43.53494043189235]
Differential privacy (DP) is considered the gold standard framework for privacy-preserving training.
We show that pre-trained foundation models fine-tuned with DP can achieve similar accuracy to non-private classifiers.
arXiv Detail & Related papers (2023-08-21T17:42:33Z)
- DiVa: An Accelerator for Differentially Private Machine Learning [1.054627611890905]
Differential privacy (DP) is rapidly gaining momentum in the industry as a practical standard for privacy protection.
We conduct a detailed workload characterization on a state-of-the-art differentially private ML training algorithm named DP-SGD.
Based on our analysis, we propose an accelerator for differentially private ML named DiVa, which provides a significant improvement in compute utilization.
arXiv Detail & Related papers (2022-08-26T01:19:56Z)
- DP$^2$-VAE: Differentially Private Pre-trained Variational Autoencoders [26.658723213776632]
We propose DP$^2$-VAE, a training mechanism for variational autoencoders (VAEs) with provable DP guarantees and improved utility via pre-training on private data.
We conduct extensive experiments on image datasets to illustrate our superiority over baselines under various privacy budgets and evaluation metrics.
arXiv Detail & Related papers (2022-08-05T23:57:34Z)
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
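For context on the DP-SGD baseline referenced here and in the main abstract: the standard DP-SGD step clips each example's gradient to a norm bound and adds Gaussian noise calibrated to that bound. A naive, loop-based sketch follows; the hyperparameter names are generic, and production implementations (e.g., Opacus) vectorize the per-example gradients rather than looping.

```python
# Naive DP-SGD step for illustration: clip each per-example gradient to
# clip_norm, sum, add Gaussian noise scaled by noise_mult * clip_norm,
# then take an averaged gradient step. Names are generic placeholders.
import torch


def dp_sgd_step(model, loss_fn, xb, yb, lr=0.1, clip_norm=1.0, noise_mult=1.0):
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(xb, yb):  # per-example gradients (loop kept for clarity)
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        grads = [p.grad.detach() for p in model.parameters()]
        total = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, float(clip_norm / (total + 1e-12)))  # clip to bound
        for s, g in zip(summed, grads):
            s += scale * g
    with torch.no_grad():
        for p, s in zip(model.parameters(), summed):
            noise = torch.randn_like(p) * noise_mult * clip_norm
            p -= lr * (s + noise) / len(xb)
```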
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
- DP-SGD vs PATE: Which Has Less Disparate Impact on Model Accuracy? [1.3238373064156095]
We show that application of differential privacy, specifically the DP-SGD algorithm, has a disparate impact on different sub-groups in the population.
We compare PATE, another mechanism for training deep learning models using differential privacy, with DP-SGD in terms of fairness.
arXiv Detail & Related papers (2021-06-22T20:37:12Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides statistical protection against such attacks, at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)