Differentially Private Sharpness-Aware Training
- URL: http://arxiv.org/abs/2306.05651v1
- Date: Fri, 9 Jun 2023 03:37:27 GMT
- Title: Differentially Private Sharpness-Aware Training
- Authors: Jinseong Park, Hoki Kim, Yujin Choi, Jaewook Lee
- Abstract summary: Training deep learning models with differential privacy (DP) results in a degradation of performance.
We show that flat minima can help reduce the negative effects of per-example gradient clipping.
We propose a new sharpness-aware training method that mitigates the privacy-optimization trade-off.
- Score: 5.488902352630076
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Training deep learning models with differential privacy (DP) results in a
degradation of performance. The training dynamics of models with DP show a
significant difference from standard training, whereas understanding the
geometric properties of private learning remains largely unexplored. In this
paper, we investigate sharpness, a key factor in achieving better
generalization, in private learning. We show that flat minima can help reduce
the negative effects of per-example gradient clipping and the addition of
Gaussian noise. We then verify the effectiveness of Sharpness-Aware
Minimization (SAM) for seeking flat minima in private learning. However, we
also discover that SAM is detrimental to the privacy budget and computational
time due to its two-step optimization. Thus, we propose a new sharpness-aware
training method that mitigates the privacy-optimization trade-off. Our
experimental results demonstrate that the proposed method improves the
performance of deep learning models with DP, both when training from scratch and when fine-tuning.
Code is available at https://github.com/jinseongP/DPSAT.
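The two DP mechanisms the abstract names, per-example gradient clipping and the addition of Gaussian noise, can be sketched in a few lines. This is a minimal illustration of a generic DP-SGD-style update, not the authors' proposed method and not code from the linked repository. The function names (`clip_gradient`, `dp_sgd_step`), the parameter names, and the toy gradient values are all assumptions made for the example.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale one example's gradient so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / (norm + 1e-12))
    return [g * scale for g in grad]


def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng):
    """One noisy update: clip each example's gradient, sum, add noise, average."""
    batch_size = len(per_example_grads)
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    # Gaussian noise per coordinate, with std = noise_multiplier * clip_norm.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]


# Toy usage with made-up gradients (hypothetical values, illustration only).
rng = random.Random(0)
params = [0.0, 0.0]
grads = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
new_params = dp_sgd_step(params, grads, lr=0.1, clip_norm=1.0,
                         noise_multiplier=1.0, rng=rng)
```

In this toy batch, the first gradient is scaled down to norm 1.0 while the second is left untouched, so clipping changes both the magnitude and the direction of the averaged update. That distortion, together with the injected noise, is the negative effect the abstract says flat minima help reduce.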
Related papers
- Towards Efficient and Scalable Training of Differentially Private Deep Learning [4.543581742916529]
Differentially private stochastic gradient descent (DP-SGD) is the standard algorithm for training machine learning models under differential privacy (DP).
The major drawback of DP-SGD is the drop in utility which prior work has comprehensively studied.
We conduct a comprehensive empirical study to quantify the computational cost of training deep learning models under DP and benchmark methods that aim at reducing the cost.
arXiv Detail & Related papers (2024-06-25T06:04:58Z)
- Towards the Flatter Landscape and Better Generalization in Federated Learning under Client-level Differential Privacy [67.33715954653098]
We propose a novel DPFL algorithm named DP-FedSAM, which leverages gradient perturbation to mitigate the negative impact of DP.
Specifically, DP-FedSAM integrates Sharpness-Aware Minimization (SAM) to generate locally flat models with stability and weight robustness.
To further reduce the magnitude of random noise while achieving better performance, we propose DP-FedSAM-$top_k$ by adopting the local update sparsification technique.
arXiv Detail & Related papers (2023-05-01T15:19:09Z)
- Enforcing Privacy in Distributed Learning with Performance Guarantees [57.14673504239551]
We study the privatization of distributed learning and optimization strategies.
We show that the popular additive random perturbation scheme degrades performance because it is not well-tuned to the graph structure.
arXiv Detail & Related papers (2023-01-16T13:03:27Z)
- Sharpness-Aware Training for Free [163.1248341911413]
Sharpness-Aware Minimization (SAM) has shown that minimizing a sharpness measure, which reflects the geometry of the loss landscape, can significantly reduce the generalization error.
Sharpness-Aware Training for Free (SAF) mitigates the sharp landscape at almost zero additional computational cost over the base optimizer.
SAF ensures convergence to a flat minimum with improved generalization capabilities.
arXiv Detail & Related papers (2022-05-27T16:32:43Z)
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
However, the computational cost of training large-scale models with DP-SGD is substantially higher than that of non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
- Large Language Models Can Be Strong Differentially Private Learners [70.0317718115406]
Differentially Private (DP) learning has seen limited success for building large deep learning models of text.
We show that this performance drop can be mitigated with the use of large pretrained models.
We propose a memory-saving technique that allows clipping in DP-SGD to run without instantiating per-example gradients.
arXiv Detail & Related papers (2021-10-12T01:45:27Z)
- DPlis: Boosting Utility of Differentially Private Deep Learning via Randomized Smoothing [0.0]
We propose DPlis--Differentially Private Learning wIth Smoothing.
We show that DPlis can effectively boost model quality and training stability under a given privacy budget.
arXiv Detail & Related papers (2021-03-02T06:33:14Z)
- Sharpness-Aware Minimization for Efficiently Improving Generalization [36.87818971067698]
We introduce a novel, effective procedure for simultaneously minimizing loss value and loss sharpness.
Sharpness-Aware Minimization (SAM) seeks parameters that lie in neighborhoods having uniformly low loss.
We present empirical results showing that SAM improves model generalization across a variety of benchmark datasets.
arXiv Detail & Related papers (2020-10-03T19:02:10Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
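Several entries above refer to Sharpness-Aware Minimization (SAM), and the main abstract attributes its privacy and compute cost to a two-step optimization. The sketch below shows that two-step structure for plain, non-private SAM on a toy quadratic. It is an illustration under stated assumptions, not the implementation from any of the listed papers: `sam_step`, `quadratic_grad`, and the values of `lr` and `rho` are hypothetical names and settings chosen for the example.

```python
import math


def sam_step(params, grad_fn, lr, rho):
    """One SAM update: perturb toward higher loss, then descend from params."""
    # Step 1: the gradient at the current weights gives the ascent direction.
    grad = grad_fn(params)
    norm = math.sqrt(sum(g * g for g in grad)) + 1e-12
    perturbed = [p + rho * g / norm for p, g in zip(params, grad)]
    # Step 2: the gradient at the perturbed weights drives the actual update.
    sharp_grad = grad_fn(perturbed)
    return [p - lr * g for p, g in zip(params, sharp_grad)]


def quadratic_grad(w):
    """Gradient of the toy loss 0.5 * (w[0]**2 + 10 * w[1]**2)."""
    return [w[0], 10.0 * w[1]]


# Toy usage: a few SAM updates on the quadratic (hypothetical settings).
w = [1.0, 1.0]
for _ in range(50):
    w = sam_step(w, quadratic_grad, lr=0.05, rho=0.05)
```

Each update calls `grad_fn` twice. In a private setting, both evaluations touch the training data, which is consistent with the abstract's remark that SAM is detrimental to the privacy budget and computational time.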