R+R: Understanding Hyperparameter Effects in DP-SGD
- URL: http://arxiv.org/abs/2411.02051v1
- Date: Mon, 04 Nov 2024 12:56:35 GMT
- Title: R+R: Understanding Hyperparameter Effects in DP-SGD
- Authors: Felix Morsbach, Jan Reubold, Thorsten Strufe
- Abstract summary: DP-SGD is the standard optimization algorithm for privacy-preserving machine learning.
Its adoption is still commonly challenged by low performance compared to non-private learning approaches.
- Score: 3.0668784884950235
- License:
- Abstract: Research on the effects of essential hyperparameters of DP-SGD lacks consensus, verification, and replication. Contradictory and anecdotal statements on their influence make matters worse. While DP-SGD is the standard optimization algorithm for privacy-preserving machine learning, its adoption is still commonly challenged by low performance compared to non-private learning approaches. As proper hyperparameter settings can improve the privacy-utility trade-off, understanding the influence of the hyperparameters promises to simplify their optimization towards better performance, and likely foster acceptance of private learning. To shed more light on these influences, we conduct a replication study: We synthesize extant research on hyperparameter influences of DP-SGD into conjectures, conduct a dedicated factorial study to independently identify hyperparameter effects, and assess which conjectures can be replicated across multiple datasets, model architectures, and differential privacy budgets. While we cannot (consistently) replicate conjectures about the main and interaction effects of the batch size and the number of epochs, we were able to replicate the conjectured relationship between the clipping threshold and learning rate. Furthermore, we were able to quantify the significant importance of their combination compared to the other hyperparameters.
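To make the studied hyperparameters concrete, below is a minimal NumPy sketch of the canonical DP-SGD update (per-example gradient clipping followed by Gaussian noising); it illustrates the general algorithm, not the authors' experimental code, and all names are illustrative. Note how the clipping threshold bounds each gradient's norm while the learning rate scales the noisy average, so the two jointly set the effective step size, which is consistent with the replicated conjecture about their relationship.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_threshold, noise_multiplier,
                learning_rate, rng):
    """One DP-SGD update: clip each per-example gradient to L2 norm at most
    `clip_threshold`, sum, add Gaussian noise with standard deviation
    `noise_multiplier * clip_threshold`, average, and take a gradient step."""
    clipped = [g * min(1.0, clip_threshold / (np.linalg.norm(g) + 1e-12))
               for g in per_example_grads]
    noise = rng.normal(0.0, noise_multiplier * clip_threshold,
                       size=params.shape)
    noisy_mean = (np.sum(clipped, axis=0) + noise) / len(clipped)
    return params - learning_rate * noisy_mean

# Toy usage: per-example gradients of squared error for a linear model.
rng = np.random.default_rng(0)
params = np.zeros(5)
X, y = rng.normal(size=(32, 5)), rng.normal(size=32)  # batch size 32
for _ in range(3):                                    # epochs over the batch
    grads = [2 * (x @ params - yi) * x for x, yi in zip(X, y)]
    params = dp_sgd_step(params, grads, clip_threshold=1.0,
                         noise_multiplier=1.1, learning_rate=0.1, rng=rng)
```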
Related papers
- Universally Harmonizing Differential Privacy Mechanisms for Federated Learning: Boosting Accuracy and Convergence [22.946928984205588]
Differentially private federated learning (DP-FL) is a promising technique for collaborative model training.
We propose the first DP-FL framework (namely UDP-FL) which universally harmonizes any randomization mechanism.
We show that UDP-FL exhibits substantial resilience against different inference attacks.
arXiv Detail & Related papers (2024-07-20T00:11:59Z)
- Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition [64.9816313630768]
Fine-tuning is often used to exploit models pre-trained on large quantities of non-aged and healthy speech.
This paper investigates hyper-parameter adaptation for Conformer ASR systems that are pre-trained on the Librispeech corpus.
arXiv Detail & Related papers (2023-06-27T07:49:35Z)
- Exploring Machine Learning Privacy/Utility trade-off from a hyperparameters Lens [10.727571921061024]
Differentially Private Stochastic Gradient Descent (DPSGD) is the state-of-the-art method for training privacy-preserving models.
With a drop-in replacement of the activation function, we achieve new state-of-the-art accuracy (a sketch of such a swap follows below).
arXiv Detail & Related papers (2023-03-03T09:59:42Z)
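The summary above does not name the replacement activation; as one plausible illustration (an assumption, not necessarily the paper's exact choice), here is what a drop-in activation swap looks like in PyTorch. Bounded activations tend to keep per-example gradient norms smaller, so less signal is clipped away under a fixed DPSGD clipping threshold.

```python
import torch.nn as nn

# Baseline MNIST-sized classifier with an unbounded ReLU activation.
relu_net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

# Drop-in replacement: identical architecture with a bounded tanh activation
# (illustrative choice; the paper's specific activation may differ).
tanh_net = nn.Sequential(nn.Linear(784, 256), nn.Tanh(), nn.Linear(256, 10))
```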
- An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition [51.232523987916636]
Differential privacy (DP) is one data-protection avenue to safeguard user information used for training deep models by imposing noisy distortion on private data.
In this work, we extend PATE learning to work with dynamic patterns, namely speech, and perform one of the first experimental studies on ASR to avoid acoustic data leakage (the noisy vote aggregation at PATE's core is sketched below).
arXiv Detail & Related papers (2022-10-11T16:55:54Z)
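At PATE's core is a noisy aggregation of teacher votes; a minimal sketch of the classic Laplace noisy-max aggregator follows (illustrative of standard PATE, not of the speech-specific extension proposed above).

```python
import numpy as np

def pate_noisy_vote(teacher_labels, num_classes, laplace_scale, rng):
    """Aggregate one query's teacher predictions: count votes per class,
    perturb each count with Laplace noise, return the noisy argmax label."""
    counts = np.bincount(teacher_labels, minlength=num_classes).astype(float)
    counts += rng.laplace(0.0, laplace_scale, size=num_classes)
    return int(np.argmax(counts))

rng = np.random.default_rng(0)
votes = np.array([3, 3, 3, 7, 3, 1, 3, 3, 7, 3])  # labels from 10 teachers
label = pate_noisy_vote(votes, num_classes=10, laplace_scale=1.0, rng=rng)
```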
- Towards Differential Relational Privacy and its use in Question Answering [109.4452196071872]
Memorization of the relation between entities in a dataset can lead to privacy issues when using a trained question answering model.
We quantify this phenomenon and provide a possible definition of Differential Relational Privacy (DPRP).
We illustrate concepts in experiments with large-scale models for Question Answering (the standard DP guarantee that such variants build on is recalled below).
arXiv Detail & Related papers (2022-03-30T22:59:24Z)
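For orientation, relational variants such as DPRP build on the standard (ε, δ)-differential-privacy guarantee, recalled here in its textbook form (this is background, not the paper's DPRP definition): a mechanism M is (ε, δ)-DP if for all neighboring datasets D, D′ differing in one record and all measurable sets S,

```latex
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta .
```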
- Differentially Private Estimation of Heterogeneous Causal Effects [9.355532300027727]
We introduce a general meta-algorithm for estimating conditional average treatment effects (CATE) with differential privacy guarantees.
Our meta-algorithm can work with simple, single-stage CATE estimators such as the S-learner and more complex multi-stage estimators such as the DR- and R-learner (a non-private S-learner skeleton is sketched below).
arXiv Detail & Related papers (2022-02-22T17:21:18Z)
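A minimal non-private S-learner skeleton using scikit-learn, for orientation; the paper's actual contribution, the differential-privacy mechanism wrapped around such estimators, is deliberately omitted here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def s_learner_cate(X, treatment, y):
    """S-learner: fit one model on the features plus the treatment indicator,
    then estimate CATE(x) as the difference f(x, t=1) - f(x, t=0)."""
    model = GradientBoostingRegressor()
    model.fit(np.column_stack([X, treatment]), y)
    ones, zeros = np.ones(len(X)), np.zeros(len(X))
    return (model.predict(np.column_stack([X, ones]))
            - model.predict(np.column_stack([X, zeros])))

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
t = rng.integers(0, 2, size=200)
y = X[:, 0] + t * (1.0 + X[:, 1]) + rng.normal(scale=0.1, size=200)
cate = s_learner_cate(X, t, y)  # should recover roughly 1 + X[:, 1]
```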
- Explaining Hyperparameter Optimization via Partial Dependence Plots [5.25855526614851]
We suggest using interpretable machine learning (IML) to gain insights from the experimental data obtained during HPO with Bayesian optimization (BO).
By leveraging the posterior uncertainty of the BO surrogate model, we introduce a variant of the partial dependence plot (PDP) with estimated confidence bands (the underlying PDP computation is sketched below).
In an experimental study, we provide quantitative evidence for the increased quality of the PDPs within sub-regions.
arXiv Detail & Related papers (2021-11-08T20:51:54Z)
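A partial dependence curve for one hyperparameter averages the surrogate's prediction over the observed values of all other hyperparameters; a minimal sketch follows (a toy function stands in for the BO surrogate's posterior mean, and the paper's confidence bands from posterior uncertainty are omitted).

```python
import numpy as np

def partial_dependence(surrogate_mean, configs, hp_index, grid):
    """For each grid value of one hyperparameter, fix it across all observed
    configurations and average the surrogate's mean prediction."""
    curve = []
    for value in grid:
        modified = configs.copy()
        modified[:, hp_index] = value
        curve.append(surrogate_mean(modified).mean())
    return np.array(curve)

# Toy stand-in for a BO surrogate's posterior mean over two hyperparameters.
surrogate_mean = lambda c: (c[:, 0] - 0.3) ** 2 + 0.5 * c[:, 1]
configs = np.random.default_rng(0).uniform(size=(50, 2))  # observed configs
grid = np.linspace(0.0, 1.0, 11)
pdp = partial_dependence(surrogate_mean, configs, hp_index=0, grid=grid)
```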
- Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z)
- An Asymptotically Optimal Multi-Armed Bandit Algorithm and Hyperparameter Optimization [48.5614138038673]
We propose an efficient and robust bandit-based algorithm called Sub-Sampling (SS) in the scenario of hyperparameter search evaluation.
We also develop a novel hyperparameter optimization algorithm called BOSS.
Empirical studies validate our theoretical arguments for SS and demonstrate the superior performance of BOSS on a number of applications.
arXiv Detail & Related papers (2020-07-11T03:15:21Z)
- Differentially Private Federated Learning with Laplacian Smoothing [72.85272874099644]
Federated learning aims to protect data privacy by collaboratively learning a model without sharing private data among users.
An adversary may still be able to infer the private training data by attacking the released model.
Differential privacy provides a statistical protection against such attacks at the price of significantly degrading the accuracy or utility of the trained models.
arXiv Detail & Related papers (2020-05-01T04:28:38Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.