Practical Differentially Private Hyperparameter Tuning with Subsampling
- URL: http://arxiv.org/abs/2301.11989v3
- Date: Tue, 13 Feb 2024 14:15:01 GMT
- Title: Practical Differentially Private Hyperparameter Tuning with Subsampling
- Authors: Antti Koskela and Tejas Kulkarni
- Abstract summary: We propose a new class of differentially private (DP) machine learning (ML) algorithms, where the number of random search samples is randomized itself.
We focus on lowering both the DP bounds and the computational cost of these methods by using only a random subset of the sensitive data.
We provide a R'enyi differential privacy analysis for the proposed method and experimentally show that it consistently leads to better privacy-utility trade-off.
- Score: 8.022555128083026
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Tuning the hyperparameters of differentially private (DP) machine learning
(ML) algorithms often requires use of sensitive data and this may leak private
information via hyperparameter values. Recently, Papernot and Steinke (2022)
proposed a certain class of DP hyperparameter tuning algorithms, where the
number of random search samples is randomized itself. Commonly, these
algorithms still considerably increase the DP privacy parameter $\varepsilon$
over non-tuned DP ML model training and can be computationally heavy as
evaluating each hyperparameter candidate requires a new training run. We focus
on lowering both the DP bounds and the computational cost of these methods by
using only a random subset of the sensitive data for the hyperparameter tuning
and by extrapolating the optimal values to a larger dataset. We provide a
R\'enyi differential privacy analysis for the proposed method and
experimentally show that it consistently leads to better privacy-utility
trade-off than the baseline method by Papernot and Steinke.
Related papers
- DP-HyPO: An Adaptive Private Hyperparameter Optimization Framework [31.628466186344582]
We introduce DP-HyPO, a pioneering framework for adaptive'' private hyperparameter optimization.
We provide a comprehensive differential privacy analysis of our framework.
We empirically demonstrate the effectiveness of DP-HyPO on a diverse set of real-world datasets.
arXiv Detail & Related papers (2023-06-09T07:55:46Z) - A New Linear Scaling Rule for Private Adaptive Hyperparameter Optimization [57.450449884166346]
We propose an adaptive HPO method to account for the privacy cost of HPO.
We obtain state-of-the-art performance on 22 benchmark tasks, across computer vision and natural language processing, across pretraining and finetuning.
arXiv Detail & Related papers (2022-12-08T18:56:37Z) - TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 points gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - Automatic tuning of hyper-parameters of reinforcement learning
algorithms using Bayesian optimization with behavioral cloning [0.0]
In reinforcement learning (RL), the information content of data gathered by the learning agent is dependent on the setting of many hyper- parameters.
In this work, a novel approach for autonomous hyper- parameter setting using Bayesian optimization is proposed.
Experiments reveal promising results compared to other manual tweaking and optimization-based approaches.
arXiv Detail & Related papers (2021-12-15T13:10:44Z) - Differentially Private Coordinate Descent for Composite Empirical Risk
Minimization [13.742100810492014]
Machine learning models can leak information about the data used to train them.
Differentially Private (DP) variants of optimization algorithms like Gradient Descent (DP-SGD) have been designed to mitigate this.
We propose a new method for composite Differentially Private Empirical Risk Minimization (DP-ERM): Differentially Private Coordinate Descent (DP-CD)
arXiv Detail & Related papers (2021-10-22T10:22:48Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for bound non-dimensional optimization.
We demonstrate two popular deep learning methods on the empirical advantages over standard gradient methods.
arXiv Detail & Related papers (2020-06-24T06:01:24Z) - Hyperparameter Selection for Subsampling Bootstraps [0.0]
A subsampling method like BLB serves as a powerful tool for assessing the quality of estimators for massive data.
The performance of the subsampling methods are highly influenced by the selection of tuning parameters.
We develop a hyperparameter selection methodology, which can be used to select tuning parameters for subsampling methods.
Both simulation studies and real data analysis demonstrate the superior advantage of our method.
arXiv Detail & Related papers (2020-06-02T17:10:45Z) - User-Level Privacy-Preserving Federated Learning: Analysis and
Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving private data from mobile terminals (MTs) while training the data into useful models.
From a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
arXiv Detail & Related papers (2020-02-29T10:13:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.