Privacy Budget Scheduling
- URL: http://arxiv.org/abs/2106.15335v1
- Date: Tue, 29 Jun 2021 12:43:47 GMT
- Title: Privacy Budget Scheduling
- Authors: Tao Luo, Mingen Pan, Pierre Tholoniat, Asaf Cidon, Roxana Geambasu,
Mathias Lécuyer
- Abstract summary: ML models trained on personal data have been shown to leak information about users.
Differential privacy (DP) enables model training with a guaranteed bound on this leakage.
We describe PrivateKube, an extension to the popular Kubernetes datacenter orchestrator.
- Score: 3.5329693371326822
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning (ML) models trained on personal data have been shown to leak
information about users. Differential privacy (DP) enables model training with
a guaranteed bound on this leakage. Each new model trained with DP increases
the bound on data leakage and can be seen as consuming part of a global privacy
budget that should not be exceeded. This budget is a scarce resource that must
be carefully managed to maximize the number of successfully trained models.
We describe PrivateKube, an extension to the popular Kubernetes datacenter
orchestrator that adds privacy as a new type of resource to be managed
alongside other traditional compute resources, such as CPU, GPU, and memory.
The abstractions we design for the privacy resource mirror those defined by
Kubernetes for traditional resources, but there are also major differences. For
example, traditional compute resources are replenishable while privacy is not:
a CPU can be regained after a model finishes execution while privacy budget
cannot. This distinction forces a re-design of the scheduler. We present DPF
(Dominant Private Block Fairness) -- a variant of the popular Dominant Resource
Fairness (DRF) algorithm -- that is geared toward the non-replenishable privacy
resource but enjoys similar theoretical properties as DRF.
We evaluate PrivateKube and DPF on microbenchmarks and an ML workload on
Amazon Reviews data. Compared to existing baselines, DPF allows training more
models under the same global privacy guarantee. This is especially true for DPF
over Rényi DP, a highly composable form of DP.
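The scheduling model the abstract describes (a non-replenishable budget, consumed per trained model, allocated by dominant share as in DRF) can be sketched as follows. This is an illustrative toy, not PrivateKube's actual API; all class, field, and job names are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class PrivacyBlock:
    """A block of user data with a fixed, non-replenishable DP budget."""
    epsilon_total: float       # global budget for this block
    epsilon_used: float = 0.0  # consumed budget is never returned

    @property
    def remaining(self) -> float:
        return self.epsilon_total - self.epsilon_used

    def consume(self, epsilon: float) -> None:
        # Unlike CPU or memory, this resource is never freed when a job ends.
        self.epsilon_used += epsilon


def schedule(blocks: dict[str, PrivacyBlock], jobs: list[dict]) -> list[str]:
    """Admit jobs in order of their dominant share: the largest fraction of
    any single block's total budget a job would consume. A loose sketch of
    DRF's dominant-share idea adapted to a consume-only resource, not the
    paper's actual DPF algorithm."""
    def dominant_share(job: dict) -> float:
        return max(job["demand"][b] / blocks[b].epsilon_total
                   for b in job["demand"])

    admitted = []
    for job in sorted(jobs, key=dominant_share):
        if all(blocks[b].remaining >= eps for b, eps in job["demand"].items()):
            for b, eps in job["demand"].items():
                blocks[b].consume(eps)
            admitted.append(job["name"])
    return admitted


blocks = {"day1": PrivacyBlock(epsilon_total=1.0),
          "day2": PrivacyBlock(epsilon_total=1.0)}
jobs = [
    {"name": "big-model",   "demand": {"day1": 0.8, "day2": 0.8}},
    {"name": "small-model", "demand": {"day1": 0.3}},
    {"name": "tiny-model",  "demand": {"day2": 0.2}},
]
print(schedule(blocks, jobs))  # → ['tiny-model', 'small-model']
```

The toy run shows the key difference from CPU scheduling: once `small-model` and `tiny-model` consume their epsilon, `big-model` is rejected permanently, because the spent budget never comes back.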
Related papers
- Privacy Profiles for Private Selection [21.162924003105484]
We work out an easy-to-use recipe that bounds privacy profiles of ReportNoisyMax and PrivateTuning using the privacy profiles of the base algorithms they corral.
Our approach improves over all regimes of interest and leads to substantial benefits in end-to-end private learning experiments.
arXiv Detail & Related papers (2024-02-09T08:31:46Z)
- Differentially Private Model-Based Offline Reinforcement Learning [51.1231068185106]
We introduce DP-MORL, an algorithm coming with differential privacy guarantees.
A private model of the environment is first learned from offline data.
We then use model-based policy optimization to derive a policy from the private model.
arXiv Detail & Related papers (2024-02-08T10:05:11Z)
- Private Fine-tuning of Large Language Models with Zeroth-order Optimization [54.24600476755372]
We introduce DP-ZO, a new method for fine-tuning large language models that preserves the privacy of training data by privatizing zeroth-order optimization.
We show that DP-ZO exhibits just $1.86\%$ performance degradation due to privacy at $(1, 10^{-5})$-DP when fine-tuning OPT-66B on 1000 training samples from SQuAD.
arXiv Detail & Related papers (2024-01-09T03:53:59Z)
- Packing Privacy Budget Efficiently [10.51351125953885]
Differential privacy (DP) provides a rigorous way to bound that leakage under a given budget.
This DP budget can be regarded as a new type of compute resource in workloads of multiple ML models training on user data.
We formulate privacy scheduling as a new type of multidimensional knapsack problem, called privacy knapsack, which maximizes DP budget efficiency.
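The knapsack view of budget packing can be illustrated with a toy single-block instance: choose the subset of training jobs that maximizes total utility without exceeding the block's DP budget. This is a plain 0/1 knapsack sketch under assumed per-job epsilon costs, not the paper's multidimensional privacy-knapsack formulation.

```python
def pack_budget(jobs, budget, step=0.01):
    """0/1 knapsack over one privacy block: maximize total utility while
    the sum of per-job epsilon costs stays within the budget. Epsilon is
    discretized to `step` so standard integer DP applies."""
    cap = round(budget / step)
    best = [0.0] * (cap + 1)               # best[c]: max utility at budget c*step
    chosen = [set() for _ in range(cap + 1)]
    for idx, (eps, utility) in enumerate(jobs):
        cost = round(eps / step)
        for c in range(cap, cost - 1, -1):  # backwards: each job used at most once
            if best[c - cost] + utility > best[c]:
                best[c] = best[c - cost] + utility
                chosen[c] = chosen[c - cost] | {idx}
    return best[cap], chosen[cap]


# jobs as (epsilon cost, utility); total block budget epsilon = 1.0
jobs = [(0.6, 3.0), (0.5, 2.0), (0.4, 2.0), (0.2, 1.0)]
print(pack_budget(jobs, 1.0))  # → (5.0, {0, 2})
```

Note the contrast with fairness-based schedulers: the knapsack objective packs the budget for maximum total utility, which may starve individual jobs that a DRF-style policy would admit.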
arXiv Detail & Related papers (2022-12-26T17:25:02Z)
- TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 points gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z)
- Pre-trained Perceptual Features Improve Differentially Private Image Generation [8.659595986100738]
Training even moderately-sized generative models with differentially private stochastic gradient descent (DP-SGD) is difficult.
We advocate building off a good, relevant representation on an informative public dataset, then learning to model the private data with that representation.
Our work introduces simple yet powerful foundations for reducing the gap between private and non-private deep generative models.
arXiv Detail & Related papers (2022-05-25T16:46:01Z)
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
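The mechanism this entry refers to, injecting noise into individual example gradients, can be sketched in a single DP-SGD step: clip each per-example gradient, sum, add Gaussian noise calibrated to the clipping bound, then average. A minimal NumPy sketch of the standard technique; parameter names and defaults are illustrative, not any specific library's API.

```python
import numpy as np


def dp_sgd_step(params, per_example_grads, lr=0.1, clip_norm=1.0,
                noise_multiplier=1.0, rng=np.random.default_rng(0)):
    """One DP-SGD update. Clipping bounds each example's contribution
    (its sensitivity), which is what lets Gaussian noise of scale
    noise_multiplier * clip_norm yield a DP guarantee."""
    n = per_example_grads.shape[0]
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = per_example_grads * scale          # per-example clipping
    noisy_sum = clipped.sum(axis=0) + rng.normal(
        0.0, noise_multiplier * clip_norm, size=params.shape)
    return params - lr * noisy_sum / n
```

The per-example clipping pass is also the source of the computational overhead the entry mentions: it forces materializing one gradient per example instead of a single averaged minibatch gradient.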
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
- Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve SDP for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z)
- FeO2: Federated Learning with Opt-Out Differential Privacy [34.08435990347253]
Federated learning (FL) is an emerging privacy-preserving paradigm, where a global model is trained at a central server while keeping client data local.
Differential privacy (DP) can be employed to provide privacy guarantees within FL, typically at the cost of a degraded final trained model.
We propose FeO2, a new algorithm for federated learning with opt-out DP, along with a discussion of its advantages over baseline private and personalized FL algorithms.
arXiv Detail & Related papers (2021-10-28T16:08:18Z)
- User-Level Privacy-Preserving Federated Learning: Analysis and Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving private data from mobile terminals (MTs) while training the data into useful models.
From a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
arXiv Detail & Related papers (2020-02-29T10:13:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.