LazyDP: Co-Designing Algorithm-Software for Scalable Training of Differentially Private Recommendation Models
- URL: http://arxiv.org/abs/2404.08847v1
- Date: Fri, 12 Apr 2024 23:32:06 GMT
- Title: LazyDP: Co-Designing Algorithm-Software for Scalable Training of Differentially Private Recommendation Models
- Authors: Juntaek Lim, Youngeun Kwon, Ranggi Hwang, Kiwan Maeng, G. Edward Suh, Minsoo Rhu,
- Abstract summary: We present our characterization of private RecSys training using DP-SGD, root-causing its several performance bottlenecks.
We propose LazyDP, an algorithm-software co-design that addresses the compute and memory challenges of training RecSys with DP-SGD.
Compared to a state-of-the-art DP-SGD training system, we demonstrate that LazyDP provides an average 119x training throughput improvement.
- Score: 8.92538797216985
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Differential privacy (DP) is widely being employed in the industry as a practical standard for privacy protection. While private training of computer vision or natural language processing applications has been studied extensively, the computational challenges of training of recommender systems (RecSys) with DP have not been explored. In this work, we first present our detailed characterization of private RecSys training using DP-SGD, root-causing its several performance bottlenecks. Specifically, we identify DP-SGD's noise sampling and noisy gradient update stage to suffer from a severe compute and memory bandwidth limitation, respectively, causing significant performance overhead in training private RecSys. Based on these findings, we propose LazyDP, an algorithm-software co-design that addresses the compute and memory challenges of training RecSys with DP-SGD. Compared to a state-of-the-art DP-SGD training system, we demonstrate that LazyDP provides an average 119x training throughput improvement while also ensuring mathematically equivalent, differentially private RecSys models to be trained.
Related papers
- Towards Efficient and Scalable Training of Differentially Private Deep Learning [4.543581742916529]
Differentially private gradient descent (DP-SGD) is the standard algorithm for training machine learning models under differential privacy (DP)
The major drawback of DP-SGD is the drop in utility which prior work has comprehensively studied.
We conduct a comprehensive empirical study to quantify the computational cost of training deep learning models under DP and benchmark methods that aim at reducing the cost.
arXiv Detail & Related papers (2024-06-25T06:04:58Z) - Pre-training Differentially Private Models with Limited Public Data [58.945400707033016]
differential privacy (DP) is a prominent method to gauge the degree of security provided to the models.
DP is yet not capable of protecting a substantial portion of the data used during the initial pre-training stage.
We propose a novel DP continual pre-training strategy using only 10% of public data.
arXiv Detail & Related papers (2024-02-28T23:26:27Z) - Sparsity-Preserving Differentially Private Training of Large Embedding
Models [67.29926605156788]
DP-SGD is a training algorithm that combines differential privacy with gradient descent.
Applying DP-SGD naively to embedding models can destroy gradient sparsity, leading to reduced training efficiency.
We present two new algorithms, DP-FEST and DP-AdaFEST, that preserve gradient sparsity during private training of large embedding models.
arXiv Detail & Related papers (2023-11-14T17:59:51Z) - Large Scale Transfer Learning for Differentially Private Image
Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z) - Don't Generate Me: Training Differentially Private Generative Models
with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
arXiv Detail & Related papers (2021-11-01T18:10:21Z) - Large Language Models Can Be Strong Differentially Private Learners [70.0317718115406]
Differentially Private (DP) learning has seen limited success for building large deep learning models of text.
We show that this performance drop can be mitigated with the use of large pretrained models.
We propose a memory saving technique that allows clipping in DP-SGD to run without instantiating per-example gradients.
arXiv Detail & Related papers (2021-10-12T01:45:27Z) - An Efficient DP-SGD Mechanism for Large Scale NLP Models [28.180412581994485]
Data used to train Natural Language Understanding (NLU) models may contain private information such as addresses or phone numbers.
It is desirable that underlying models do not expose private information contained in the training data.
Differentially Private Gradient Descent (DP-SGD) has been proposed as a mechanism to build privacy-preserving models.
arXiv Detail & Related papers (2021-07-14T15:23:27Z) - DPlis: Boosting Utility of Differentially Private Deep Learning via
Randomized Smoothing [0.0]
We propose DPlis--Differentially Private Learning wIth Smoothing.
We show that DPlis can effectively boost model quality and training stability under a given privacy budget.
arXiv Detail & Related papers (2021-03-02T06:33:14Z) - Fast and Memory Efficient Differentially Private-SGD via JL Projections [29.37156662314245]
DP-SGD is the only known algorithm for private training of large scale neural networks.
We present a new framework to design differentially privates called DP-SGD-JL and DP-Adam-JL.
arXiv Detail & Related papers (2021-02-05T06:02:10Z) - User-Level Privacy-Preserving Federated Learning: Analysis and
Performance Optimization [77.43075255745389]
Federated learning (FL) is capable of preserving private data from mobile terminals (MTs) while training the data into useful models.
From a viewpoint of information theory, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
arXiv Detail & Related papers (2020-02-29T10:13:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.