Data Efficient Subset Training with Differential Privacy
- URL: http://arxiv.org/abs/2503.06732v1
- Date: Sun, 09 Mar 2025 19:05:10 GMT
- Title: Data Efficient Subset Training with Differential Privacy
- Authors: Ninad Jayesh Gandhi, Moparthy Venkata Subrahmanya Sri Harsha
- Abstract summary: We adapt GLISTER to the private setting and extensively assess its performance. We empirically find that practical choices of privacy budgets are too restrictive for data-efficient training in the private setting.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Private machine learning introduces a trade-off between the privacy budget and training performance. Training convergence is substantially slower, and extensive hyperparameter tuning is required. Consequently, efficient methods for the private training of models have been investigated thoroughly in the literature. To this end, we investigate the strength of data-efficient model training methods in the private training setting. We adapt GLISTER (Killamsetty et al., 2021b) to the private setting and extensively assess its performance. We empirically find that practical choices of privacy budgets are too restrictive for data-efficient training in the private setting.
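To make the selection step concrete, here is a minimal, hypothetical sketch of one-step GLISTER-style subset selection in PyTorch: score each training point by how well its loss gradient aligns with a held-out validation gradient and keep the top-k. The function names and the one-step greedy simplification are ours, not the paper's code, and any DP accounting for the selection itself is omitted.

```python
# Hypothetical one-step GLISTER-style selection (illustrative, not the
# paper's code): rank training points by gradient alignment with a
# validation batch at the current model weights.
import torch

def select_subset(model, loss_fn, train_xs, train_ys, val_x, val_y, k):
    params = list(model.parameters())
    # Gradient of the validation loss at the current weights.
    g_val = torch.autograd.grad(loss_fn(model(val_x), val_y), params)
    g_val = torch.cat([g.flatten() for g in g_val])

    scores = []
    for x, y in zip(train_xs, train_ys):
        g = torch.autograd.grad(
            loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)), params)
        g = torch.cat([t.flatten() for t in g])
        # Higher alignment with the validation gradient => more useful point.
        scores.append(torch.dot(g, g_val).item())
    # Indices of the k points whose gradients best match the validation signal.
    return torch.tensor(scores).topk(k).indices
```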
Related papers
- Differential Privacy Personalized Federated Learning Based on Dynamically Sparsified Client Updates [12.373620724244475]
We propose a differentially private personalized federated learning approach that employs dynamically sparsified client updates.
Experimental results on EMNIST, CIFAR-10, and CIFAR-100 demonstrate that our proposed scheme achieves superior performance.
arXiv Detail & Related papers (2025-03-12T09:34:05Z)
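The mechanism named in the entry above can be illustrated with a short, hedged sketch (our own simplification, not the paper's algorithm): keep the top-k coordinates of the local update, clip its norm, and add Gaussian noise before sending it to the server.

```python
# Minimal sketch of a sparsified, differentially private client update
# (illustrative; the paper's dynamic sparsification rule is more involved).
import torch

def private_sparse_update(delta, k, clip_norm, noise_multiplier):
    # delta: flattened local update (client_weights - global_weights).
    idx = torch.topk(delta.abs(), k).indices        # keep top-k coordinates
    sparse = torch.zeros_like(delta)
    sparse[idx] = delta[idx]
    scale = torch.clamp(clip_norm / sparse.norm().clamp(min=1e-12), max=1.0)
    sparse = sparse * scale                         # bound the sensitivity
    # Gaussian mechanism on the kept coordinates; a real scheme must also
    # account for the privacy cost of the data-dependent index set.
    sparse[idx] += torch.randn(k) * noise_multiplier * clip_norm
    return sparse
```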
- Task-Oriented Pre-Training for Drivable Area Detection [5.57325257338134]
We propose a task-oriented pre-training method that begins with generating redundant segmentation proposals.
We then introduce a Specific Category Enhancement Fine-tuning (SCEF) strategy for fine-tuning the Contrastive Language-Image Pre-training (CLIP) model.
This approach can generate large amounts of coarse training data for pre-training models, which are then fine-tuned using manually annotated data.
arXiv Detail & Related papers (2024-09-30T10:25:47Z)
- Too Good to be True? Turn Any Model Differentially Private With DP-Weights [0.0]
We introduce a groundbreaking approach that applies differential privacy noise to the model's weights after training.
We offer a comprehensive mathematical proof for this novel approach's privacy bounds.
We empirically evaluate its effectiveness using membership inference attacks and performance evaluations.
arXiv Detail & Related papers (2024-06-27T19:58:11Z)
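As a rough illustration of the post-hoc idea in the DP-Weights entry above, the sketch below perturbs the trained weights directly (textbook output perturbation). Calibrating `sigma` to a formal (epsilon, delta) guarantee is the paper's contribution and is not reproduced here.

```python
# Hedged sketch of output perturbation: noise the weights once after
# training instead of noising gradients during training. `sigma` is a
# placeholder; the paper derives how to calibrate it to a privacy budget.
import torch

def privatize_weights(model, sigma):
    with torch.no_grad():
        for p in model.parameters():
            p.add_(torch.randn_like(p) * sigma)
    return model
```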
- Differentially Private Deep Model-Based Reinforcement Learning [47.651861502104715]
We introduce PriMORL, a model-based RL algorithm with formal differential privacy guarantees.
PriMORL learns an ensemble of trajectory-level DP models of the environment from offline data.
arXiv Detail & Related papers (2024-02-08T10:05:11Z)
- Unlocking Accuracy and Fairness in Differentially Private Image Classification [43.53494043189235]
Differential privacy (DP) is considered the gold standard framework for privacy-preserving training.
We show that pre-trained foundation models fine-tuned with DP can achieve similar accuracy to non-private classifiers.
arXiv Detail & Related papers (2023-08-21T17:42:33Z)
- Can Public Large Language Models Help Private Cross-device Federated Learning? [58.05449579773249]
We study (differentially) private federated learning (FL) of language models.
Public data has been used to improve privacy-utility trade-offs for both large and small language models.
We propose a novel distribution matching algorithm with theoretical grounding to sample public data close to the private data distribution.
arXiv Detail & Related papers (2023-05-20T07:55:58Z)
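A heavily simplified sketch of the distribution-matching idea in the entry above (our illustration, not the paper's algorithm): rank public examples by how close their embeddings lie to a summary statistic of the private data. In the actual private setting, any such statistic of the private data would itself have to be released with DP noise.

```python
# Illustrative distribution matching: keep the public examples whose
# embeddings lie closest to the mean private embedding. `embed` is an
# assumed feature extractor; the private mean would need DP noise in a
# real deployment.
import torch

def match_public_to_private(embed, public_xs, private_xs, n_keep):
    with torch.no_grad():
        pub = torch.stack([embed(x) for x in public_xs])
        priv_mean = torch.stack([embed(x) for x in private_xs]).mean(dim=0)
        dists = (pub - priv_mean).norm(dim=1)
    return torch.topk(dists, n_keep, largest=False).indices
```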
- Incentivising the federation: gradient-based metrics for data selection and valuation in private decentralised training [15.233103072063951]
We investigate how gradient information can let participants in private training settings select the data most beneficial for the jointly trained model.
We show that these techniques can provide the federated clients with tools for principled data selection even in stricter privacy settings.
arXiv Detail & Related papers (2023-05-04T15:44:56Z)
- Why Is Public Pretraining Necessary for Private Model Training? [50.054565310457306]
We show that pretraining on publicly available data leads to distinct gains over non-private settings.
We argue that the tradeoff may be a deeper one: a non-convex loss landscape can require the optimization algorithm to go through two phases.
Guided by this intuition, we provide theoretical constructions that provably demonstrate the separation between private training with and without public pretraining.
arXiv Detail & Related papers (2023-02-19T05:32:20Z)
- Position: Considerations for Differentially Private Learning with Large-Scale Public Pretraining [75.25943383604266]
We question whether the use of large Web-scraped datasets should be viewed as differential-privacy-preserving.
We caution that publicizing these models pretrained on Web data as "private" could lead to harm and erode the public's trust in differential privacy as a meaningful definition of privacy.
We conclude by discussing potential paths forward for the field of private learning, as public pretraining becomes more popular and powerful.
arXiv Detail & Related papers (2022-12-13T10:41:12Z)
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example-level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
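The DP-SGD mechanism referenced in the entry above follows Abadi et al. (2016): clip each per-example gradient, sum, add Gaussian noise, and average. Below is a minimal PyTorch sketch; the naive per-example loop is exactly the overhead that makes DP-SGD costly relative to ordinary SGD.

```python
# Textbook DP-SGD step: per-example clipping plus Gaussian noise.
import torch

def dp_sgd_step(model, loss_fn, xs, ys, lr, clip_norm, noise_multiplier):
    params = list(model.parameters())
    summed = [torch.zeros_like(p) for p in params]
    for x, y in zip(xs, ys):
        grads = torch.autograd.grad(
            loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)), params)
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = torch.clamp(clip_norm / (norm + 1e-12), max=1.0)
        for s, g in zip(summed, grads):
            s.add_(g * scale)                       # clipped contribution
    with torch.no_grad():
        for p, s in zip(params, summed):
            noise = torch.randn_like(s) * noise_multiplier * clip_norm
            p.add_(-(lr / len(xs)) * (s + noise))   # noisy averaged update
```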
- Personalization Improves Privacy-Accuracy Tradeoffs in Federated Optimization [57.98426940386627]
We show that coordinating local learning with private centralized learning yields a generically useful and improved tradeoff between accuracy and privacy.
We illustrate our theoretical results with experiments on synthetic and real-world datasets.
arXiv Detail & Related papers (2022-02-10T20:44:44Z)
- Large Language Models Can Be Strong Differentially Private Learners [70.0317718115406]
Differentially Private (DP) learning has seen limited success for building large deep learning models of text.
We show that this performance drop can be mitigated with the use of large pretrained models.
We propose a memory saving technique that allows clipping in DP-SGD to run without instantiating per-example gradients.
arXiv Detail & Related papers (2021-10-12T01:45:27Z)
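One way to avoid materializing per-example gradients, in the spirit of the memory-saving claim in the entry above (the paper's exact technique may differ): for a linear layer the per-example weight gradient is an outer product, so its norm factorizes into quantities a standard backward pass already produces.

```python
# For a linear layer, the per-example weight gradient is b_i a_i^T, where
# a_i is the layer input and b_i the gradient w.r.t. the layer output. Its
# Frobenius norm is ||a_i|| * ||b_i||, so clipping norms never require the
# (batch, d_out, d_in) per-example gradient tensor itself.
import torch

def per_example_grad_norms(a, b):
    # a: (batch, d_in) inputs; b: (batch, d_out) output gradients.
    return a.norm(dim=1) * b.norm(dim=1)
```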
- DPlis: Boosting Utility of Differentially Private Deep Learning via Randomized Smoothing [0.0]
We propose DPlis (Differentially Private Learning wIth Smoothing).
We show that DPlis can effectively boost model quality and training stability under a given privacy budget.
arXiv Detail & Related papers (2021-03-02T06:33:14Z)
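A minimal sketch of a randomized-smoothing surrogate in the spirit of the DPlis entry above (not the paper's exact procedure): estimate the gradient of the smoothed loss E_u[L(w + u)] by averaging gradients at a few Gaussian perturbations of the weights.

```python
# Hedged sketch: gradient of a Gaussian-smoothed loss, estimated with k
# noise draws. `sigma` and `k` are illustrative hyperparameters.
import torch

def smoothed_grad(model, loss_fn, x, y, sigma=0.01, k=4):
    params = list(model.parameters())
    avg = [torch.zeros_like(p) for p in params]
    for _ in range(k):
        noise = [torch.randn_like(p) * sigma for p in params]
        with torch.no_grad():
            for p, u in zip(params, noise):
                p.add_(u)                           # perturb weights
        grads = torch.autograd.grad(loss_fn(model(x), y), params)
        with torch.no_grad():
            for p, u in zip(params, noise):
                p.sub_(u)                           # restore weights
        for a, g in zip(avg, grads):
            a.add_(g / k)
    return avg
```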