The importance of feature preprocessing for differentially private
linear optimization
- URL: http://arxiv.org/abs/2307.11106v2
- Date: Mon, 19 Feb 2024 21:11:32 GMT
- Title: The importance of feature preprocessing for differentially private
linear optimization
- Authors: Ziteng Sun, Ananda Theertha Suresh, Aditya Krishna Menon
- Abstract summary: One of the most popular algorithms for training differentially private models is differentially private stochastic gradient descent (DPSGD).
We show that even for the simple case of linear classification, unlike non-private optimization, (private) feature preprocessing is vital for differentially private optimization.
We propose an algorithm called DPSGD-F, which combines DPSGD with feature preprocessing, and prove that for classification tasks it incurs an optimality gap proportional to the diameter of the features.
- Score: 38.125699428109826
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training machine learning models with differential privacy (DP) has received
increasing interest in recent years. One of the most popular algorithms for
training differentially private models is differentially private stochastic
gradient descent (DPSGD) and its variants, where at each step gradients are
clipped and combined with some noise. Given the increasing usage of DPSGD, we
ask the question: is DPSGD alone sufficient to find a good minimizer for every
dataset under privacy constraints? Towards answering this question, we show
that even for the simple case of linear classification, unlike non-private
optimization, (private) feature preprocessing is vital for differentially
private optimization. In detail, we first show theoretically that there exists
an example where without feature preprocessing, DPSGD incurs an optimality gap
proportional to the maximum Euclidean norm of features over all samples. We
then propose an algorithm called DPSGD-F, which combines DPSGD with feature
preprocessing and prove that for classification tasks, it incurs an optimality
gap proportional to the diameter of the features $\max_{x, x' \in D} \|x -
x'\|_2$. We finally demonstrate the practicality of our algorithm on image
classification benchmarks.
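The two ingredients named in the abstract can be sketched in a few lines of NumPy. The block below is an illustrative assumption, not the paper's DPSGD-F: it shows one DPSGD-style step (per-example clipping plus Gaussian noise) for a linear classifier with logistic loss, and compares the maximum feature norm with the feature diameter on synthetic data that shares a large offset. The function names, the loss, the hyperparameters, and the data are all hypothetical, and the centering step uses a non-private mean purely for illustration, whereas the paper calls for private preprocessing.

```python
import numpy as np


def max_norm(X):
    """Largest Euclidean norm of any feature vector (rows of X)."""
    return float(np.linalg.norm(X, axis=1).max())


def diameter(X):
    """Largest pairwise Euclidean distance between rows of X (O(n^2) memory)."""
    diffs = X[:, None, :] - X[None, :, :]
    return float(np.linalg.norm(diffs, axis=2).max())


def dpsgd_step(w, X, y, lr, clip_norm, noise_multiplier, rng):
    """One DPSGD-style step on the logistic loss, labels y in {-1, +1}.

    Per-example gradients are clipped to clip_norm, summed, perturbed with
    Gaussian noise of std noise_multiplier * clip_norm, then averaged.
    """
    margins = y * (X @ w)
    # sigmoid(-margin) written with tanh for numerical stability.
    sig = 0.5 * (1.0 - np.tanh(margins / 2.0))
    grads = (-y * sig)[:, None] * X          # one gradient row per example
    norms = np.linalg.norm(grads, axis=1, keepdims=True)
    scale = np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    clipped = grads * scale
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=w.shape)
    return w - lr * (clipped.sum(axis=0) + noise) / len(X)


# Synthetic features with a large shared offset (assumed for illustration).
rng = np.random.default_rng(0)
n, d = 200, 5
offset = 100.0 * np.ones(d)
X = offset + rng.normal(size=(n, d))
y = np.where(X[:, 0] > offset[0], 1.0, -1.0)

# Illustration only: this mean is NOT computed privately.
X_centered = X - X.mean(axis=0)

print("max norm (raw):     ", max_norm(X))
print("diameter:           ", diameter(X))
print("max norm (centered):", max_norm(X_centered))

w = dpsgd_step(np.zeros(d), X_centered, y, lr=0.1,
               clip_norm=1.0, noise_multiplier=1.0, rng=rng)
```

On data like this the diameter is far smaller than the maximum norm, because the shared offset dominates every feature vector. Centering moves the features to a scale bounded by the diameter, which is the quantity the abstract's second bound depends on.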
Related papers
- DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction [57.83978915843095]
This paper introduces DiSK, a novel framework designed to significantly enhance the performance of differentially private optimizers.
To ensure practicality for large-scale training, we simplify the Kalman filtering process, minimizing its memory and computational demands.
arXiv Detail & Related papers (2024-10-04T19:30:39Z)
- Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z)
- Private Fine-tuning of Large Language Models with Zeroth-order Optimization [51.19403058739522]
Differentially private stochastic gradient descent (DP-SGD) allows models to be trained in a privacy-preserving manner.
We introduce DP-ZO, a private fine-tuning framework for large language models by privatizing zeroth order optimization methods.
arXiv Detail & Related papers (2024-01-09T03:53:59Z)
- Differentially Private SGD Without Clipping Bias: An Error-Feedback Approach [62.000948039914135]
Using Differentially Private Stochastic Gradient Descent with Gradient Clipping (DPSGD-GC) to ensure Differential Privacy (DP) comes at the cost of model performance degradation.
We propose a new error-feedback (EF) DP algorithm as an alternative to DPSGD-GC.
We establish an algorithm-specific DP analysis for our proposed algorithm, providing privacy guarantees based on Rényi DP.
arXiv Detail & Related papers (2023-11-24T17:56:44Z)
- DPAF: Image Synthesis via Differentially Private Aggregation in Forward Phase [14.76128148793876]
DPAF is an effective differentially private generative model for high-dimensional image synthesis.
It reduces the information loss caused by gradient clipping and has low sensitivity for the aggregation.
It also tackles the problem of setting a proper batch size by proposing a novel training strategy that asymmetrically trains different parts of the discriminator.
arXiv Detail & Related papers (2023-04-20T16:32:02Z)
- Differentially Private Learning with Per-Sample Adaptive Clipping [8.401653565794353]
We propose a Differentially Private Per-Sample Adaptive Clipping (DP-PSAC) algorithm based on a non-monotonic adaptive weight function.
We show that DP-PSAC outperforms or matches state-of-the-art methods on multiple mainstream vision and language tasks.
arXiv Detail & Related papers (2022-12-01T07:26:49Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)