Differentially Private Temporal Difference Learning with Stochastic
Nonconvex-Strongly-Concave Optimization
- URL: http://arxiv.org/abs/2201.10447v1
- Date: Tue, 25 Jan 2022 16:48:29 GMT
- Title: Differentially Private Temporal Difference Learning with Stochastic
Nonconvex-Strongly-Concave Optimization
- Authors: Canzhe Zhao, Yanjie Ze, Jing Dong, Baoxiang Wang, Shuai Li
- Abstract summary: Temporal difference (TD) learning is a widely used method to evaluate policies in reinforcement learning.
In this paper, we consider preserving privacy in TD learning with a nonlinear value function.
We show that DPTD could provide an $(\epsilon,\delta)$-differential privacy (DP) guarantee for the sensitive information encoded in transitions and retain the original power of TD learning.
- Score: 17.361143427007224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Temporal difference (TD) learning is a widely used method to evaluate
policies in reinforcement learning. While many TD learning methods have been developed in recent years, little attention has been paid to preserving privacy, and most existing approaches might raise data-privacy concerns for users. To enable complex representational abilities of policies, in this paper we consider preserving privacy in TD learning with nonlinear value
function approximation. This is challenging because such a nonlinear problem is
usually studied in the formulation of stochastic nonconvex-strongly-concave
optimization to gain finite-sample analysis, which would require simultaneously
preserving privacy on both the primal and dual sides. To this end, we employ momentum-based stochastic gradient descent ascent to obtain a single-timescale algorithm, and we achieve a good trade-off between meaningful privacy and utility guarantees on both sides by perturbing the primal and dual gradients with well-calibrated Gaussian noise. As a result, our DPTD
algorithm could provide an $(\epsilon,\delta)$-differential privacy (DP) guarantee
for the sensitive information encoded in transitions and retain the original
power of TD learning, with the utility upper bounded by
$\widetilde{\mathcal{O}}(\frac{(d\log(1/\delta))^{1/8}}{(n\epsilon)^{1/4}})$
(The tilde in this paper hides the log factor.), where $n$ is the trajectory
length and $d$ is the dimension. Extensive experiments conducted in OpenAI Gym
show the advantages of our proposed algorithm.
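To make the mechanism concrete, below is a minimal sketch of one noisy single-timescale momentum SGDA step of the kind the abstract describes. This is not the authors' DPTD implementation; the clipping threshold C, noise multiplier sigma, momentum beta, and step size are illustrative assumptions.

```python
import numpy as np

def clip(g, C):
    # Scale g down so its l2 norm is at most C (bounds sensitivity).
    return g * min(1.0, C / (np.linalg.norm(g) + 1e-12))

def dp_momentum_sgda_step(theta, w, m_theta, m_w, grad_theta, grad_w,
                          C=1.0, sigma=1.0, beta=0.9, lr=0.01,
                          rng=np.random.default_rng(0)):
    # Perturb BOTH sides: the primal and dual gradients each touch the
    # sensitive transitions, so each gets calibrated Gaussian noise.
    g_p = clip(grad_theta, C) + rng.normal(0.0, sigma * C, size=theta.shape)
    g_d = clip(grad_w, C) + rng.normal(0.0, sigma * C, size=w.shape)
    # Momentum averages the noisy gradients; both variables move in the
    # same loop (single timescale), with no inner maximization.
    m_theta = beta * m_theta + (1.0 - beta) * g_p
    m_w = beta * m_w + (1.0 - beta) * g_d
    theta = theta - lr * m_theta  # primal: descent
    w = w + lr * m_w              # dual: ascent
    return theta, w, m_theta, m_w
```

Momentum is what makes a single timescale plausible here: averaging the noisy stochastic gradients stabilizes the dual ascent enough that no inner maximization loop is needed.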
Related papers
- Privacy for Free in the Over-Parameterized Regime [19.261178173399784]
Differentially private gradient descent (DP-GD) is a popular algorithm to train deep learning models with provable guarantees on the privacy of the training data.
In this work, we show that in the popular random features model with quadratic loss, for any sufficiently large $p$, privacy can be obtained for free, i.e., $\left|R_P\right| = o(1)$, not only when the privacy parameter $\varepsilon$ has constant order, but also in the strongly private setting $\varepsilon = o(1)$.
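As background, DP-GD itself follows a standard template: clip each per-example gradient, average, and add Gaussian noise. The sketch below is that generic template under assumed parameters, not the paper's random-features analysis.

```python
import numpy as np

def dp_gd(per_example_grads, theta, steps=100, lr=0.1, C=1.0, sigma=1.0,
          rng=np.random.default_rng(0)):
    # per_example_grads(theta) -> (n, d) array: one loss gradient per example.
    for _ in range(steps):
        G = per_example_grads(theta)
        norms = np.linalg.norm(G, axis=1, keepdims=True)
        G = G * np.minimum(1.0, C / (norms + 1e-12))   # per-example clipping
        noisy = G.sum(axis=0) + rng.normal(0.0, sigma * C, size=theta.shape)
        theta = theta - lr * noisy / len(G)            # noisy average gradient
    return theta
```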
arXiv Detail & Related papers (2024-10-18T18:01:11Z) - Directional Privacy for Deep Learning [2.826489388853448]
Differentially Private Stochastic Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models.
Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable for preserving utility.
We show that this provides both $\epsilon$-DP and $\epsilon d$-privacy for deep learning training, rather than the $(\epsilon, \delta)$-privacy of the Gaussian mechanism.
arXiv Detail & Related papers (2022-11-09T05:18:08Z) - Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection.
This paper addresses the practical and theoretical importance of developing privacy-preserving machine learning algorithms that ensure good performance while preserving privacy.
arXiv Detail & Related papers (2022-09-09T08:54:13Z) - Normalized/Clipped SGD with Perturbation for Differentially Private
Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
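The practical difference between the two algorithms is only how each per-example gradient is bounded before noise is added: DP-SGD clips, DP-NSGD normalizes. A hedged sketch of both templates follows; the regularizer r is an assumed stabilizer, not a value from the paper.

```python
import numpy as np

def bound_gradient(g, C=1.0, mode="clip", r=1e-2):
    norm = np.linalg.norm(g)
    if mode == "clip":
        # DP-SGD: shrink only gradients whose norm exceeds C.
        return g * min(1.0, C / (norm + 1e-12))
    # DP-NSGD: rescale every gradient to roughly norm C; r avoids
    # amplifying near-zero gradients (assumed regularizer).
    return g * (C / (norm + r))

def private_batch_gradient(per_example_grads, C=1.0, sigma=1.0, mode="clip",
                           rng=np.random.default_rng(0)):
    bounded = np.stack([bound_gradient(g, C, mode) for g in per_example_grads])
    noise = rng.normal(0.0, sigma * C, size=bounded.shape[1:])
    return (bounded.sum(axis=0) + noise) / len(bounded)
```

Normalization removes the clipping threshold as a delicate hyperparameter, which is one intuition for why DP-NSGD is reported to be easier to tune.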
arXiv Detail & Related papers (2022-06-27T03:45:02Z) - On Private Online Convex Optimization: Optimal Algorithms in
$\ell_p$-Geometry and High Dimensional Contextual Bandits [9.798304879986604]
We study the differentially private (DP) convex optimization problem with streaming data that is sampled from a distribution and arrives sequentially.
We also consider the continual release model, where parameters related to private information are updated and released upon each new data point, a setting often known as online algorithms.
Our algorithm achieves the optimal excess risk in linear time when $1 \leq p \leq 2$, and state-of-the-art excess risk matching the non-private lower bounds when $2 < p \leq \infty$.
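The continual release model mentioned here can be illustrated with the classical binary-tree counting mechanism, which publishes a noisy running sum after every arrival. This is a textbook sketch rather than the paper's algorithm, and it assumes each value lies in [0, 1].

```python
import math
import numpy as np

def private_running_sums(stream, eps, rng=np.random.default_rng(0)):
    # Each element contributes to at most ~log2(T) tree nodes, so Laplace
    # noise of scale levels/eps per node gives eps-DP for the whole stream.
    T = len(stream)
    levels = max(1, math.ceil(math.log2(T + 1)))
    alpha = [0.0] * (levels + 1)       # exact partial sums per level
    alpha_hat = [0.0] * (levels + 1)   # their noisy releases
    out = []
    for t, x in enumerate(stream, start=1):
        i = (t & -t).bit_length() - 1          # lowest set bit of t
        alpha[i] = sum(alpha[:i]) + x          # merge lower levels into level i
        for j in range(i):
            alpha[j] = alpha_hat[j] = 0.0
        alpha_hat[i] = alpha[i] + rng.laplace(0.0, levels / eps)
        # The released prefix sum combines the nodes in t's binary expansion.
        out.append(sum(alpha_hat[j] for j in range(levels + 1) if t >> j & 1))
    return out
```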
arXiv Detail & Related papers (2022-06-16T12:09:47Z) - Differentially Private Federated Learning via Inexact ADMM [0.0]
Differential privacy (DP) techniques can be applied to the federated learning model to protect data privacy against inference attacks.
We develop a DP inexact alternating direction method of multipliers algorithm that solves a sequence of trust-region subproblems.
Our algorithm reduces the testing error by at most $22\%$ compared with the existing DP algorithm, while achieving the same level of data privacy.
arXiv Detail & Related papers (2021-06-11T02:28:07Z) - Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z) - Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy
Amplification by Shuffling [49.43288037509783]
We show that random shuffling amplifies differential privacy guarantees of locally randomized data.
Our result is based on a new approach that is simpler than previous work and extends to approximate differential privacy with nearly the same guarantees.
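The shuffle model itself is easy to simulate: every user applies a local randomizer (binary randomized response here, purely as an illustrative choice), and the analyst only sees a uniformly permuted multiset of reports. A minimal sketch:

```python
import math
import random

def randomized_response(bit, eps0, rng):
    # eps0-locally-DP randomizer: keep the true bit w.p. e^eps0/(e^eps0+1).
    p_keep = math.exp(eps0) / (math.exp(eps0) + 1.0)
    return bit if rng.random() < p_keep else 1 - bit

def shuffle_and_collect(bits, eps0, rng=random.Random(0)):
    # Local randomization followed by a uniform shuffle. Analyses of this
    # model show the shuffled output satisfies central (eps, delta)-DP with
    # eps scaling roughly like eps0 * sqrt(log(1/delta) / n) for small eps0.
    reports = [randomized_response(b, eps0, rng) for b in bits]
    rng.shuffle(reports)  # the anonymizing step that amplifies privacy
    return reports
```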
arXiv Detail & Related papers (2020-12-23T17:07:26Z) - Private Stochastic Non-Convex Optimization: Adaptive Algorithms and
Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization with tighter generalization bounds.
We demonstrate the empirical advantages of our adaptive methods over standard gradient methods on two popular deep learning tasks.
arXiv Detail & Related papers (2020-06-24T06:01:24Z) - User-Level Privacy-Preserving Federated Learning: Analysis and
Performance Optimization [77.43075255745389]
Federated learning (FL) can keep the private data of mobile terminals (MTs) local while training useful models from that data.
From an information-theoretic viewpoint, however, it is still possible for a curious server to infer private information from the shared models uploaded by MTs.
We propose a user-level differential privacy (UDP) algorithm by adding artificial noise to the shared models before uploading them to servers.
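The described mechanism, perturbing the shared model before upload, can be sketched with the standard Gaussian-mechanism calibration below; the UDP paper's exact noise calibration may differ, so treat the parameters as assumptions.

```python
import numpy as np

def privatize_update(model_update, clip_norm=1.0, eps=1.0, delta=1e-5,
                     rng=np.random.default_rng(0)):
    # Clipping bounds each user's contribution; the noise std uses the
    # standard calibration sigma = C * sqrt(2 ln(1.25/delta)) / eps for one
    # (eps, delta)-DP release of this update (valid for eps <= 1).
    norm = np.linalg.norm(model_update)
    clipped = model_update * min(1.0, clip_norm / (norm + 1e-12))
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return clipped + rng.normal(0.0, sigma, size=model_update.shape)
```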
arXiv Detail & Related papers (2020-02-29T10:13:39Z)