Differentially Private Temporal Difference Learning with Stochastic
Nonconvex-Strongly-Concave Optimization
- URL: http://arxiv.org/abs/2201.10447v1
- Date: Tue, 25 Jan 2022 16:48:29 GMT
- Title: Differentially Private Temporal Difference Learning with Stochastic
Nonconvex-Strongly-Concave Optimization
- Authors: Canzhe Zhao, Yanjie Ze, Jing Dong, Baoxiang Wang, Shuai Li
- Abstract summary: Temporal difference (TD) learning is a widely used method to evaluate policies in reinforcement learning.
In this paper, we consider preserving privacy in TD learning with a nonlinear value function.
We show that DPTD provides an $(\epsilon,\delta)$-differential privacy (DP) guarantee for sensitive information encoded in transitions and retains the original power of TD learning.
- Score: 17.361143427007224
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Temporal difference (TD) learning is a widely used method to evaluate
policies in reinforcement learning. While many TD learning methods have been
developed in recent years, little attention has been paid to preserving privacy,
and most existing approaches may raise data-privacy concerns for users. To
enable complex representational abilities of policies, in this
paper, we consider preserving privacy in TD learning with nonlinear value
function approximation. This is challenging because such a nonlinear problem is
usually studied in the formulation of stochastic nonconvex-strongly-concave
optimization to gain finite-sample analysis, which would require simultaneously
preserving the privacy on primal and dual sides. To this end, we employ a
momentum-based stochastic gradient descent ascent to obtain a single-timescale
algorithm, and achieve a good trade-off between meaningful privacy and utility
guarantees on both the primal and dual sides by perturbing the gradients on
both sides with well-calibrated Gaussian noise. As a result, our DPTD
algorithm could provide $(\epsilon,\delta)$-differential privacy (DP) guarantee
for the sensitive information encoded in transitions and retain the original
power of TD learning, with the utility upper bounded by
$\widetilde{\mathcal{O}}(\frac{(d\log(1/\delta))^{1/8}}{(n\epsilon)^{1/4}})$
(the tilde hides logarithmic factors), where $n$ is the trajectory
length and $d$ is the dimension. Extensive experiments conducted in OpenAI Gym
show the advantages of our proposed algorithm.
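The update described in the abstract pairs momentum-based stochastic gradient descent ascent with Gaussian perturbation of both the primal and dual gradients. A minimal sketch of one such step, assuming simple L2 clipping and a shared noise multiplier `sigma` (all function and parameter names here are illustrative, not the paper's):

```python
import numpy as np

def clip(g, c):
    """Scale gradient g so its L2 norm is at most c."""
    norm = np.linalg.norm(g)
    return g * min(1.0, c / norm) if norm > 0 else g

def dp_sgda_step(theta, w, grad_theta, grad_w, state,
                 lr=0.01, beta=0.9, clip_c=1.0, sigma=1.0, rng=None):
    """One momentum-based stochastic gradient descent-ascent step with
    Gaussian perturbation on both the primal (theta) and dual (w) gradients."""
    rng = np.random.default_rng() if rng is None else rng
    # Clip, then perturb each side's gradient with calibrated Gaussian noise.
    g_p = clip(grad_theta, clip_c) + rng.normal(0.0, sigma * clip_c, size=theta.shape)
    g_d = clip(grad_w, clip_c) + rng.normal(0.0, sigma * clip_c, size=w.shape)
    # Single-timescale: both momentum buffers update with the same step.
    state["m_p"] = beta * state["m_p"] + (1 - beta) * g_p
    state["m_d"] = beta * state["m_d"] + (1 - beta) * g_d
    theta = theta - lr * state["m_p"]  # descent on the primal side
    w = w + lr * state["m_d"]          # ascent on the dual side
    return theta, w, state
```

Calibrating `sigma` against the clipping threshold and the number of iterations is what yields the $(\epsilon,\delta)$-DP guarantee; the sketch omits that privacy accounting.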
Related papers
- Smoothed Normalization for Efficient Distributed Private Optimization [54.197255548244705]
Federated learning enables training machine learning models while preserving the privacy of participants.
There is no differentially private distributed method for such training problems.
We introduce a new distributed algorithm $\alpha$-$\mathsf{NormEC}$ with provable convergence guarantees.
arXiv Detail & Related papers (2025-02-19T07:10:32Z)
- Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.
Current methods, such as those based on differentially private gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.
We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
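The robust statistics named above can be illustrated with a trimmed mean, which discards extreme values before averaging so that a few outliers cannot dominate the estimate; a minimal sketch (the function name and trimming rule are illustrative, not the paper's algorithm):

```python
import numpy as np

def trimmed_mean(x, trim_frac=0.1):
    """Average after discarding the smallest and largest trim_frac of values."""
    x = np.sort(np.asarray(x, dtype=float))
    k = int(len(x) * trim_frac)  # number of values trimmed from each end
    if len(x) <= 2 * k:
        return float(np.median(x))  # too few values left; fall back to median
    return float(x[k:len(x) - k].mean())
```

Bounding each value's influence this way is what limits the sensitivity of the estimate, and hence the noise a DP mechanism must add on top of it.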
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
- Privacy for Free in the Over-Parameterized Regime [19.261178173399784]
Differentially private gradient descent (DP-GD) is a popular algorithm to train deep learning models with provable guarantees on the privacy of the training data.
In this work, we show that in the popular random features model with quadratic loss, for any sufficiently large $p$, privacy can be obtained for free, i.e., $\left|R_P\right| = o(1)$, not only when the privacy parameter $\varepsilon$ has constant order, but also in the strongly private setting $\varepsilon = o(1)$.
arXiv Detail & Related papers (2024-10-18T18:01:11Z)
- Directional Privacy for Deep Learning [2.826489388853448]
Differentially Private Gradient Descent (DP-SGD) is a key method for applying privacy in the training of deep learning models.
Metric DP, however, can provide alternative mechanisms based on arbitrary metrics that might be more suitable for preserving utility.
We show that this provides both $\epsilon$-DP and $\epsilon d$-privacy for deep learning training, rather than the $(\epsilon, \delta)$-privacy of the Gaussian mechanism.
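The $(\epsilon, \delta)$ guarantee of the Gaussian mechanism that this entry contrasts against comes from calibrating noise to the query's L2 sensitivity; a standard sketch, using the classical calibration valid for $\epsilon < 1$ (names are illustrative):

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng=None):
    """Release value + Gaussian noise calibrated to its L2 sensitivity,
    satisfying (epsilon, delta)-DP for epsilon < 1 (classical bound)."""
    rng = np.random.default_rng() if rng is None else rng
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma, size=np.shape(value))
```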
arXiv Detail & Related papers (2022-11-09T05:18:08Z)
- Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection.
This paper addresses the practical and theoretical importance of developing machine learning algorithms that ensure good performance while preserving privacy.
arXiv Detail & Related papers (2022-09-09T08:54:13Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
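The difference between DP-SGD's clipping and DP-NSGD's normalization can be sketched as two ways of bounding a per-sample gradient before Gaussian noise is added (the function name and the regularizer `reg` are illustrative, not the papers' exact formulations):

```python
import numpy as np

def dp_private_grad(g, clip_c, sigma, rng, normalize=False, reg=0.01):
    """Privatize one per-sample gradient: clip it (DP-SGD style) or
    normalize it (DP-NSGD style), then add Gaussian noise."""
    norm = np.linalg.norm(g)
    if normalize:
        g = g / (norm + reg)  # normalization with a small regularizer
    else:
        g = g * min(1.0, clip_c / max(norm, 1e-12))  # standard clipping
    return g + rng.normal(0.0, sigma, size=g.shape)
```

Clipping leaves small gradients untouched, while normalization rescales every gradient to roughly unit length, which is part of why the two variants trade off tuning difficulty differently.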
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
- Differentially Private Federated Learning via Inexact ADMM [0.0]
Differential privacy (DP) techniques can be applied to the federated learning model to protect data privacy against inference attacks.
We develop a DP inexact alternating direction method of multipliers algorithm that solves a sequence of trust-region subproblems.
Our algorithm reduces the testing error by at most $22\%$ compared with the existing DP algorithm, while achieving the same level of data privacy.
arXiv Detail & Related papers (2021-06-11T02:28:07Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Private Stochastic Non-Convex Optimization: Adaptive Algorithms and Tighter Generalization Bounds [72.63031036770425]
We propose differentially private (DP) algorithms for stochastic non-convex optimization.
We demonstrate the empirical advantages over standard gradient methods on two popular deep learning tasks.
arXiv Detail & Related papers (2020-06-24T06:01:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.