Improved Regret for Differentially Private Exploration in Linear MDP
- URL: http://arxiv.org/abs/2202.01292v1
- Date: Wed, 2 Feb 2022 21:32:09 GMT
- Title: Improved Regret for Differentially Private Exploration in Linear MDP
- Authors: Dung Daniel Ngo, Giuseppe Vietri, Zhiwei Steven Wu
- Abstract summary: We study privacy-preserving exploration in sequential decision-making for environments that rely on sensitive data such as medical records.
We provide a private algorithm with an improved regret rate with an optimal dependence of $O(\sqrt{K})$ on the number of episodes.
- Score: 31.567811502343552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study privacy-preserving exploration in sequential decision-making for
environments that rely on sensitive data such as medical records. In
particular, we focus on solving the problem of reinforcement learning (RL)
subject to the constraint of (joint) differential privacy in the linear MDP
setting, where both dynamics and rewards are given by linear functions. Prior
work on this problem due to Luyo et al. (2021) achieves a regret rate that has
a dependence of $O(K^{3/5})$ on the number of episodes $K$. We provide a
private algorithm with an improved regret rate with an optimal dependence of
$O(\sqrt{K})$ on the number of episodes. The key recipe for our stronger regret
guarantee is the adaptivity in the policy update schedule, in which an update
only occurs when sufficient changes in the data are detected. As a result, our
algorithm benefits from low switching cost and only performs $O(\log(K))$
updates, which greatly reduces the amount of privacy noise. Finally, in the
most prevalent privacy regimes where the privacy parameter $\epsilon$ is a
constant, our algorithm incurs negligible privacy cost -- in comparison with
the existing non-private regret bounds, the additional regret due to privacy
appears in lower-order terms.
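To make the adaptive schedule concrete: in low-switching algorithms for linear MDPs, a policy update is typically triggered only when the determinant of the feature Gram matrix has grown by a constant factor since the last switch, which can happen only $O(\log K)$ times. The Python sketch below illustrates this kind of trigger on a toy feature stream; the names, the threshold of 2, and the loop are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def should_update(cov, cov_at_last_update, threshold=2.0):
    """Low-switching trigger: update only when det(cov) has grown by
    a factor of `threshold` since the last policy switch."""
    _, logdet_now = np.linalg.slogdet(cov)
    _, logdet_last = np.linalg.slogdet(cov_at_last_update)
    return logdet_now - logdet_last > np.log(threshold)

d, K = 4, 1000                      # feature dimension, episodes
rng = np.random.default_rng(0)
cov = np.eye(d)                     # regularized Gram matrix
cov_at_last_update = cov.copy()
num_updates = 0

for k in range(K):
    phi = rng.normal(size=d)        # feature of a visited (s, a) pair
    cov += np.outer(phi, phi)
    if should_update(cov, cov_at_last_update):
        # A private algorithm would inject calibrated noise here; since
        # only O(log K) switches occur, little total noise is needed.
        cov_at_last_update = cov.copy()
        num_updates += 1

print(f"{num_updates} policy updates over {K} episodes")  # grows ~ log K
```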
Related papers
- Privacy for Free in the Over-Parameterized Regime [19.261178173399784]
Differentially private gradient descent (DP-GD) is a popular algorithm to train deep learning models with provable guarantees on the privacy of the training data.
In this work, we show that in the popular random features model with quadratic loss, for any sufficiently large $p$, privacy can be obtained for free, i.e., $\left|R_P\right| = o(1)$, not only when the privacy parameter $\varepsilon$ has constant order, but also in the strongly private setting $\varepsilon = o(1)$.
arXiv Detail & Related papers (2024-10-18T18:01:11Z)
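For reference, DP-GD follows the standard clip-and-noise template on full-batch gradients. The sketch below uses quadratic loss to match the entry above; the clipping norm, noise multiplier, and step size are illustrative assumptions, not the paper's random-features analysis.

```python
import numpy as np

def dp_gd(X, y, steps=200, lr=0.1, clip=1.0, noise_mult=1.0, seed=0):
    """Differentially private gradient descent (DP-GD), quadratic loss.

    Per-example gradients are clipped to L2 norm `clip`, summed, and
    Gaussian noise with standard deviation `noise_mult * clip` is added
    before each (averaged) step.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residual = X @ w - y                           # shape (n,)
        grads = residual[:, None] * X                  # per-example grads
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads / np.maximum(1.0, norms / clip)  # clip each row
        noise = rng.normal(0.0, noise_mult * clip, size=d)
        w -= lr * (grads.sum(axis=0) + noise) / n
    return w
```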
- On the Complexity of Differentially Private Best-Arm Identification with Fixed Confidence [16.295693624977563]
We study the problem of Best Arm Identification with fixed confidence under $\epsilon$-global Differential Privacy.
Our lower bound suggests the existence of two privacy regimes depending on the privacy budget.
We propose AdaP-TT, an $\epsilon$-global DP variant of the Top Two algorithm.
arXiv Detail & Related papers (2023-09-05T13:07:25Z)
- "Private Prediction Strikes Back!" Private Kernelized Nearest Neighbors with Individual Rényi Filter [31.970442970375153]
We propose an algorithm named Individualized Nearest Neighbor (Ind-KNN) for private prediction.
Ind-KNN is easily updatable over dataset changes and it allows precise control of the Rényi privacy loss at an individual user level.
Our results show that Ind-KNN consistently improves the accuracy over existing private prediction methods for a wide range of $\epsilon$ on four vision and language tasks.
arXiv Detail & Related papers (2023-06-12T19:14:45Z)
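As a point of reference, a generic private-prediction kNN releases only a noisy vote over the query's nearest neighbors. The sketch below shows that baseline recipe; it is not Ind-KNN's mechanism, whose individual Rényi filter tracks a per-example privacy budget, and the noise scale here is an illustrative choice.

```python
import numpy as np

def private_knn_predict(X, y, query, k=10, epsilon=1.0,
                        num_classes=2, rng=None):
    """Generic private kNN prediction (not Ind-KNN's exact mechanism).

    Votes of the k nearest private neighbors are perturbed with Laplace
    noise before taking the argmax, so each prediction reveals little
    about any single training point.
    """
    rng = rng or np.random.default_rng(0)
    dists = np.linalg.norm(X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = np.bincount(y[nearest], minlength=num_classes).astype(float)
    # Scale 2/epsilon conservatively covers a swap changing two bins.
    votes += rng.laplace(0.0, 2.0 / epsilon, size=num_classes)
    return int(np.argmax(votes))
```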
- Differentially Private Image Classification by Learning Priors from Random Processes [48.0766422536737]
In privacy-preserving machine learning, differentially private stochastic gradient descent (DP-SGD) performs worse than SGD due to per-sample gradient clipping and noise addition.
A recent focus in private learning research is improving the performance of DP-SGD on private data by incorporating priors that are learned on real-world public data.
In this work, we explore how we can improve the privacy-utility tradeoff of DP-SGD by learning priors from images generated by random processes and transferring these priors to private data.
arXiv Detail & Related papers (2023-06-08T04:14:32Z)
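One simple instance of "images generated by random processes" is low-pass-filtered white noise, which already carries natural-image-like spatial statistics. The generator below is a hypothetical stand-in for illustration; the paper's actual random processes and its DP-SGD fine-tuning pipeline are more involved.

```python
import numpy as np

def random_process_images(n, size=32, seed=0):
    """Sample smooth random fields by filtering white noise in Fourier
    space, so a feature extractor can be pretrained with no real
    (private or public) data. Illustrative stand-in only."""
    rng = np.random.default_rng(seed)
    freqs = np.fft.fftfreq(size)
    fx, fy = np.meshgrid(freqs, freqs)
    # A 1/f-style spectrum gives natural-image-like correlations.
    spectrum = 1.0 / np.maximum(np.hypot(fx, fy), 1.0 / size)
    imgs = []
    for _ in range(n):
        noise = rng.normal(size=(size, size))
        field = np.fft.ifft2(np.fft.fft2(noise) * spectrum).real
        field = (field - field.min()) / (field.max() - field.min() + 1e-8)
        imgs.append(field)
    return np.stack(imgs)  # shape (n, size, size), values in [0, 1]
```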
- TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state-of-the-art on ImageNet with a +9 point gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z)
- Individual Privacy Accounting for Differentially Private Stochastic Gradient Descent [69.14164921515949]
We characterize privacy guarantees for individual examples when releasing models trained by DP-SGD.
We find that most examples enjoy stronger privacy guarantees than the worst-case bound.
This implies that groups that are underserved in terms of model utility simultaneously experience weaker privacy guarantees.
arXiv Detail & Related papers (2022-06-06T13:49:37Z)
- Do Not Let Privacy Overbill Utility: Gradient Embedding Perturbation for Private Learning [74.73901662374921]
A differentially private model degrades the utility drastically when the model comprises a large number of trainable parameters.
We propose an algorithm, Gradient Embedding Perturbation (GEP), for training differentially private deep models with decent accuracy.
arXiv Detail & Related papers (2021-02-25T04:29:58Z)
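The gist of GEP is to add noise in a low-dimensional gradient embedding rather than in the full parameter space, so the noise magnitude no longer scales with the parameter count. A rough sketch assuming an SVD-based anchor subspace follows; the actual GEP also clips and perturbs the residual component that the subspace misses.

```python
import numpy as np

def gep_step(private_grads, anchor_grads, k=5, clip=1.0,
             noise_mult=1.0, rng=None):
    """Rough sketch of Gradient Embedding Perturbation (GEP).

    Gradients are projected onto the top-k principal directions of the
    auxiliary `anchor_grads`, so noise is added in k dimensions rather
    than in the full parameter dimension d. Assumes k <= min(m, d).
    """
    rng = rng or np.random.default_rng(0)
    # Top-k subspace from the anchor gradients: (d, k) orthonormal basis.
    _, _, vt = np.linalg.svd(anchor_grads, full_matrices=False)
    basis = vt[:k].T
    emb = private_grads @ basis                    # embed: (n, k)
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    emb = emb / np.maximum(1.0, norms / clip)      # clip embeddings
    noisy = emb.sum(axis=0) + rng.normal(0.0, noise_mult * clip, size=k)
    return basis @ noisy / len(private_grads)      # back to d dimensions
```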
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Local Differential Privacy for Regret Minimization in Reinforcement Learning [33.679678503441565]
We study privacy in the context of finite-horizon Markov Decision Processes (MDPs).
We formulate this notion of privacy for RL by leveraging the local differential privacy (LDP) framework.
We present an optimistic algorithm that simultaneously satisfies $\varepsilon$-LDP requirements.
arXiv Detail & Related papers (2020-10-15T14:13:26Z)
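Under LDP, each user randomizes their own trajectory statistics before the learner ever sees them, in contrast to central or joint DP. A minimal sketch of one such local release follows; the released statistic and calibration are illustrative assumptions.

```python
import numpy as np

def ldp_release(episode_rewards, epsilon, reward_bound=1.0, rng=None):
    """Local DP release of one user's episode statistic.

    The user perturbs their own total reward with Laplace noise
    calibrated to its range before sending it to the learner, so the
    learner never observes the raw trajectory data.
    """
    rng = rng or np.random.default_rng()
    total = float(np.sum(episode_rewards))
    sensitivity = reward_bound * len(episode_rewards)  # range of `total`
    return total + rng.laplace(0.0, sensitivity / epsilon)
```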
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
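In optimism-based JDP algorithms of this kind, it is the aggregate visit and reward counters, not the raw trajectories, that get perturbed before optimistic value estimates are computed. The Laplace-noised counter below is a simplified illustration; the actual algorithms use more refined counter mechanisms across the $K$ episodes.

```python
import numpy as np

def noisy_counts(visit_counts, epsilon, rng=None):
    """Centralized noisy release of per-(s, a) visit counts.

    Calibrated noise is added to the aggregate counters before the agent
    computes its optimistic bonuses; this single-shot Laplace version is
    illustrative, not the papers' exact counter mechanism.
    """
    rng = rng or np.random.default_rng(0)
    noise = rng.laplace(0.0, 1.0 / epsilon, size=visit_counts.shape)
    # Keep counts at least 1 so bonuses ~ 1/sqrt(count) stay well-defined.
    return np.maximum(visit_counts + noise, 1.0)
```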
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.