Local Differential Privacy for Regret Minimization in Reinforcement
Learning
- URL: http://arxiv.org/abs/2010.07778v3
- Date: Wed, 27 Oct 2021 12:46:21 GMT
- Title: Local Differential Privacy for Regret Minimization in Reinforcement
Learning
- Authors: Evrard Garcelon, Vianney Perchet, Ciara Pike-Burke, Matteo Pirotta
- Abstract summary: We study privacy in the context of finite-horizon Markov Decision Processes (MDPs)
We formulate this notion of privacy for RL by leveraging the local differential privacy (LDP) framework.
We present an optimistic algorithm that simultaneously satisfies $\varepsilon$-LDP requirements and achieves $\sqrt{K}/\varepsilon$ regret.
- Score: 33.679678503441565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning algorithms are widely used in domains where it is
desirable to provide a personalized service. In these domains it is common that
user data contains sensitive information that needs to be protected from third
parties. Motivated by this, we study privacy in the context of finite-horizon
Markov Decision Processes (MDPs) by requiring information to be obfuscated on
the user side. We formulate this notion of privacy for RL by leveraging the
local differential privacy (LDP) framework. We establish a lower bound for
regret minimization in finite-horizon MDPs with LDP guarantees which shows that
guaranteeing privacy has a multiplicative effect on the regret. This result
shows that while LDP is an appealing notion of privacy, it makes the learning
problem significantly more complex. Finally, we present an optimistic algorithm
that simultaneously satisfies $\varepsilon$-LDP requirements, and achieves
$\sqrt{K}/\varepsilon$ regret in any finite-horizon MDP after $K$ episodes,
matching the lower bound dependency on the number of episodes $K$.
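To make the user-side obfuscation concrete, below is a minimal sketch of an $\varepsilon$-LDP report in Python: each user perturbs their own per-episode statistics with Laplace noise before anything leaves their device. The function name, the even budget split, and the sensitivity bound are illustrative assumptions, not the paper's exact mechanism.

```python
import numpy as np

def ldp_report(visit_counts, rewards, horizon, epsilon, rng=None):
    """Release one user's episode statistics under epsilon-LDP.

    visit_counts: per-(state, action) visit counts for the user's episode.
    rewards:      per-(state, action) cumulative rewards, each step in [0, 1].
    horizon:      episode length H; swapping one user's entire trajectory
                  changes each statistic by at most 2H in L1 norm.
    epsilon:      local privacy budget, split evenly across the two releases.
    """
    rng = rng or np.random.default_rng()
    sensitivity = 2.0 * horizon              # L1 sensitivity of each statistic
    scale = sensitivity / (epsilon / 2.0)    # Laplace scale for eps/2 per release
    noisy_counts = visit_counts + rng.laplace(0.0, scale, size=visit_counts.shape)
    noisy_rewards = rewards + rng.laplace(0.0, scale, size=rewards.shape)
    return noisy_counts, noisy_rewards
```

Since each user reports only once and the noise is zero-mean, the learner's aggregated estimates concentrate as episodes accumulate, consistent with the $\sqrt{K}/\varepsilon$ regret scaling above.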
Related papers
- Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy [55.357715095623554]
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties.
We propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification.
arXiv Detail & Related papers (2024-10-24T03:39:55Z)
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy, which allows for controlling the sensitive regions where differential privacy is applied.
Our method operates selectively on the data and allows for defining non-sensitive spatio-temporal regions without DP application, or for combining differential privacy with other privacy techniques within data samples. A simplified sketch follows this entry.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
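As a rough illustration of the selective mechanism described in the entry above, the sketch below adds calibrated Laplace noise only inside a caller-supplied sensitivity mask and releases the rest of the sample unchanged; the clipping bound and function name are assumptions, not the paper's implementation.

```python
import numpy as np

def masked_dp_release(sample, sensitive_mask, epsilon, clip=1.0, rng=None):
    """Add Laplace noise only where sensitive_mask is True; entries marked
    non-sensitive pass through unchanged. Clipping bounds the per-entry
    sensitivity to 2 * clip, which calibrates the noise scale."""
    rng = rng or np.random.default_rng()
    clipped = np.clip(sample, -clip, clip)
    noise = rng.laplace(0.0, 2.0 * clip / epsilon, size=sample.shape)
    return np.where(sensitive_mask, clipped + noise, sample)
```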
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- Improved Regret for Differentially Private Exploration in Linear MDP [31.567811502343552]
We study privacy-preserving exploration in sequential decision-making for environments that rely on sensitive data such as medical records.
We provide a private algorithm with an improved regret rate with an optimal dependence of $O(\sqrt{K})$ on the number of episodes.
arXiv Detail & Related papers (2022-02-02T21:32:09Z)
- Differentially Private Reinforcement Learning with Linear Function Approximation [3.42658286826597]
We study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP).
Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers.
arXiv Detail & Related papers (2022-01-18T15:25:24Z)
- Differentially Private Regret Minimization in Episodic Markov Decision Processes [6.396288020763144]
We study regret minimization in finite-horizon tabular Markov decision processes (MDPs) under the constraints of differential privacy (DP).
This is motivated by the widespread applications of reinforcement learning (RL) in real-world sequential decision making problems.
arXiv Detail & Related papers (2021-12-20T15:12:23Z)
- Privacy Amplification via Shuffling for Linear Contextual Bandits [51.94904361874446]
We study the contextual linear bandit problem with differential privacy (DP).
We show that it is possible to achieve a privacy/utility trade-off between JDP and LDP by leveraging the shuffle model of privacy while still preserving local privacy; see the sketch after this entry.
arXiv Detail & Related papers (2021-12-11T15:23:28Z)
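A toy sketch of the shuffle model discussed in the entry above: each user applies $\varepsilon$-local randomized response, and a trusted shuffler permutes the reports before the analyst sees them, which is what amplifies the local guarantee into a smaller central one. The single-bit setting is a deliberate simplification of the bandit protocol.

```python
import math
import random

def randomized_response(bit, eps_local):
    """Keep the user's bit with probability e^eps / (1 + e^eps): eps_local-LDP."""
    p_keep = math.exp(eps_local) / (1.0 + math.exp(eps_local))
    return bit if random.random() < p_keep else 1 - bit

def shuffled_reports(user_bits, eps_local):
    """Randomize locally, then shuffle so no report is linkable to a user.
    Shuffling n such reports yields a central guarantee on the order of
    eps_local * sqrt(log(1/delta) / n), far smaller than eps_local alone."""
    reports = [randomized_response(b, eps_local) for b in user_bits]
    random.shuffle(reports)  # the shuffler's only job: break the linkage
    return reports
```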
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
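To give a feel for the query-answering primitive mentioned above, here is a heavily simplified sketch: each of the $K$ (possibly adaptively chosen) queries is averaged within every user, per-user values are clipped to a radius tau, and Gaussian noise proportional to tau is added. The clipping center and the composition constants are illustrative assumptions; a rigorous version would, among other things, estimate the center privately.

```python
import numpy as np

def answer_user_level_queries(user_data, queries, tau, epsilon, delta, rng=None):
    """user_data: list of per-user sample lists; queries: K scalar functions,
    possibly chosen adaptively. Replacing one user moves each clipped
    per-user mean by at most 2 * tau, so every query average moves by at
    most 2 * tau / n; Gaussian noise is scaled for K-fold composition."""
    rng = rng or np.random.default_rng()
    n, K = len(user_data), len(queries)
    sigma = (2.0 * tau / n) * np.sqrt(2.0 * K * np.log(1.25 / delta)) / epsilon
    answers = []
    for q in queries:
        per_user = np.array([np.mean([q(x) for x in u]) for u in user_data])
        center = np.median(per_user)  # simplification: not itself private
        clipped = np.clip(per_user, center - tau, center + tau)
        answers.append(float(clipped.mean() + rng.normal(0.0, sigma)))
    return answers
```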
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
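As a rough illustration of how privacy noise enters an optimism-based algorithm like the one above, the bonus below inflates the usual $1/\sqrt{n}$ exploration term with an extra $O(1/(\varepsilon n))$ correction so that optimism still holds despite the noise injected into the visit counts; the constants and exact form are placeholders, not the paper's.

```python
import numpy as np

def private_optimistic_bonus(noisy_count, horizon, num_episodes, epsilon, c=1.0):
    """Exploration bonus built from a privatized visit count: the first term
    is the standard optimism bonus, the second absorbs, with high
    probability, the error the privacy noise adds to the count itself."""
    n = max(noisy_count, 1.0)
    standard = c * horizon * np.sqrt(np.log(num_episodes + 1) / n)
    privacy_correction = c * horizon * np.log(num_episodes + 1) / (epsilon * n)
    return standard + privacy_correction
```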
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.