Local Differential Privacy for Regret Minimization in Reinforcement
Learning
- URL: http://arxiv.org/abs/2010.07778v3
- Date: Wed, 27 Oct 2021 12:46:21 GMT
- Title: Local Differential Privacy for Regret Minimization in Reinforcement
Learning
- Authors: Evrard Garcelon, Vianney Perchet, Ciara Pike-Burke, Matteo Pirotta
- Abstract summary: We study privacy in the context of finite-horizon Markov Decision Processes (MDPs)
We formulate this notion of privacy for RL by leveraging the local differential privacy (LDP) framework.
We present an optimistic algorithm that simultaneously satisfies $\varepsilon$-LDP requirements and achieves $\sqrt{K}/\varepsilon$ regret.
- Score: 33.679678503441565
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning algorithms are widely used in domains where it is
desirable to provide a personalized service. In these domains it is common that
user data contains sensitive information that needs to be protected from third
parties. Motivated by this, we study privacy in the context of finite-horizon
Markov Decision Processes (MDPs) by requiring information to be obfuscated on
the user side. We formulate this notion of privacy for RL by leveraging the
local differential privacy (LDP) framework. We establish a lower bound for
regret minimization in finite-horizon MDPs with LDP guarantees which shows that
guaranteeing privacy has a multiplicative effect on the regret. This result
shows that while LDP is an appealing notion of privacy, it makes the learning
problem significantly more complex. Finally, we present an optimistic algorithm
that simultaneously satisfies $\varepsilon$-LDP requirements, and achieves
$\sqrt{K}/\varepsilon$ regret in any finite-horizon MDP after $K$ episodes,
matching the lower bound dependency on the number of episodes $K$.
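To make the user-side obfuscation concrete, below is a minimal sketch of an $\varepsilon$-LDP report in Python: each user perturbs their own per-episode statistics with Laplace noise before anything leaves their device. The function name, the even budget split, and the sensitivity bound are illustrative assumptions, not the paper's exact mechanism.

```python
import numpy as np

def ldp_report(visit_counts, rewards, horizon, epsilon, rng=None):
    """Release one user's episode statistics under epsilon-LDP.

    visit_counts: per-(state, action) visit counts for the user's episode.
    rewards:      per-(state, action) cumulative rewards, each step in [0, 1].
    horizon:      episode length H; swapping one user's entire trajectory
                  changes each statistic by at most 2H in L1 norm.
    epsilon:      local privacy budget, split evenly across the two releases.
    """
    rng = rng or np.random.default_rng()
    sensitivity = 2.0 * horizon              # L1 sensitivity of each statistic
    scale = sensitivity / (epsilon / 2.0)    # Laplace scale for eps/2 per release
    noisy_counts = visit_counts + rng.laplace(0.0, scale, size=visit_counts.shape)
    noisy_rewards = rewards + rng.laplace(0.0, scale, size=rewards.shape)
    return noisy_counts, noisy_rewards
```

Since each user reports only once and the noise is zero-mean, the learner's aggregated estimates concentrate as episodes accumulate, consistent with the $\sqrt{K}/\varepsilon$ regret scaling above.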
Related papers
- Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy [55.357715095623554]
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties.
We propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification.
arXiv Detail & Related papers (2024-10-24T03:39:55Z)
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy, which allows for controlling the sensitive regions where differential privacy is applied.
Our method operates selectively on the data and allows for defining non-sensitive spatio-temporal regions without DP application, or for combining differential privacy with other privacy techniques within data samples. A simplified sketch follows this entry.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
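As a rough illustration of the selective mechanism described in the entry above, the sketch below adds calibrated Laplace noise only inside a caller-supplied sensitivity mask and releases the rest of the sample unchanged; the clipping bound and function name are assumptions, not the paper's implementation.

```python
import numpy as np

def masked_dp_release(sample, sensitive_mask, epsilon, clip=1.0, rng=None):
    """Add Laplace noise only where sensitive_mask is True; entries marked
    non-sensitive pass through unchanged. Clipping bounds the per-entry
    sensitivity to 2 * clip, which calibrates the noise scale."""
    rng = rng or np.random.default_rng()
    clipped = np.clip(sample, -clip, clip)
    noise = rng.laplace(0.0, 2.0 * clip / epsilon, size=sample.shape)
    return np.where(sensitive_mask, clipped + noise, sample)
```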
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- Improved Regret for Differentially Private Exploration in Linear MDP [31.567811502343552]
We study privacy-preserving exploration in sequential decision-making for environments that rely on sensitive data such as medical records.
We provide a private algorithm with an improved regret rate with an optimal dependence of $O(\sqrt{K})$ on the number of episodes.
arXiv Detail & Related papers (2022-02-02T21:32:09Z)
- Differentially Private Reinforcement Learning with Linear Function Approximation [3.42658286826597]
We study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP).
Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers.
arXiv Detail & Related papers (2022-01-18T15:25:24Z)
- Differentially Private Regret Minimization in Episodic Markov Decision Processes [6.396288020763144]
We study regret minimization in finite-horizon tabular Markov decision processes (MDPs) under the constraints of differential privacy (DP).
This is motivated by the widespread applications of reinforcement learning (RL) in real-world sequential decision making problems.
arXiv Detail & Related papers (2021-12-20T15:12:23Z)
- Privacy Amplification via Shuffling for Linear Contextual Bandits [51.94904361874446]
We study the contextual linear bandit problem with differential privacy (DP).
We show that it is possible to achieve a privacy/utility trade-off between JDP and LDP by leveraging the shuffle model of privacy while still preserving local privacy; see the sketch after this entry.
arXiv Detail & Related papers (2021-12-11T15:23:28Z)
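A toy sketch of the shuffle model discussed in the entry above: each user applies $\varepsilon$-local randomized response, and a trusted shuffler permutes the reports before the analyst sees them, which is what amplifies the local guarantee into a smaller central one. The single-bit setting is a deliberate simplification of the bandit protocol.

```python
import math
import random

def randomized_response(bit, eps_local):
    """Keep the user's bit with probability e^eps / (1 + e^eps): eps_local-LDP."""
    p_keep = math.exp(eps_local) / (1.0 + math.exp(eps_local))
    return bit if random.random() < p_keep else 1 - bit

def shuffled_reports(user_bits, eps_local):
    """Randomize locally, then shuffle so no report is linkable to a user.
    Shuffling n such reports yields a central guarantee on the order of
    eps_local * sqrt(log(1/delta) / n), far smaller than eps_local alone."""
    reports = [randomized_response(b, eps_local) for b in user_bits]
    random.shuffle(reports)  # the shuffler's only job: break the linkage
    return reports
```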
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
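To give a feel for the query-answering primitive mentioned above, here is a heavily simplified sketch: each of the $K$ (possibly adaptively chosen) queries is averaged within every user, per-user values are clipped to a radius tau, and Gaussian noise proportional to tau is added. The clipping center and the composition constants are illustrative assumptions; a rigorous version would, among other things, estimate the center privately.

```python
import numpy as np

def answer_user_level_queries(user_data, queries, tau, epsilon, delta, rng=None):
    """user_data: list of per-user sample lists; queries: K scalar functions,
    possibly chosen adaptively. Replacing one user moves each clipped
    per-user mean by at most 2 * tau, so every query average moves by at
    most 2 * tau / n; Gaussian noise is scaled for K-fold composition."""
    rng = rng or np.random.default_rng()
    n, K = len(user_data), len(queries)
    sigma = (2.0 * tau / n) * np.sqrt(2.0 * K * np.log(1.25 / delta)) / epsilon
    answers = []
    for q in queries:
        per_user = np.array([np.mean([q(x) for x in u]) for u in user_data])
        center = np.median(per_user)  # simplification: not itself private
        clipped = np.clip(per_user, center - tau, center + tau)
        answers.append(float(clipped.mean() + rng.normal(0.0, sigma)))
    return answers
```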
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
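As a rough illustration of how privacy noise enters an optimism-based algorithm like the one above, the bonus below inflates the usual $1/\sqrt{n}$ exploration term with an extra $O(1/(\varepsilon n))$ correction so that optimism still holds despite the noise injected into the visit counts; the constants and exact form are placeholders, not the paper's.

```python
import numpy as np

def private_optimistic_bonus(noisy_count, horizon, num_episodes, epsilon, c=1.0):
    """Exploration bonus built from a privatized visit count: the first term
    is the standard optimism bonus, the second absorbs, with high
    probability, the error the privacy noise adds to the count itself."""
    n = max(noisy_count, 1.0)
    standard = c * horizon * np.sqrt(np.log(num_episodes + 1) / n)
    privacy_correction = c * horizon * np.log(num_episodes + 1) / (epsilon * n)
    return standard + privacy_correction
```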
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.