Differentially Private Reinforcement Learning with Linear Function
Approximation
- URL: http://arxiv.org/abs/2201.07052v1
- Date: Tue, 18 Jan 2022 15:25:24 GMT
- Title: Differentially Private Reinforcement Learning with Linear Function
Approximation
- Authors: Xingyu Zhou
- Abstract summary: We study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP).
Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers.
- Score: 3.42658286826597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by the wide adoption of reinforcement learning (RL) in real-world
personalized services, where users' sensitive and private information needs to
be protected, we study regret minimization in finite-horizon Markov decision
processes (MDPs) under the constraints of differential privacy (DP). Compared
to existing private RL algorithms that work only on tabular finite-state,
finite-action MDPs, we take the first step towards privacy-preserving learning
in MDPs with large state and action spaces. Specifically, we consider MDPs with
linear function approximation (in particular linear mixture MDPs) under the
notion of joint differential privacy (JDP), where the RL agent is responsible
for protecting users' sensitive data. We design two private RL algorithms that
are based on value iteration and policy optimization, respectively, and show
that they enjoy sub-linear regret performance while guaranteeing privacy
protection. Moreover, the regret bounds are independent of the number of
states, and scale at most logarithmically with the number of actions, making
the algorithms suitable for privacy protection in today's large-scale
personalized services. Our results are achieved via a general procedure for
learning in linear mixture MDPs under changing regularizers, which not only
generalizes previous results for non-private learning, but also serves as a
building block for general private reinforcement learning.
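To make the abstract's core primitive concrete: value-iteration-style algorithms for linear mixture MDPs repeat a ridge-regression step every episode, and a common way to privatize it is to release only noisy sufficient statistics and inflate the regularizer to absorb the injected noise. The sketch below is a minimal, simplified illustration of that idea (Gaussian mechanism on the Gram matrix and feature-target vector), not the paper's actual algorithm; all function and parameter names are hypothetical, and a faithful implementation would need the paper's composition analysis across episodes (e.g., tree-based aggregation).

```python
import numpy as np

def private_ridge_statistics(features, targets, eps, delta, lam=1.0, rng=None):
    """Hypothetical sketch: privatize the ridge-regression sufficient statistics
    used by value-iteration-style algorithms for linear mixture MDPs.

    features: (n, d) array of per-episode feature vectors (assumed bounded in norm).
    targets:  (n,) array of regression targets (assumed bounded).
    """
    rng = np.random.default_rng() if rng is None else rng
    d = features.shape[1]

    # Non-private sufficient statistics of the ridge-regression step.
    gram = features.T @ features          # (d, d) Gram matrix
    xty = features.T @ targets            # (d,) feature-target vector

    # Gaussian-mechanism noise scale (simplified single-release calibration;
    # a full guarantee requires composing over all episodes).
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) / eps

    # Symmetrized Gaussian noise keeps the perturbed Gram matrix symmetric.
    z = rng.normal(scale=sigma, size=(d, d))
    noisy_gram = gram + (z + z.T) / np.sqrt(2.0)
    noisy_xty = xty + rng.normal(scale=sigma, size=d)

    # Inflate the regularizer so the noisy Gram matrix stays positive definite
    # with high probability despite the added noise.
    reg = lam + 4.0 * sigma * np.sqrt(d)
    design = noisy_gram + reg * np.eye(d)
    theta_hat = np.linalg.solve(design, noisy_xty)
    return theta_hat, design  # design matrix also feeds confidence-set/bonus terms
```

One plausible reading of the abstract's "changing regularizers" is exactly this inflation: as privacy noise accumulates over episodes, the regularization grows so that the confidence sets used for optimism remain valid.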
Related papers
- Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy [55.357715095623554]
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties.
We propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification.
arXiv Detail & Related papers (2024-10-24T03:39:55Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
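The point of user-level DP is that the privacy unit is an entire user rather than a single example. Below is a minimal hypothetical sketch of the standard recipe (clip each user's aggregated update, then add noise calibrated to that per-user bound); it is a generic illustration, not the method proposed in the paper, and all names are made up.

```python
import numpy as np

def user_level_dp_average(user_updates, clip_norm, eps, delta, rng=None):
    """Hypothetical sketch of user-level DP aggregation: each row of
    user_updates is ONE USER's aggregated update vector, so clipping (and the
    noise calibrated to it) covers the whole user, not a single example."""
    rng = np.random.default_rng() if rng is None else rng
    updates = np.asarray(user_updates, dtype=float)      # shape (num_users, d)

    # Clip each user's contribution to L2 norm at most clip_norm.
    norms = np.linalg.norm(updates, axis=1, keepdims=True)
    clipped = updates * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))

    # Gaussian mechanism: adding or removing one user changes the sum by at most clip_norm.
    sigma = clip_norm * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    noisy_sum = clipped.sum(axis=0) + rng.normal(scale=sigma, size=updates.shape[1])
    return noisy_sum / len(updates)
```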
- Differentially Private Reinforcement Learning with Self-Play [18.124829682487558]
We study the problem of multi-agent reinforcement learning (multi-agent RL) with differential privacy (DP) constraints.
We first extend the definitions of Joint DP (JDP) and Local DP (LDP) to two-player zero-sum episodic Markov Games.
We design a provably efficient algorithm based on optimistic Nash value and privatization of Bernstein-type bonuses.
arXiv Detail & Related papers (2024-04-11T08:42:51Z)
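The "privatization of Bernstein-type bonuses" mentioned above is, roughly, the idea of computing a variance-aware exploration bonus from noisy rather than raw visitation statistics, with the bonus enlarged to account for the injected noise. The following is a loose, hypothetical sketch of that idea; constants are illustrative, the empirical variance would also need privatization in a real algorithm, and this is not the paper's actual construction.

```python
import numpy as np

def private_bernstein_bonus(count, emp_variance, horizon, eps, failure_prob, rng=None):
    """Hypothetical sketch: a Bernstein-style exploration bonus computed from a
    privatized visit count for one (state, action) pair."""
    rng = np.random.default_rng() if rng is None else rng

    # Privatize the visit count with the Laplace mechanism (sensitivity 1 per user),
    # then shift down by a high-probability bound on the noise so the count is
    # not overestimated.
    noise = rng.laplace(scale=1.0 / eps)
    noise_bound = np.log(1.0 / failure_prob) / eps
    private_count = max(count + noise - noise_bound, 1.0)

    log_term = np.log(1.0 / failure_prob)
    # Bernstein-type bonus: variance-dependent term plus a lower-order range term.
    return (np.sqrt(2.0 * emp_variance * log_term / private_count)
            + 7.0 * horizon * log_term / (3.0 * private_count))
```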
- Provable Privacy with Non-Private Pre-Processing [56.770023668379615]
We propose a general framework to evaluate the additional privacy cost incurred by non-private data-dependent pre-processing algorithms.
Our framework establishes upper bounds on the overall privacy guarantees by utilising two new technical notions.
arXiv Detail & Related papers (2024-03-19T17:54:49Z)
- Differentially Private Deep Model-Based Reinforcement Learning [47.651861502104715]
We introduce PriMORL, a model-based RL algorithm with formal differential privacy guarantees.
PriMORL learns an ensemble of trajectory-level DP models of the environment from offline data.
arXiv Detail & Related papers (2024-02-08T10:05:11Z)
- Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection.
This paper studies, from both practical and theoretical perspectives, how to develop machine learning algorithms that achieve good performance while preserving privacy.
arXiv Detail & Related papers (2022-09-09T08:54:13Z)
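Since this entry concerns gradient-based learning rather than RL, the canonical primitive is example-level DP-SGD: clip each per-example gradient, then add Gaussian noise calibrated to the clipping bound before the update (contrast with the per-user clipping sketched earlier in this list). Below is a generic one-step sketch under made-up names, not the paper's specific low-noise analysis.

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, lr, clip_norm, noise_multiplier, rng=None):
    """Hypothetical sketch of one example-level DP-SGD step (generic recipe)."""
    rng = np.random.default_rng() if rng is None else rng

    # Clip each per-example gradient to L2 norm at most clip_norm.
    clipped = []
    for g in per_example_grads:                      # each g has shape (d,)
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    clipped = np.stack(clipped)

    # Gaussian noise calibrated to the per-example clipping bound.
    sigma = noise_multiplier * clip_norm
    noisy_sum = clipped.sum(axis=0) + rng.normal(scale=sigma, size=params.shape)
    return params - lr * noisy_sum / len(per_example_grads)
```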
- Differentially Private Regret Minimization in Episodic Markov Decision Processes [6.396288020763144]
We study regret minimization in finite-horizon tabular Markov decision processes (MDPs) under the constraints of differential privacy (DP).
This is motivated by the widespread applications of reinforcement learning (RL) in real-world sequential decision-making problems.
arXiv Detail & Related papers (2021-12-20T15:12:23Z)
- Local Differential Privacy for Regret Minimization in Reinforcement Learning [33.679678503441565]
We study privacy in the context of finite-horizon Markov Decision Processes (MDPs).
We formulate this notion of privacy for RL by leveraging the local differential privacy (LDP) framework.
We present an optimistic algorithm that simultaneously satisfies $\varepsilon$-LDP requirements.
arXiv Detail & Related papers (2020-10-15T14:13:26Z)
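Under LDP it is the user's device, not the agent, that perturbs the data before it is sent. A minimal hypothetical sketch of such a local randomizer (Laplace noise on the per-episode statistics a user would report) is shown below; the paper's actual algorithm and noise calibration are more refined, and all names here are illustrative.

```python
import numpy as np

def ldp_episode_report(visit_counts, reward_sums, horizon, eps, rng=None):
    """Hypothetical sketch of a local randomizer for RL statistics: the USER
    perturbs per-(state, action) visit counts and accumulated rewards before
    sending them, so the agent never observes a raw trajectory."""
    rng = np.random.default_rng() if rng is None else rng

    # Under LDP any two trajectories are neighbours, so the L1 sensitivity of
    # each released statistic is of order the horizon H (rewards assumed in
    # [0, 1]); split eps over the two releases and calibrate Laplace noise.
    scale = 2.0 * horizon / (eps / 2.0)
    noisy_counts = visit_counts + rng.laplace(scale=scale, size=visit_counts.shape)
    noisy_rewards = reward_sums + rng.laplace(scale=scale, size=reward_sums.shape)
    return noisy_counts, noisy_rewards
```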
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
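For reference, the joint differential privacy notion used in this entry, and in the main paper above, can be read as: changing one user's data must not noticeably change the outputs shown to all other users. A standard formal statement follows (the exact parameterization differs slightly across papers).

```latex
% M maps the data of T users to one output per user (e.g., the actions it
% recommends to that user); M(U)_{-t} denotes the outputs to everyone except
% user t.  M is \varepsilon-JDP if, for every user t, every pair of user
% sequences U, U' differing only in user t's data, and every event E,
\Pr\!\bigl[\mathcal{M}(U)_{-t} \in E\bigr]
  \;\le\; e^{\varepsilon}\,\Pr\!\bigl[\mathcal{M}(U')_{-t} \in E\bigr].
```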
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.