Differentially Private Reinforcement Learning with Linear Function Approximation
- URL: http://arxiv.org/abs/2201.07052v1
- Date: Tue, 18 Jan 2022 15:25:24 GMT
- Title: Differentially Private Reinforcement Learning with Linear Function Approximation
- Authors: Xingyu Zhou
- Abstract summary: We study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP).
Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers.
- Score: 3.42658286826597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Motivated by the wide adoption of reinforcement learning (RL) in real-world
personalized services, where users' sensitive and private information needs to
be protected, we study regret minimization in finite-horizon Markov decision
processes (MDPs) under the constraints of differential privacy (DP). Compared
to existing private RL algorithms that work only on tabular finite-state,
finite-action MDPs, we take the first step towards privacy-preserving learning
in MDPs with large state and action spaces. Specifically, we consider MDPs with
linear function approximation (in particular linear mixture MDPs) under the
notion of joint differential privacy (JDP), where the RL agent is responsible
for protecting users' sensitive data. We design two private RL algorithms that
are based on value iteration and policy optimization, respectively, and show
that they enjoy sub-linear regret performance while guaranteeing privacy
protection. Moreover, the regret bounds are independent of the number of
states, and scale at most logarithmically with the number of actions, making
the algorithms suitable for privacy protection in today's large-scale
personalized services. Our results are achieved via a general procedure for
learning in linear mixture MDPs under changing regularizers, which not only
generalizes previous results for non-private learning, but also serves as a
building block for general private reinforcement learning.
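A minimal sketch of the kind of update the abstract describes: value-iteration-style learning in a linear mixture MDP from noisy regularized least-squares statistics, where the ridge regularizer is inflated (the "changing regularizers" idea) so the privatized Gram matrix stays well conditioned. The class name, noise scales, and inflation rule below are illustrative assumptions, not the paper's exact calibration for its JDP guarantee.

```python
import numpy as np

class PrivateLinearMixtureEstimator:
    """Hypothetical sketch: ridge regression on privatized sufficient statistics."""

    def __init__(self, dim, lam=1.0, noise_scale=1.0, seed=0):
        self.dim = dim
        self.lam = lam                      # base ridge regularizer
        self.noise_scale = noise_scale      # std of Gaussian-mechanism noise (assumed)
        self.rng = np.random.default_rng(seed)
        self.gram = np.zeros((dim, dim))    # running sum of phi phi^T
        self.target = np.zeros(dim)         # running sum of phi * regression target

    def update(self, phi, y):
        """Accumulate one (feature, target) pair from a user's episode."""
        self.gram += np.outer(phi, phi)
        self.target += phi * y

    def private_estimate(self):
        """Release a model estimate computed only from noise-perturbed statistics."""
        noise_mat = self.rng.normal(scale=self.noise_scale, size=(self.dim, self.dim))
        noise_mat = (noise_mat + noise_mat.T) / np.sqrt(2)   # symmetric perturbation
        noise_vec = self.rng.normal(scale=self.noise_scale, size=self.dim)

        # Inflate the regularizer so the noisy Gram matrix remains positive
        # definite (illustrative choice; the paper derives the needed inflation).
        extra_reg = 2.0 * self.noise_scale * np.sqrt(self.dim)
        gram_priv = self.gram + noise_mat + (self.lam + extra_reg) * np.eye(self.dim)
        target_priv = self.target + noise_vec
        return np.linalg.solve(gram_priv, target_priv)

# Usage: accumulate per-step statistics, then release one private estimate
# per batch of episodes for the optimistic value-iteration step.
est = PrivateLinearMixtureEstimator(dim=4)
true_theta = np.array([0.5, -0.2, 0.1, 0.3])
for _ in range(200):
    phi = np.random.rand(4)
    est.update(phi, y=phi @ true_theta)
theta_hat = est.private_estimate()
```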
Related papers
- Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications.
Current methods, such as those based on differentially private stochastic gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility.
We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
- Differentially Private Policy Gradient [48.748194765816955]
We show that it is possible to find the right trade-off between privacy noise and trust-region size to obtain a performant differentially private policy gradient algorithm.
Our results and the complexity of the tasks addressed represent a significant improvement over existing DP algorithms in online RL.
arXiv Detail & Related papers (2025-01-31T12:11:13Z)
- Enhancing Feature-Specific Data Protection via Bayesian Coordinate Differential Privacy [55.357715095623554]
Local Differential Privacy (LDP) offers strong privacy guarantees without requiring users to trust external parties.
We propose a Bayesian framework, Bayesian Coordinate Differential Privacy (BCDP), that enables feature-specific privacy quantification.
arXiv Detail & Related papers (2024-10-24T03:39:55Z)
- Mind the Privacy Unit! User-Level Differential Privacy for Language Model Fine-Tuning [62.224804688233]
Differential privacy (DP) offers a promising solution by ensuring models are 'almost indistinguishable' with or without any particular privacy unit.
We study user-level DP, motivated by applications where it is necessary to ensure uniform privacy protection across users.
arXiv Detail & Related papers (2024-06-20T13:54:32Z)
- Differentially Private Reinforcement Learning with Self-Play [18.124829682487558]
We study the problem of multi-agent reinforcement learning (multi-agent RL) with differential privacy (DP) constraints.
We first extend the definitions of Joint DP (JDP) and Local DP (LDP) to two-player zero-sum episodic Markov Games.
We design a provably efficient algorithm based on optimistic Nash value and privatization of Bernstein-type bonuses.
arXiv Detail & Related papers (2024-04-11T08:42:51Z)
- Differentially Private Deep Model-Based Reinforcement Learning [47.651861502104715]
We introduce PriMORL, a model-based RL algorithm with formal differential privacy guarantees.
PriMORL learns an ensemble of trajectory-level DP models of the environment from offline data.
arXiv Detail & Related papers (2024-02-08T10:05:11Z)
- Differentially Private Regret Minimization in Episodic Markov Decision Processes [6.396288020763144]
We study regret in finite-horizon tabular Markov decision processes (MDPs) under the constraints of differential privacy (DP).
This is motivated by the widespread applications of reinforcement learning (RL) in real-world sequential decision making problems.
arXiv Detail & Related papers (2021-12-20T15:12:23Z)
- Local Differential Privacy for Regret Minimization in Reinforcement Learning [33.679678503441565]
We study privacy in the context of finite-horizon Markov decision processes (MDPs).
We formulate this notion of privacy for RL by leveraging the local differential privacy (LDP) framework.
We present an optimistic algorithm that simultaneously satisfies $\varepsilon$-LDP requirements.
arXiv Detail & Related papers (2020-10-15T14:13:26Z)
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.