Offline Reinforcement Learning with Differential Privacy
- URL: http://arxiv.org/abs/2206.00810v1
- Date: Thu, 2 Jun 2022 00:45:04 GMT
- Title: Offline Reinforcement Learning with Differential Privacy
- Authors: Dan Qiao, Yu-Xiang Wang
- Abstract summary: The offline reinforcement learning problem is often motivated by the need to learn data-driven decision policies in financial, legal and healthcare applications.
We design offline RL algorithms with differential privacy guarantees which provably prevent such risks.
- Score: 16.871660060209674
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The offline reinforcement learning (RL) problem is often motivated by the
need to learn data-driven decision policies in financial, legal and healthcare
applications. However, the learned policy could retain sensitive information of
individuals in the training data (e.g., treatment and outcome of patients),
thus susceptible to various privacy risks. We design offline RL algorithms with
differential privacy guarantees which provably prevent such risks. These
algorithms also enjoy strong instance-dependent learning bounds under both
tabular and linear Markov decision process (MDP) settings. Our theory and
simulation suggest that the privacy guarantee comes at (almost) no drop in
utility compared to the non-private counterpart for a medium-size dataset.
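As a minimal illustration of the kind of privatization such algorithms rely on (not the paper's exact method), one standard approach in the tabular setting is to release state-action visitation counts through the Gaussian mechanism before estimating the empirical MDP. The function name, parameters, and calibration constant below are assumptions chosen for the sketch:

```python
import numpy as np

def privatize_counts(counts, epsilon, delta, sensitivity=1.0, seed=None):
    """Gaussian mechanism sketch: add N(0, sigma^2) noise to each count.

    sigma follows the classical Gaussian-mechanism calibration
    (valid for epsilon < 1). Illustrative only, not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noisy = counts + rng.normal(0.0, sigma, size=counts.shape)
    # Clip at a small positive floor so downstream estimates remain valid.
    return np.maximum(noisy, 1e-6)

# Example: a privatized transition estimate for one (s, a) pair,
# from (hypothetical) visit counts to three next states.
raw = np.array([120.0, 60.0, 20.0])
noisy = privatize_counts(raw, epsilon=0.5, delta=1e-5, seed=0)
p_hat = noisy / noisy.sum()  # privatized empirical transition probabilities
```

Because the noise scale is fixed in advance and independent of the data, all quantities derived from the noisy counts (transition estimates, value iterates, the final policy) inherit the same (epsilon, delta)-DP guarantee by post-processing.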
Related papers
- Preserving Expert-Level Privacy in Offline Reinforcement Learning [35.486119057117996]
We propose a consensus-based expert-level differentially private offline RL training approach compatible with any existing offline RL algorithm.
We prove rigorous differential privacy guarantees, while maintaining strong empirical performance.
arXiv Detail & Related papers (2024-11-18T21:26:53Z)
- Masked Differential Privacy [64.32494202656801]
We propose an effective approach called masked differential privacy (DP), which allows for controlling sensitive regions where differential privacy is applied.
Our method operates selectively on data and allows for defining non-sensitive-temporal regions without DP application or combining differential privacy with other privacy techniques within data samples.
arXiv Detail & Related papers (2024-10-22T15:22:53Z)
- Differentially Private Deep Model-Based Reinforcement Learning [47.651861502104715]
We introduce PriMORL, a model-based RL algorithm with formal differential privacy guarantees.
PriMORL learns an ensemble of trajectory-level DP models of the environment from offline data.
arXiv Detail & Related papers (2024-02-08T10:05:11Z)
- A Unified View of Differentially Private Deep Generative Modeling [60.72161965018005]
Data with privacy concerns is subject to stringent regulations that frequently prohibit data access and data sharing.
Overcoming these obstacles is key for technological progress in many real-world application scenarios that involve privacy sensitive data.
Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released.
arXiv Detail & Related papers (2023-09-27T14:38:16Z)
- Locally Differentially Private Distributed Online Learning with Guaranteed Optimality [1.800614371653704]
This paper proposes an approach that ensures both differential privacy and learning accuracy in distributed online learning.
While ensuring a diminishing expected instantaneous regret, the approach can simultaneously ensure a finite cumulative privacy budget.
To the best of our knowledge, this is the first algorithm that successfully ensures both rigorous local differential privacy and learning accuracy.
arXiv Detail & Related papers (2023-06-25T02:05:34Z)
- Differentially Private Reinforcement Learning with Linear Function Approximation [3.42658286826597]
We study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP).
Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers.
arXiv Detail & Related papers (2022-01-18T15:25:24Z)
- Distributed Machine Learning and the Semblance of Trust [66.1227776348216]
Federated Learning (FL) allows data owners to maintain data governance and perform model training locally without having to share their data.
FL and related techniques are often described as privacy-preserving.
We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind.
arXiv Detail & Related papers (2021-12-21T08:44:05Z)
- On Deep Learning with Label Differential Privacy [54.45348348861426]
We study the multi-class classification setting where the labels are considered sensitive and ought to be protected.
We propose a new algorithm for training deep neural networks with label differential privacy, and run evaluations on several datasets.
arXiv Detail & Related papers (2021-02-11T15:09:06Z)
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.