Adaptive Control of Differentially Private Linear Quadratic Systems
- URL: http://arxiv.org/abs/2108.11563v1
- Date: Thu, 26 Aug 2021 03:06:22 GMT
- Title: Adaptive Control of Differentially Private Linear Quadratic Systems
- Authors: Sayak Ray Chowdhury, Xingyu Zhou and Ness Shroff
- Abstract summary: We study the problem of regret minimization in reinforcement learning (RL) under differential privacy constraints.
We develop the first private RL algorithm, PRL, which attains sub-linear regret while guaranteeing privacy protection.
- Score: 5.414308305392762
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we study the problem of regret minimization in reinforcement
learning (RL) under differential privacy constraints. This work is motivated by
the wide range of RL applications for providing personalized service, where
privacy concerns are becoming paramount. In contrast to previous works, we take
the first step towards non-tabular RL settings, while providing a rigorous
privacy guarantee. In particular, we consider the adaptive control of
differentially private linear quadratic (LQ) systems. We develop the first
private RL algorithm, PRL, which is able to attain a sub-linear regret while
guaranteeing privacy protection. More importantly, the additional cost due to
privacy is only on the order of $\frac{\ln(1/\delta)^{1/4}}{\epsilon^{1/2}}$
given privacy parameters $\epsilon, \delta > 0$. Through this process, we also
provide a general procedure for adaptive control of LQ systems under changing
regularizers, which not only generalizes previous non-private controls, but
also serves as the basis for general private controls.
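The paper's stated privacy cost scales with $\epsilon$ and $\delta$, which is characteristic of Gaussian-mechanism noise calibration. The sketch below is not the paper's PRL algorithm; it is a minimal, generic illustration of how a sufficient statistic of an LQ system (here a Gram matrix of state-action vectors, a hypothetical example) might be released under $(\epsilon, \delta)$-differential privacy using the classical Gaussian mechanism.

```python
import numpy as np

def gaussian_mechanism(stat, sensitivity, epsilon, delta, rng=None):
    """Add Gaussian noise calibrated by the classical analytic bound so
    that releasing `stat` is (epsilon, delta)-differentially private."""
    rng = rng or np.random.default_rng()
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return stat + rng.normal(0.0, sigma, size=np.shape(stat))

# Illustrative only: privatize the Gram matrix of observed state-action
# vectors, a typical sufficient statistic in least-squares estimation
# of linear dynamics.
Z = np.random.default_rng(0).normal(size=(100, 4))  # rows: state-action vectors
gram = Z.T @ Z
private_gram = gaussian_mechanism(gram, sensitivity=1.0, epsilon=1.0, delta=1e-5)
```

In an adaptive-control loop, the learner would fit its model to such privatized statistics; the extra estimation error introduced by the noise is what shows up as the additional regret term in the bound above.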
Related papers
- Differentially Private Model-Based Offline Reinforcement Learning [51.1231068185106]
We introduce DP-MORL, an algorithm with differential privacy guarantees.
A private model of the environment is first learned from offline data.
We then use model-based policy optimization to derive a policy from the private model.
arXiv Detail & Related papers (2024-02-08T10:05:11Z) - Differentially Private High Dimensional Bandits [1.3597551064547502]
We present PrivateLASSO, a differentially private LASSO bandit algorithm.
PrivateLASSO is based on two sub-routines: (i) a sparse hard-thresholding-based privacy mechanism and (ii) an episodic thresholding rule for identifying the support of the parameter $\theta$.
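A noisy hard-thresholding step of this general flavor can be sketched as follows. This is an illustrative assumption, not the PrivateLASSO sub-routine itself: Laplace noise is added to an estimate, and coordinates whose noisy magnitude falls below a threshold are zeroed out, giving a private estimate of the support of $\theta$.

```python
import numpy as np

def noisy_hard_threshold(theta_hat, tau, epsilon, sensitivity=1.0, rng=None):
    """Add Laplace noise with scale (sensitivity / epsilon), then zero out
    coordinates whose noisy magnitude is below tau, privately estimating
    the support of theta."""
    rng = rng or np.random.default_rng()
    noisy = theta_hat + rng.laplace(0.0, sensitivity / epsilon, size=theta_hat.shape)
    keep = np.abs(noisy) >= tau
    return np.where(keep, noisy, 0.0), keep

# Hypothetical usage: large coordinates survive, near-zero ones are dropped.
rng = np.random.default_rng(0)
theta_hat = np.array([2.0, 0.01, -1.5, 0.0])
est, support = noisy_hard_threshold(theta_hat, tau=0.5, epsilon=5.0, rng=rng)
```

The threshold `tau` and privacy budget `epsilon` here are illustrative values; in a bandit algorithm they would be set from the sparsity level and the per-episode privacy accounting.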
arXiv Detail & Related papers (2024-02-06T06:10:46Z) - Private Fine-tuning of Large Language Models with Zeroth-order Optimization [54.24600476755372]
We introduce DP-ZO, a new method for fine-tuning large language models that preserves the privacy of training data by privatizing zeroth-order optimization.
We show that DP-ZO exhibits just $1.86\%$ performance degradation due to privacy at $(1, 10^{-5})$-DP when fine-tuning OPT-66B on 1000 training samples from SQuAD.
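The idea behind privatized zeroth-order optimization can be illustrated with a small sketch. This is an assumption-laden toy, not the DP-ZO method itself: in an SPSA-style two-point update, only a scalar loss difference is needed per step, so that scalar is clipped (to bound sensitivity) and perturbed with Gaussian noise.

```python
import numpy as np

def dp_zo_step(params, loss_fn, lr, mu, sigma, clip, rng):
    """One zeroth-order (SPSA-style) update in which only the scalar
    loss difference is clipped and noised -- the quantity a DP variant
    of zeroth-order optimization privatizes."""
    z = rng.normal(size=params.shape)                              # random direction
    diff = (loss_fn(params + mu * z) - loss_fn(params - mu * z)) / (2 * mu)
    diff = np.clip(diff, -clip, clip)                              # bound sensitivity
    diff += rng.normal(0.0, sigma)                                 # Gaussian noise on the scalar
    return params - lr * diff * z

# Toy usage on a quadratic loss; all hyperparameters are illustrative.
rng = np.random.default_rng(0)
p = np.array([1.0, -1.0])
loss = lambda x: float(x @ x)
p_next = dp_zo_step(p, loss, lr=0.05, mu=0.01, sigma=0.1, clip=5.0, rng=rng)
```

Because only one noisy scalar per step needs privatizing (rather than a full gradient vector), the privacy noise does not grow with model dimension, which is what makes this style of method attractive for very large models.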
arXiv Detail & Related papers (2024-01-09T03:53:59Z) - adaPARL: Adaptive Privacy-Aware Reinforcement Learning for Sequential-Decision Making Human-in-the-Loop Systems [0.5414308305392761]
Reinforcement learning (RL) presents numerous benefits compared to rule-based approaches in various applications.
We propose adaPARL, an adaptive approach for privacy-aware RL, especially for human-in-the-loop IoT systems.
AdaPARL provides a personalized privacy-utility trade-off depending on human behavior and preference.
arXiv Detail & Related papers (2023-03-07T21:55:22Z) - On Differentially Private Online Predictions [74.01773626153098]
We introduce an interactive variant of joint differential privacy towards handling online processes.
We demonstrate that it satisfies suitable variants of group privacy, composition, and post-processing.
We then study the cost of interactive joint privacy in the basic setting of online classification.
arXiv Detail & Related papers (2023-02-27T19:18:01Z) - Algorithms with More Granular Differential Privacy Guarantees [65.3684804101664]
We consider partial differential privacy (DP), which allows quantifying the privacy guarantee on a per-attribute basis.
In this work, we study several basic data analysis and learning tasks, and design algorithms whose per-attribute privacy parameter is smaller than the best possible privacy parameter for the entire record of a person.
arXiv Detail & Related papers (2022-09-08T22:43:50Z) - On Differential Privacy for Federated Learning in Wireless Systems with Multiple Base Stations [90.53293906751747]
We consider a federated learning model in a wireless system with multiple base stations and inter-cell interference.
We show the convergence behavior of the learning process by deriving an upper bound on its optimality gap.
Our proposed scheduler improves the average accuracy of the predictions compared with a random scheduler.
arXiv Detail & Related papers (2022-08-25T03:37:11Z) - Improved Regret for Differentially Private Exploration in Linear MDP [31.567811502343552]
We study privacy-preserving exploration in sequential decision-making for environments that rely on sensitive data such as medical records.
We provide a private algorithm with an improved regret rate with an optimal dependence of $O(\sqrt{K})$ on the number of episodes.
arXiv Detail & Related papers (2022-02-02T21:32:09Z) - Differentially Private Reinforcement Learning with Linear Function Approximation [3.42658286826597]
We study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP).
Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers.
arXiv Detail & Related papers (2022-01-18T15:25:24Z) - Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL).
We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.