No-regret Exploration in Shuffle Private Reinforcement Learning
- URL: http://arxiv.org/abs/2411.11647v1
- Date: Mon, 18 Nov 2024 15:24:11 GMT
- Title: No-regret Exploration in Shuffle Private Reinforcement Learning
- Authors: Shaojie Bai, Mohammad Sadegh Talebi, Chengcheng Zhao, Peng Cheng, Jiming Chen
- Abstract summary: Differential privacy (DP) has been introduced into episodic reinforcement learning (RL) to address user privacy concerns in personalized services.
We present the first generic algorithm for episodic RL under the shuffle model, where a trusted shuffler randomly permutes a batch of users' data before sending it to the central agent.
Our analysis shows that the algorithm achieves a near-optimal regret bound comparable to that of the centralized model and significantly outperforms the local model in terms of privacy cost.
- Score: 18.142491344065046
- Abstract: Differential privacy (DP) has recently been introduced into episodic reinforcement learning (RL) to formally address user privacy concerns in personalized services. Previous work mainly focuses on two trust models of DP: the central model, where a central agent is responsible for protecting users' sensitive data, and the (stronger) local model, where the protection occurs directly on the user side. However, they either require a trusted central agent or incur a significantly higher privacy cost, making them unsuitable for many scenarios. This work introduces a trust model stronger than the central model but with a lower privacy cost than the local model, leveraging the emerging \emph{shuffle} model of privacy. We present the first generic algorithm for episodic RL under the shuffle model, where a trusted shuffler randomly permutes a batch of users' data before sending it to the central agent. We then instantiate the algorithm using our proposed shuffle Privatizer, relying on a shuffle private binary summation mechanism. Our analysis shows that the algorithm achieves a near-optimal regret bound comparable to that of the centralized model and significantly outperforms the local model in terms of privacy cost.
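To make the pipeline concrete, here is a minimal Python sketch of shuffle-model binary summation: each user randomizes a private bit via randomized response, a trusted shuffler permutes the batch, and the central agent debiases the sum. This is a generic illustration of the mechanism class the paper builds on, not the paper's actual Privatizer; the flip probability `p` is a placeholder that a real deployment would calibrate to the target (epsilon, delta) using shuffle-amplification bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomize_bit(x, p):
    """Local randomizer: flip the private bit with probability p."""
    return x ^ int(rng.random() < p)

def analyze_shuffled_sum(reports, p):
    """Central agent's side. The shuffler has already permuted the batch
    (the permutation buys the privacy amplification; it does not change
    the sum itself). Debias using E[sum] = (1 - 2p) * true_sum + p * n."""
    n = len(reports)
    return (reports.sum() - p * n) / (1 - 2 * p)

p = 0.25
true_bits = rng.integers(0, 2, size=1000)
reports = np.array([randomize_bit(int(b), p) for b in true_bits])
reports = rng.permutation(reports)  # the trusted shuffler's only job
print("true sum:", true_bits.sum(),
      "estimate:", round(analyze_shuffled_sum(reports, p), 1))
```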
Related papers
- Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines.
We show that our method preserves privacy and derive a generalization error bound for the method.
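The summary does not spell out the mechanism, so the following Python sketch shows one generic way to privatize a random feature model: random Fourier features plus output perturbation on the ridge-regression weights. The function names and the `noise_scale` parameter are hypothetical placeholders; the paper's actual noise calibration and privacy analysis may differ and should be taken from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def rff_features(X, W, b):
    """Random Fourier features approximating an RBF kernel."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def dp_random_feature_regression(X, y, num_features=200, lam=1.0, noise_scale=0.1):
    """Ridge regression on random features, privatized by output
    perturbation (Gaussian noise on the learned weights). noise_scale
    is a stand-in; a real mechanism calibrates it to the weights'
    sensitivity and the target (epsilon, delta)."""
    d = X.shape[1]
    W = rng.normal(size=(d, num_features))
    b = rng.uniform(0, 2 * np.pi, size=num_features)
    Z = rff_features(X, W, b)
    theta = np.linalg.solve(Z.T @ Z + lam * np.eye(num_features), Z.T @ y)
    theta_private = theta + rng.normal(scale=noise_scale, size=theta.shape)
    return lambda Xq: rff_features(Xq, W, b) @ theta_private

X = rng.normal(size=(500, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=500)
model = dp_random_feature_regression(X, y)
print("predictions on first two rows:", model(X[:2]))
```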
arXiv Detail & Related papers (2024-12-06T05:31:08Z)
- Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves over 20% improvements in forgetting error compared to the state-of-the-art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z)
- Enhanced Privacy Bound for Shuffle Model with Personalized Privacy [32.08637708405314]
The shuffle model of Differential Privacy (DP) is an enhanced privacy protocol which introduces an intermediate trusted server between local users and a central data curator.
It significantly amplifies the central DP guarantee by anonymizing and shuffling the locally randomized data.
This work focuses on deriving the central privacy bound for a more practical setting where personalized local privacy is required by each user.
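For context, the uniform (non-personalized) version of the amplification-by-shuffling phenomenon that this line of work generalizes is often quoted in roughly the following form (a standard statement, e.g. from Feldman et al.; constants and regimes of validity vary by analysis, and it is not the bound derived in this paper): $\varepsilon_{\text{central}} = O\big(\varepsilon_0 \sqrt{\log(1/\delta)/n}\big)$ for $n$ users, each running an $\varepsilon_0$-LDP randomizer with $\varepsilon_0 \le 1$. The personalized setting replaces the single $\varepsilon_0$ with per-user budgets $\varepsilon_i$, which is what makes the central bound harder to derive.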
arXiv Detail & Related papers (2024-07-25T16:11:56Z)
- Differentially Private Aggregation via Imperfect Shuffling [64.19885806149958]
We introduce the imperfect shuffle differential privacy model, where messages are shuffled in an almost uniform manner before being observed by a curator for private aggregation.
We show that, surprisingly, no additional error overhead is necessary in the imperfect shuffle model.
arXiv Detail & Related papers (2023-08-28T17:34:52Z)
- Echo of Neighbors: Privacy Amplification for Personalized Private Federated Learning with Shuffle Model [21.077469463027306]
Federated Learning, as a popular paradigm for collaborative training, is vulnerable to privacy attacks.
This work strengthens model privacy under personalized local privacy by leveraging the privacy amplification effect of the shuffle model.
To the best of our knowledge, this is the first time the impact of shuffling on personalized local privacy has been considered.
arXiv Detail & Related papers (2023-04-11T21:48:42Z)
- Tight Differential Privacy Guarantees for the Shuffle Model with $k$-Randomized Response [6.260747047974035]
Most differentially private (DP) algorithms assume that a third party inserts noise into queries made on datasets, or a local model where users perturb their data locally.
The recently proposed shuffle model is an intermediate framework between the central and the local paradigms.
We perform experiments on both synthetic and real data to compare the privacy-utility trade-off of the shuffle model with that of the central model.
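As a concrete reference point for the $k$-randomized-response local randomizer named in the title, here is a minimal Python sketch of a textbook $k$-RR (not this paper's analysis; the shuffler's amplification of each report's per-user $\varepsilon$ guarantee is exactly what the paper's tight bounds quantify):

```python
import numpy as np

rng = np.random.default_rng(2)

def k_randomized_response(x, k, eps):
    """k-RR over {0, ..., k-1}: report the true value with probability
    e^eps / (e^eps + k - 1); otherwise report a uniform *other* value.
    Each individual report satisfies eps-LDP."""
    if rng.random() < np.exp(eps) / (np.exp(eps) + k - 1):
        return x
    other = int(rng.integers(0, k - 1))
    return other if other < x else other + 1  # skip x itself

# With k=10 and eps=1, the true value survives with probability ~0.23
print([k_randomized_response(3, k=10, eps=1.0) for _ in range(8)])
```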
arXiv Detail & Related papers (2022-05-18T10:44:28Z)
- Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
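For readers unfamiliar with the algorithm, here is a minimal numpy sketch of the standard DP-SGD update (per-example clipping plus Gaussian noise, as in Abadi et al.); the toy gradients and hyperparameters below are illustrative, not this paper's training setup:

```python
import numpy as np

rng = np.random.default_rng(3)

def dp_sgd_step(params, per_example_grads, clip_norm=1.0, noise_mult=1.1, lr=0.1):
    """One DP-SGD step: clip each example's gradient to bound its
    influence, sum, add Gaussian noise scaled to the clip norm,
    then take an averaged gradient step."""
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        scale=noise_mult * clip_norm, size=params.shape)
    return params - lr * noisy_sum / len(per_example_grads)

params = np.zeros(2)
toy_grads = [rng.normal(size=2) for _ in range(4)]  # stand-in gradients
print(dp_sgd_step(params, toy_grads))
```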
arXiv Detail & Related papers (2022-05-06T01:22:20Z)
- Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve selective differential privacy (SDP) for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z)
- Network Shuffling: Privacy Amplification via Random Walks [21.685747588753514]
We introduce network shuffling, a decentralized mechanism where users exchange data in a random-walk fashion on a network/graph.
We show that the privacy amplification rate is similar to other privacy amplification techniques such as uniform shuffling.
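A toy Python sketch of the random-walk exchange idea (the graph, step count, and collection rule below are hypothetical simplifications, not the paper's protocol): each message hops to a uniformly random neighbor for several rounds, so its final position is nearly independent of its origin, mimicking a shuffle.

```python
import numpy as np

rng = np.random.default_rng(4)

def network_shuffle(messages, adjacency, steps=50):
    """Each message performs an independent lazy random walk; after
    enough steps its location is close to stationary, decoupling
    messages from the users who produced them."""
    n = len(messages)
    location = np.arange(n)  # message i starts at its owner, node i
    for _ in range(steps):
        for m in range(n):
            neighbors = np.flatnonzero(adjacency[location[m]])
            location[m] = rng.choice(neighbors)
    # Curator reads messages back ordered by final node, not by sender
    return [messages[m] for m in np.argsort(location, kind="stable")]

# Ring graph over 6 users, with self-loops so the lazy walk mixes
A = np.zeros((6, 6), dtype=int)
for i in range(6):
    A[i, i] = A[i, (i - 1) % 6] = A[i, (i + 1) % 6] = 1
print(network_shuffle(list("abcdef"), A))
```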
arXiv Detail & Related papers (2022-04-08T08:36:06Z)
- Shuffle Private Linear Contextual Bandits [9.51828574518325]
We propose a general framework for linear contextual bandits under the shuffle algorithmic trust model.
We prove that both instantiations lead to regret guarantees that significantly improve on those of the local model.
We also verify this regret behavior with simulations on synthetic data.
arXiv Detail & Related papers (2022-02-11T11:53:22Z)
- Privacy Amplification via Shuffling for Linear Contextual Bandits [51.94904361874446]
We study the contextual linear bandit problem with differential privacy (DP).
We show that it is possible to obtain a trade-off between JDP (joint DP) and LDP (local DP) by leveraging the shuffle model of privacy while preserving local privacy.
arXiv Detail & Related papers (2021-12-11T15:23:28Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.