Secure Reinforcement Learning via Shuffle Privacy Model
- URL: http://arxiv.org/abs/2411.11647v2
- Date: Tue, 26 Aug 2025 07:33:53 GMT
- Title: Secure Reinforcement Learning via Shuffle Privacy Model
- Authors: Shaojie Bai, Mohammad Sadegh Talebi, Chengcheng Zhao, Peng Cheng, Jiming Chen
- Abstract summary: We present Shuffle Differentially Private Policy Elimination (SDP-PE), the first generic policy elimination-based algorithm for episodic RL under the shuffle model. Our analysis shows that SDP-PE achieves a near-optimal regret bound, demonstrating a superior privacy-regret trade-off that significantly outperforms the local model. This work establishes the viability of the shuffle model for secure data-driven control in advanced CPS.
- Score: 23.688680166406627
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reinforcement learning (RL) is a powerful tool for sequential decision-making, but its application is often hindered by privacy concerns arising from its interaction data. This challenge is particularly acute in advanced Cyber-Physical Systems (CPS), where learning from operational and user data can expose systems to privacy inference attacks. Existing differential privacy (DP) models for RL are often inadequate: the centralized model requires a fully trusted server, creating a single point of failure risk, while the local model incurs significant performance degradation that is unsuitable for many control applications. This paper addresses this gap by leveraging the emerging shuffle model of privacy, an intermediate trust model that provides strong privacy guarantees without a centralized trust assumption. We present Shuffle Differentially Private Policy Elimination (SDP-PE), the first generic policy elimination-based algorithm for episodic RL under the shuffle model. Our method introduces a novel exponential batching schedule and a "forgetting" mechanism to balance the competing demands of privacy and learning performance. Our analysis shows that SDP-PE achieves a near-optimal regret bound, demonstrating a superior privacy-regret trade-off that significantly outperforms the local model. This work establishes the viability of the shuffle model for secure data-driven control in advanced CPS.
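The shuffle model interposes a trusted shuffler between users and the analyst: users randomize their data locally, the shuffler permutes the reports, and the analyst sees only the anonymized batch. Below is a minimal Python sketch of that randomize-shuffle-analyze pipeline, together with an exponential batching schedule of the kind the abstract mentions. All names, the binary randomized-response choice, and the doubling schedule are illustrative assumptions, not the authors' SDP-PE implementation.

```python
import math
import random

def local_randomizer(bit: int, eps: float) -> int:
    """Binary randomized response: keep the bit w.p. e^eps / (e^eps + 1)."""
    keep_prob = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if random.random() < keep_prob else 1 - bit

def shuffler(reports):
    """The shuffler only permutes reports, severing the link to senders."""
    reports = list(reports)
    random.shuffle(reports)
    return reports

def analyzer(reports, eps: float) -> float:
    """Debias the shuffled randomized-response reports to estimate the mean."""
    p = math.exp(eps) / (math.exp(eps) + 1.0)
    return (sum(reports) / len(reports) - (1 - p)) / (2 * p - 1)

def exponential_batches(num_episodes: int):
    """Batch k covers 2^k episodes, giving only O(log T) policy updates."""
    start, k = 0, 0
    while start < num_episodes:
        end = min(start + 2 ** k, num_episodes)
        yield k, range(start, end)
        start, k = end, k + 1

data = [random.randint(0, 1) for _ in range(10_000)]
estimate = analyzer(shuffler(local_randomizer(b, eps=1.0) for b in data), eps=1.0)
print("true mean:", sum(data) / len(data), "estimate:", round(estimate, 3))
print([(k, (b.start, b.stop - 1)) for k, b in exponential_batches(20)])
```

The doubling schedule is what keeps the number of policy switches (and hence privacy-sensitive aggregations) logarithmic in the horizon.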
Related papers
- Your Privacy Depends on Others: Collusion Vulnerabilities in Individual Differential Privacy [50.66105844449181]
Individual Differential Privacy (iDP) promises users control over their privacy, but this promise can be broken in practice. We reveal a previously overlooked vulnerability in sampling-based iDP mechanisms. We propose a refined iDP privacy contract, parameterized per user, that uses divergence-based bounds to provide users with a hard upper bound on their excess vulnerability.
arXiv Detail & Related papers (2026-01-19T10:26:12Z) - Federated Learning with Enhanced Privacy via Model Splitting and Random Client Participation [7.780051713043537]
Federated Learning (FL) often adopts differential privacy (DP) to protect client data, but the added noise required for privacy guarantees can substantially degrade model accuracy. We propose model-splitting privacy-amplified federated learning (MS-PAFL). In this framework, each client's model is partitioned into a private submodel, retained locally, and a public submodel, shared for global aggregation.
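As a concrete illustration of that partition, the sketch below keeps a private slice of each client's parameter vector local and shares only a clipped, noised public slice for averaging. The split point, clipping rule, and noise scale are assumptions for illustration, not the MS-PAFL specification.

```python
import numpy as np

rng = np.random.default_rng(0)

def split(params: np.ndarray, public_frac: float = 0.5):
    """Partition a flat parameter vector into public and private slices."""
    cut = int(len(params) * public_frac)
    return params[:cut], params[cut:]          # (public, private)

def privatize(public: np.ndarray, clip: float, sigma: float) -> np.ndarray:
    """Clip the public slice and add Gaussian noise before sharing."""
    norm = np.linalg.norm(public)
    clipped = public * min(1.0, clip / (norm + 1e-12))
    return clipped + rng.normal(0.0, sigma * clip, size=public.shape)

def aggregate(shared_publics):
    """The server averages only the public slices it receives."""
    return np.mean(shared_publics, axis=0)

# Each client shares a noised public slice; private slices never leave the client.
clients = [rng.normal(size=10) for _ in range(5)]
shared = [privatize(split(p)[0], clip=1.0, sigma=0.5) for p in clients]
print("aggregated public submodel:", np.round(aggregate(shared), 3))
```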
arXiv Detail & Related papers (2025-09-30T07:51:06Z) - Enhancing Model Privacy in Federated Learning with Random Masking and Quantization [46.915409150222494]
The rise of large language models (LLMs) has introduced new challenges in distributed systems. This highlights the need for a federated learning approach that can safeguard both sensitive data and proprietary models. We propose FedQSN, a federated learning approach that leverages random masking to obscure a subnetwork of model parameters.
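A minimal sketch of the random-masking idea follows: only a randomly selected subnetwork of parameters is exposed in each round, so no single message reveals the full model. The masking rate and the zeroing of withheld entries are illustrative assumptions, not the FedQSN protocol.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_mask(shape, keep_prob: float = 0.7) -> np.ndarray:
    """Boolean mask selecting the subnetwork shared in this round."""
    return rng.random(shape) < keep_prob

def masked_view(params: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Parameters outside the mask are withheld (zeroed here for simplicity)."""
    return np.where(mask, params, 0.0)

params = rng.normal(size=8)
mask = random_mask(params.shape)
print("full model :", np.round(params, 2))
print("shared view:", np.round(masked_view(params, mask), 2))
```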
arXiv Detail & Related papers (2025-08-26T10:34:13Z) - Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines.
We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z) - Pseudo-Probability Unlearning: Towards Efficient and Privacy-Preserving Machine Unlearning [59.29849532966454]
We propose Pseudo-Probability Unlearning (PPU), a novel method that enables models to forget data in a privacy-preserving manner.
Our method achieves an over 20% improvement in forgetting error compared to the state of the art.
arXiv Detail & Related papers (2024-11-04T21:27:06Z) - Camel: Communication-Efficient and Maliciously Secure Federated Learning in the Shuffle Model of Differential Privacy [9.100955087185811]
Federated learning (FL) has rapidly become a compelling paradigm that enables multiple clients to jointly train a model by sharing only gradient updates for aggregation.
To protect the gradient updates, which can also be privacy-sensitive, a line of work has studied local differential privacy mechanisms.
We present Camel, a new communication-efficient and maliciously secure FL framework in the shuffle model of DP.
arXiv Detail & Related papers (2024-10-04T13:13:44Z) - Enhanced Privacy Bound for Shuffle Model with Personalized Privacy [32.08637708405314]
The shuffle model of Differential Privacy (DP) is an enhanced privacy protocol that introduces an intermediate trusted server between local users and a central data curator.
It significantly amplifies the central DP guarantee by anonymizing and shuffling the local randomized data.
This work focuses on deriving the central privacy bound for a more practical setting where personalized local privacy is required by each user.
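For background on what "amplifies the central DP guarantee" means quantitatively, the sketch below evaluates one standard closed-form amplification bound from the shuffling literature (the Feldman, McSherry, and Talwar, 2021 bound, to the best of my reading): shuffling n reports from an eps0-LDP randomizer yields (eps, delta)-central DP for the eps computed here. This is uniform-privacy background, not the personalized bound derived in the paper.

```python
import math

def shuffle_amplified_eps(eps0: float, n: int, delta: float) -> float:
    """Closed-form central epsilon after shuffling n eps0-LDP reports."""
    assert eps0 <= math.log(n / (16 * math.log(2 / delta))), "bound precondition"
    a = (math.exp(eps0) - 1) / (math.exp(eps0) + 1)
    b = 8 * math.sqrt(math.exp(eps0) * math.log(4 / delta)) / math.sqrt(n)
    c = 8 * math.exp(eps0) / n
    return math.log(1 + a * (b + c))

# Central epsilon shrinks roughly as 1/sqrt(n) as more users are shuffled in.
for n in (10**3, 10**4, 10**5):
    print(n, round(shuffle_amplified_eps(eps0=1.0, n=n, delta=1e-6), 4))
```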
arXiv Detail & Related papers (2024-07-25T16:11:56Z) - Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z) - Differentially Private Aggregation via Imperfect Shuffling [64.19885806149958]
We introduce the imperfect shuffle differential privacy model, where messages are shuffled in an almost uniform manner before being observed by a curator for private aggregation.
We show that, surprisingly, no additional error overhead is necessary in the imperfect shuffle model.
arXiv Detail & Related papers (2023-08-28T17:34:52Z) - Echo of Neighbors: Privacy Amplification for Personalized Private Federated Learning with Shuffle Model [21.077469463027306]
Federated Learning, as a popular paradigm for collaborative training, is vulnerable to privacy attacks.
This work strengthens model privacy under personalized local privacy by leveraging the privacy amplification effect of the shuffle model.
To the best of our knowledge, the impact of shuffling on personalized local privacy is considered for the first time.
arXiv Detail & Related papers (2023-04-11T21:48:42Z) - TAN Without a Burn: Scaling Laws of DP-SGD [70.7364032297978]
Differentially Private methods for training Deep Neural Networks (DNNs) have progressed recently.
We decouple privacy analysis and experimental behavior of noisy training to explore the trade-off with minimal computational requirements.
We apply the proposed method on CIFAR-10 and ImageNet and, in particular, strongly improve the state of the art on ImageNet with a +9-point gain in top-1 accuracy.
arXiv Detail & Related papers (2022-10-07T08:44:35Z) - Tight Differential Privacy Guarantees for the Shuffle Model with $k$-Randomized Response [6.260747047974035]
Most differentially private (DP) algorithms assume a trusted third party that adds noise to queries made on datasets, or a local model where users perturb their data themselves.
The recently proposed shuffle model is an intermediate framework between the central and the local paradigms.
We perform experiments on both synthetic and real data to compare the privacy-utility trade-off of the shuffle model with that of the central model.
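The local randomizer this entry studies, k-randomized response (k-RR), is simple enough to sketch: each user reports their true value with probability e^eps / (e^eps + k - 1) and a uniformly random other value otherwise, and the analyst debiases the observed histogram. Illustrative background only, not the paper's shuffle-model analysis.

```python
import math
import random
from collections import Counter

def k_rr(value: int, k: int, eps: float) -> int:
    """k-randomized response over the domain {0, ..., k-1}."""
    p_true = math.exp(eps) / (math.exp(eps) + k - 1)
    if random.random() < p_true:
        return value
    other = random.randrange(k - 1)
    return other if other < value else other + 1  # uniform over the k-1 others

def debias(reports, k: int, eps: float):
    """Unbiased frequency estimates from k-RR reports."""
    n = len(reports)
    p = math.exp(eps) / (math.exp(eps) + k - 1)   # P[report = v | true = v]
    q = 1 / (math.exp(eps) + k - 1)               # P[report = v | true != v]
    counts = Counter(reports)
    return {v: (counts.get(v, 0) / n - q) / (p - q) for v in range(k)}

data = [random.choice([0, 0, 1, 2]) for _ in range(50_000)]
reports = [k_rr(v, k=3, eps=1.0) for v in data]
print({v: round(f, 3) for v, f in debias(reports, k=3, eps=1.0).items()})
```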
arXiv Detail & Related papers (2022-05-18T10:44:28Z) - Large Scale Transfer Learning for Differentially Private Image Classification [51.10365553035979]
Differential Privacy (DP) provides a formal framework for training machine learning models with individual example level privacy.
Private training using DP-SGD protects against leakage by injecting noise into individual example gradients.
While this result is quite appealing, the computational cost of training large-scale models with DP-SGD is substantially higher than non-private training.
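The mechanics behind that cost are worth seeing concretely: DP-SGD clips each per-example gradient and adds Gaussian noise to their sum, which is exactly what makes it expensive relative to ordinary minibatch SGD. A minimal numpy sketch of one step on a least-squares model follows; the model and hyperparameters are illustrative, and a real deployment also needs subsampling and a privacy accountant.

```python
import numpy as np

rng = np.random.default_rng(0)

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, sigma=1.0):
    """One DP-SGD step: per-example clip, sum, add noise, average, descend."""
    residual = X @ w - y                           # shape (n,)
    per_example_grads = residual[:, None] * X      # squared-loss gradients (n, d)
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip / (norms + 1e-12))
    noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, sigma * clip, size=w.shape)
    return w - lr * noisy_sum / len(X)

X = rng.normal(size=(256, 5))
w_true = np.arange(5.0)
y = X @ w_true + 0.1 * rng.normal(size=256)
w = np.zeros(5)
for _ in range(200):
    w = dp_sgd_step(w, X, y)
print("learned weights:", np.round(w, 2))          # close to [0, 1, 2, 3, 4]
```

The per-example clipping is the expensive part: it forbids the usual trick of computing only the averaged minibatch gradient.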
arXiv Detail & Related papers (2022-05-06T01:22:20Z) - Just Fine-tune Twice: Selective Differential Privacy for Large Language Models [69.66654761324702]
We propose a simple yet effective just-fine-tune-twice privacy mechanism to achieve selective differential privacy (SDP) for large Transformer-based language models.
Experiments show that our models achieve strong performance while staying robust to the canary insertion attack.
arXiv Detail & Related papers (2022-04-15T22:36:55Z) - Network Shuffling: Privacy Amplification via Random Walks [21.685747588753514]
We introduce network shuffling, a decentralized mechanism where users exchange data in a random-walk fashion on a network/graph.
We show that the privacy amplification rate is similar to other privacy amplification techniques such as uniform shuffling.
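A minimal sketch of the random-walk idea: each user's (locally randomized) report takes a short random walk over a communication graph before being submitted, so the curator cannot tell who originated it. The graph, walk length, and payloads below are illustrative assumptions, not the paper's protocol.

```python
import random

def random_walk_shuffle(reports, neighbors, steps=25, seed=0):
    """Route each report along a random walk; return (final holder, report)."""
    rnd = random.Random(seed)
    holders = list(range(len(reports)))            # report i starts at user i
    for _ in range(steps):
        holders = [rnd.choice(neighbors[u]) for u in holders]
    submitted = list(zip(holders, reports))
    rnd.shuffle(submitted)                         # arrival order is arbitrary
    return submitted

# A ring graph over 6 users: each user talks only to its two ring neighbors.
n = 6
neighbors = {u: [(u - 1) % n, (u + 1) % n] for u in range(n)}
reports = [f"report-{u}" for u in range(n)]
print(random_walk_shuffle(reports, neighbors))
```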
arXiv Detail & Related papers (2022-04-08T08:36:06Z) - Shuffle Private Linear Contextual Bandits [9.51828574518325]
We propose a general framework for linear contextual bandits under the shuffle algorithmic trust model.
We prove that both instantiations lead to regret guarantees that significantly improve on those of the local model.
We also verify this regret behavior with simulations on synthetic data.
arXiv Detail & Related papers (2022-02-11T11:53:22Z) - Privacy Amplification via Shuffling for Linear Contextual Bandits [51.94904361874446]
We study the contextual linear bandit problem with differential privacy (DP).
We show that it is possible to achieve a privacy/utility trade-off between joint differential privacy (JDP) and local differential privacy (LDP) by leveraging the shuffle model of privacy while preserving local privacy.
arXiv Detail & Related papers (2021-12-11T15:23:28Z) - Don't Generate Me: Training Differentially Private Generative Models with Sinkhorn Divergence [73.14373832423156]
We propose DP-Sinkhorn, a novel optimal transport-based generative method for learning data distributions from private data with differential privacy.
Unlike existing approaches for training differentially private generative models, we do not rely on adversarial objectives.
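The objective this entry minimizes in place of an adversarial loss is the Sinkhorn divergence, the debiased entropy-regularized optimal-transport cost S(a, b) = OT(a, b) - (OT(a, a) + OT(b, b)) / 2. Below is a minimal log-domain sketch of that quantity between two point clouds; it is the plain divergence computation, not the DP-Sinkhorn training loop (which additionally privatizes gradients).

```python
import numpy as np
from scipy.special import logsumexp

def entropic_ot(x, y, reg=0.1, iters=300):
    """Entropy-regularized OT cost between uniform point clouds x and y."""
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)   # squared-distance cost
    log_a = np.full(len(x), -np.log(len(x)))             # uniform weights
    log_b = np.full(len(y), -np.log(len(y)))
    f, g = np.zeros(len(x)), np.zeros(len(y))
    for _ in range(iters):                               # Sinkhorn dual updates
        f = reg * (log_a - logsumexp((g[None, :] - C) / reg, axis=1))
        g = reg * (log_b - logsumexp((f[:, None] - C) / reg, axis=0))
    P = np.exp((f[:, None] + g[None, :] - C) / reg)      # transport plan
    return (P * C).sum()

def sinkhorn_divergence(x, y, reg=0.1):
    """Debiased divergence: zero when the two clouds coincide."""
    return entropic_ot(x, y, reg) - 0.5 * (
        entropic_ot(x, x, reg) + entropic_ot(y, y, reg))

rng = np.random.default_rng(0)
x = rng.normal(size=(40, 2))
y = rng.normal(loc=1.0, size=(40, 2))
print("S(x, y):", round(sinkhorn_divergence(x, y), 3))
print("S(x, x):", round(sinkhorn_divergence(x, x), 3))   # 0 by construction
```

Because this loss compares whole minibatches rather than training a discriminator, there is no adversarial inner loop to privatize, which is the design point the abstract highlights.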
arXiv Detail & Related papers (2021-11-01T18:10:21Z) - PRECAD: Privacy-Preserving and Robust Federated Learning via Crypto-Aided Differential Privacy [14.678119872268198]
Federated Learning (FL) allows multiple participating clients to train machine learning models collaboratively by keeping their datasets local and only exchanging model updates.
Existing FL protocol designs have been shown to be vulnerable to attacks that aim to compromise data privacy and/or model robustness.
We develop a framework called PRECAD, which simultaneously achieves differential privacy (DP) and enhances robustness against model poisoning attacks with the help of cryptography.
arXiv Detail & Related papers (2021-10-22T04:08:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.