KL-regularization Itself is Differentially Private in Bandits and RLHF
- URL: http://arxiv.org/abs/2505.18407v1
- Date: Fri, 23 May 2025 22:22:02 GMT
- Title: KL-regularization Itself is Differentially Private in Bandits and RLHF
- Authors: Yizhou Zhang, Kishan Panaganti, Laixi Shi, Juba Ziani, Adam Wierman
- Abstract summary: Differential Privacy (DP) provides a rigorous framework for privacy, ensuring the outputs of data-driven algorithms remain statistically indistinguishable across datasets that differ in a single entry. While guaranteeing DP generally requires explicitly injecting noise either into the algorithm itself or into its outputs, the intrinsic randomness of existing algorithms presents an opportunity to achieve DP "for free".
- Score: 19.463863037999054
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Differential Privacy (DP) provides a rigorous framework for privacy, ensuring the outputs of data-driven algorithms remain statistically indistinguishable across datasets that differ in a single entry. While guaranteeing DP generally requires explicitly injecting noise either into the algorithm itself or into its outputs, the intrinsic randomness of existing algorithms presents an opportunity to achieve DP "for free". In this work, we explore the role of regularization in achieving DP across three different decision-making problems: multi-armed bandits, linear contextual bandits, and reinforcement learning from human feedback (RLHF), in offline data settings. We show that adding KL-regularization to the learning objective (a common approach in optimization algorithms) makes the action sampled from the resulting stochastic policy itself differentially private. This offers a new route to privacy guarantees without additional noise injection, while also preserving the inherent advantage of regularization in enhancing performance.
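To see why sampling from a KL-regularized policy can be private: for the objective max over pi of E_{a~pi}[r(a)] - (1/beta) KL(pi || pi_ref), the optimal policy has the closed form pi(a) proportional to pi_ref(a) * exp(beta * r(a)). With a uniform reference policy and reward estimates of bounded sensitivity, sampling from this softmax coincides with the exponential mechanism. A minimal Python sketch follows; the reward values, sensitivity delta, and beta are illustrative assumptions, not quantities from the paper.

```python
import numpy as np

def kl_regularized_policy(rewards, beta, ref_probs=None):
    """Closed-form solution of max_pi E_pi[r] - (1/beta) * KL(pi || pi_ref):
    pi(a) is proportional to pi_ref(a) * exp(beta * r(a))."""
    rewards = np.asarray(rewards, dtype=float)
    if ref_probs is None:  # default to a uniform reference policy
        ref_probs = np.full(len(rewards), 1.0 / len(rewards))
    logits = np.log(ref_probs) + beta * rewards
    logits -= logits.max()  # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Illustrative numbers (assumptions, not from the paper): reward estimates
# for 4 arms, with sensitivity delta = max change in any r(a) when one
# data point is replaced. For a uniform reference policy, sampling an arm
# from this softmax is the exponential mechanism with eps = 2 * beta * delta.
rewards = [0.8, 0.5, 0.3, 0.1]
beta, delta = 1.0, 0.2
probs = kl_regularized_policy(rewards, beta)
action = np.random.default_rng(0).choice(len(rewards), p=probs)
print(f"policy={np.round(probs, 3)}, action={action}, eps={2 * beta * delta}")
```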
Related papers
- Forward Learning with Differential Privacy [27.164507868291913]
We propose a privatized forward-learning algorithm, Differential Private Unified Likelihood Ratio (DP-ULR). Our experiments indicate that DP-ULR achieves competitive performance compared to traditional differential privacy training algorithms based on backpropagation.
arXiv Detail & Related papers (2025-04-01T04:14:53Z)
- Federated Learning with Differential Privacy: An Utility-Enhanced Approach [12.614480013684759]
Federated learning has emerged as an attractive approach to protecting data privacy by eliminating the need to share clients' data. Recent studies have shown that federated learning alone does not guarantee privacy, as private data may still be inferred from the parameters uploaded to the central server. We present a modification to vanilla differentially private federated learning algorithms based on a Haar wavelet transformation step and a novel noise injection scheme that significantly lowers the bound of the noise variance.
arXiv Detail & Related papers (2025-03-27T04:48:29Z)
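A rough sketch of the wavelet-domain noise idea in the entry above, assuming a single-level orthonormal Haar transform and Gaussian noise; the function names and noise scale are illustrative, not the paper's actual scheme.

```python
import numpy as np

def haar_forward(x):
    """One level of the orthonormal Haar transform (even-length input)."""
    pairs = x.reshape(-1, 2)
    avg = (pairs[:, 0] + pairs[:, 1]) / np.sqrt(2)   # approximation coefficients
    diff = (pairs[:, 0] - pairs[:, 1]) / np.sqrt(2)  # detail coefficients
    return np.concatenate([avg, diff])

def haar_inverse(c):
    half = len(c) // 2
    avg, diff = c[:half], c[half:]
    out = np.empty(len(c))
    out[0::2] = (avg + diff) / np.sqrt(2)
    out[1::2] = (avg - diff) / np.sqrt(2)
    return out

def noisy_update(grad, sigma, rng):
    """Perturb a client's gradient in the wavelet domain before upload."""
    coeffs = haar_forward(grad)
    coeffs += rng.normal(0.0, sigma, size=coeffs.shape)  # Gaussian mechanism
    return haar_inverse(coeffs)

rng = np.random.default_rng(0)
grad = np.array([0.4, -0.1, 0.3, 0.2])
print(noisy_update(grad, sigma=0.05, rng=rng))
```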
- Linear-Time User-Level DP-SCO via Robust Statistics [55.350093142673316]
User-level differentially private stochastic convex optimization (DP-SCO) has garnered significant attention due to the importance of safeguarding user privacy in machine learning applications. Current methods, such as those based on differentially private stochastic gradient descent (DP-SGD), often struggle with high noise accumulation and suboptimal utility. We introduce a novel linear-time algorithm that leverages robust statistics, specifically the median and trimmed mean, to overcome these challenges.
arXiv Detail & Related papers (2025-02-13T02:05:45Z)
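A minimal sketch of the robust-statistics idea in the entry above: clip per-user statistics, aggregate with a trimmed mean, and add Gaussian noise. The clipping bound, trim fraction, and noise scale are placeholder assumptions, not the paper's calibrated parameters.

```python
import numpy as np

def trimmed_mean(values, trim_frac):
    """Drop the smallest and largest trim_frac of values, average the rest."""
    v = np.sort(np.asarray(values, dtype=float))
    k = int(len(v) * trim_frac)
    return v[k:len(v) - k].mean()

def private_user_mean(user_stats, trim_frac=0.2, clip=1.0, sigma=0.1, seed=0):
    """Robust aggregate of per-user statistics plus Gaussian noise."""
    clipped = np.clip(user_stats, -clip, clip)   # bound each user's influence
    rng = np.random.default_rng(seed)
    return trimmed_mean(clipped, trim_frac) + rng.normal(0.0, sigma)

user_stats = [0.20, 0.25, 0.22, 5.0, 0.18, -4.0]  # two outlier users
print(private_user_mean(user_stats))
```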
- Differentially Private Policy Gradient [48.748194765816955]
We show that it is possible to find the right trade-off between privacy noise and trust-region size to obtain a performant differentially private policy gradient algorithm. Our results and the complexity of the tasks addressed represent a significant improvement over existing DP algorithms in online RL.
arXiv Detail & Related papers (2025-01-31T12:11:13Z)
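The noise/trust-region trade-off in the entry above can be pictured with a generic differentially private policy-gradient step; this is a hypothetical illustration, not the paper's algorithm, and all parameter values are assumptions.

```python
import numpy as np

def dp_pg_step(theta, traj_grads, clip=1.0, sigma=0.5,
               trust_radius=0.1, lr=0.05, seed=0):
    """One private policy-gradient step: clip per-trajectory gradients to
    bound sensitivity, average, add Gaussian noise, then shrink the update
    into a trust region. A larger trust region admits bigger but noisier
    steps; a smaller one suppresses noise but slows learning."""
    rng = np.random.default_rng(seed)
    clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12))
               for g in traj_grads]
    noisy = np.mean(clipped, axis=0) + rng.normal(
        0.0, sigma * clip / len(traj_grads), size=theta.shape)
    step = lr * noisy
    norm = np.linalg.norm(step)
    if norm > trust_radius:            # project the step into the trust region
        step *= trust_radius / norm
    return theta + step

theta = np.zeros(3)
traj_grads = [np.array([0.9, -0.2, 0.4]), np.array([1.5, 0.1, -0.3])]
print(dp_pg_step(theta, traj_grads))
```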
- Differentially Private Random Feature Model [52.468511541184895]
We produce a differentially private random feature model for privacy-preserving kernel machines. We show that our method preserves privacy and derive a generalization error bound for the method.
arXiv Detail & Related papers (2024-12-06T05:31:08Z)
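One standard way to realize such a model is random Fourier features with output perturbation; the following sketch assumes that construction, and the noise scale sigma is illustrative rather than calibrated to a specific (epsilon, delta).

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, rng):
    """Random Fourier features approximating an RBF kernel (Rahimi & Recht)."""
    W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def private_rf_regression(X, y, n_features=64, gamma=1.0,
                          lam=1.0, sigma=0.1, seed=0):
    """Ridge regression on random features, privatized by output
    perturbation (noise added once to the learned weight vector)."""
    rng = np.random.default_rng(seed)
    Phi = random_fourier_features(X, n_features, gamma, rng)
    w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(n_features), Phi.T @ y)
    return w + rng.normal(0.0, sigma, size=w.shape)  # privatize the weights

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=100)
print(private_rf_regression(X, y)[:5])
```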
- Noise Variance Optimization in Differential Privacy: A Game-Theoretic Approach Through Per-Instance Differential Privacy [7.264378254137811]
Differential privacy (DP) can measure privacy loss by observing the changes in the distribution caused by the inclusion of individuals in the target dataset. DP has been prominent in safeguarding datasets in machine learning at industry giants like Apple and Google. We propose per-instance DP (pDP) as a constraint, measuring privacy loss for each data instance and optimizing noise tailored to individual instances.
arXiv Detail & Related papers (2024-04-24T06:51:16Z)
- Dynamic Privacy Allocation for Locally Differentially Private Federated Learning with Composite Objectives [10.528569272279999]
This paper proposes a differentially private federated learning algorithm for strongly convex but possibly nonsmooth problems.
The proposed algorithm adds artificial noise to the shared information to ensure privacy and dynamically allocates the time-varying noise variance to minimize an upper bound of the optimization error.
Numerical results show the superiority of the proposed algorithm over state-of-the-art methods.
arXiv Detail & Related papers (2023-08-02T13:30:33Z)
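A minimal sketch of time-varying noise allocation in locally private federated aggregation; the decay schedule here is an arbitrary illustrative choice, whereas the paper derives the allocation by minimizing an upper bound on the optimization error.

```python
import numpy as np

def ldp_round(local_grads, t, sigma0=1.0, decay=0.05, seed=None):
    """One round of locally differentially private aggregation with a
    time-varying noise scale sigma_t = sigma0 / sqrt(1 + decay * t).
    Each client perturbs its own update before sharing (local DP)."""
    rng = np.random.default_rng(seed)
    sigma_t = sigma0 / np.sqrt(1.0 + decay * t)
    noisy = [g + rng.normal(0.0, sigma_t, size=g.shape) for g in local_grads]
    return np.mean(noisy, axis=0)

grads = [np.array([0.3, -0.1]), np.array([0.5, 0.2])]
for t in (0, 50, 500):
    print(t, ldp_round(grads, t, seed=t))
```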
- Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection. This paper addresses the practical and theoretical challenge of developing machine learning algorithms that deliver good performance while preserving privacy.
arXiv Detail & Related papers (2022-09-09T08:54:13Z)
- Adaptive Differentially Private Empirical Risk Minimization [95.04948014513226]
We propose an adaptive (stochastic) gradient perturbation method for differentially private empirical risk minimization. We prove that the ADP method considerably improves the utility guarantee compared to the standard differentially private method in which vanilla random noise is added.
arXiv Detail & Related papers (2021-10-14T15:02:20Z)
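An illustrative gradient-perturbation step with a decaying noise multiplier, in contrast to a fixed "vanilla" noise baseline; the adaptation rule below is a guess for illustration, not the ADP method's actual rule.

```python
import numpy as np

def adp_step(w, grads, t, clip=1.0, sigma0=2.0, lr=0.1, seed=None):
    """Gradient-perturbation ERM step whose noise multiplier decays over
    iterations instead of staying fixed as in the vanilla baseline."""
    rng = np.random.default_rng(seed)
    clipped = [g * min(1.0, clip / (np.linalg.norm(g) + 1e-12)) for g in grads]
    sigma_t = sigma0 / (1.0 + 0.01 * t)            # adaptive noise schedule
    noise = rng.normal(0.0, sigma_t * clip / len(grads), size=w.shape)
    return w - lr * (np.mean(clipped, axis=0) + noise)

w = np.zeros(2)
grads = [np.array([0.6, -0.4]), np.array([0.2, 0.3])]
print(adp_step(w, grads, t=100, seed=0))
```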
- No-Regret Algorithms for Private Gaussian Process Bandit Optimization [13.660643701487002]
We consider the ubiquitous problem of Gaussian process (GP) bandit optimization from the lens of privacy-preserving statistics. We propose a solution for differentially private GP bandit optimization that combines a uniform kernel approximator with random perturbations. Our algorithms maintain differential privacy throughout the optimization procedure and critically do not rely explicitly on the sample path for prediction.
arXiv Detail & Related papers (2021-02-24T18:52:24Z)
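One way to picture private selection in a GP bandit is to perturb the acquisition scores before taking the argmax; this sketch is a loose illustration, not the paper's kernel-approximation construction, and every parameter value is assumed.

```python
import numpy as np

def private_gp_ucb_choice(mu, sigma, beta=2.0, noise_scale=0.3, seed=0):
    """Select an arm by a perturbed GP-UCB score: posterior mean plus an
    exploration bonus plus a random perturbation, so the reported maximizer
    does not expose the posterior (and hence the sample path) exactly."""
    rng = np.random.default_rng(seed)
    scores = mu + beta * sigma + rng.normal(0.0, noise_scale, size=mu.shape)
    return int(np.argmax(scores))

mu = np.array([0.10, 0.40, 0.35])     # GP posterior means per arm
sigma = np.array([0.30, 0.05, 0.20])  # GP posterior standard deviations
print(private_gp_ucb_choice(mu, sigma))
```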
- RDP-GAN: A Rényi-Differential Privacy based Generative Adversarial Network [75.81653258081435]
Generative adversarial networks (GANs) have attracted increasing attention recently owing to their impressive ability to generate realistic samples with high privacy protection. However, when GANs are applied to sensitive or private training examples, such as medical or financial records, they may still divulge individuals' sensitive and private information. We propose a Rényi-differentially private GAN (RDP-GAN), which achieves differential privacy (DP) in a GAN by carefully adding random noise to the value of the loss function during training.
arXiv Detail & Related papers (2020-07-04T09:51:02Z)
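A minimal sketch of the loss-perturbation idea: clip the discriminator's loss value to bound its sensitivity, then add Gaussian noise before using it in training. The sensitivity bound and noise scale are placeholders, not the calibrated values from RDP-GAN.

```python
import numpy as np

def noisy_loss(loss_value, sensitivity=1.0, sigma=0.5, rng=None):
    """Perturb the (clipped, hence bounded-sensitivity) discriminator loss
    with Gaussian noise before it drives the parameter update."""
    rng = rng if rng is not None else np.random.default_rng()
    bounded = float(np.clip(loss_value, -sensitivity, sensitivity))
    return bounded + rng.normal(0.0, sigma * sensitivity)

rng = np.random.default_rng(0)
for step, loss in enumerate([0.70, 0.55, 0.42]):
    print(step, noisy_loss(loss, rng=rng))
```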
This list is automatically generated from the titles and abstracts of the papers on this site.