DP-NCB: Privacy Preserving Fair Bandits
- URL: http://arxiv.org/abs/2508.03836v1
- Date: Tue, 05 Aug 2025 18:34:00 GMT
- Title: DP-NCB: Privacy Preserving Fair Bandits
- Authors: Dhruv Sarkar, Nishant Pandey, Sayak Ray Chowdhury
- Abstract summary: We introduce Differentially Private Nash Confidence Bound (DP-NCB), a novel and unified algorithmic framework. It simultaneously ensures $\epsilon$-differential privacy and achieves order-optimal Nash regret, matching known lower bounds up to logarithmic factors. Our results offer a principled foundation for designing bandit algorithms that are both privacy-preserving and fair, making them suitable for high-stakes, socially impactful applications.
- Score: 7.443474354626665
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multi-armed bandit algorithms are fundamental tools for sequential decision-making under uncertainty, with widespread applications across domains such as clinical trials and personalized decision-making. As bandit algorithms are increasingly deployed in these socially sensitive settings, it becomes critical to protect user data privacy and ensure fair treatment across decision rounds. While prior work has independently addressed privacy and fairness in bandit settings, the question of whether both objectives can be achieved simultaneously has remained largely open. Existing privacy-preserving bandit algorithms typically optimize average regret, a utilitarian measure, whereas fairness-aware approaches focus on minimizing Nash regret, which penalizes inequitable reward distributions, but often disregard privacy concerns. To bridge this gap, we introduce Differentially Private Nash Confidence Bound (DP-NCB), a novel and unified algorithmic framework that simultaneously ensures $\epsilon$-differential privacy and achieves order-optimal Nash regret, matching known lower bounds up to logarithmic factors. The framework is sufficiently general to operate under both global and local differential privacy models, and is anytime, requiring no prior knowledge of the time horizon. We support our theoretical guarantees with simulations on synthetic bandit instances, showing that DP-NCB incurs substantially lower Nash regret than state-of-the-art baselines. Our results offer a principled foundation for designing bandit algorithms that are both privacy-preserving and fair, making them suitable for high-stakes, socially impactful applications.
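The paper's pseudocode is not reproduced here, but the general recipe the abstract describes, confidence-bound exploration combined with calibrated noise for $\epsilon$-differential privacy, can be illustrated with a minimal Laplace-noised UCB sketch. Everything below (function names, the noise scale, the exploration bonus) is an illustrative assumption, not DP-NCB itself; in particular, a production private bandit must account for privacy composition across rounds (e.g. via tree-based aggregation), which this toy per-round noising does not.

```python
import math
import random

def laplace(scale):
    # Sample Laplace(0, scale) noise via the inverse CDF.
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def private_ucb(arms, horizon, epsilon):
    """Illustrative sketch: each arm's empirical mean is released
    through Laplace noise before its confidence index is computed."""
    counts = [0] * len(arms)
    sums = [0.0] * len(arms)
    total_reward = 0.0
    for t in range(1, horizon + 1):
        if t <= len(arms):          # pull each arm once to initialize
            a = t - 1
        else:
            def index(i):
                noisy_mean = sums[i] / counts[i] + laplace(1.0 / (epsilon * counts[i]))
                bonus = math.sqrt(2 * math.log(t) / counts[i])
                return noisy_mean + bonus
            a = max(range(len(arms)), key=index)
        r = arms[a]()               # pull the arm, observe a reward in [0, 1]
        counts[a] += 1
        sums[a] += r
        total_reward += r
    return total_reward

# Two Bernoulli arms with means 0.9 and 0.4.
random.seed(0)
arms = [lambda: float(random.random() < 0.9),
        lambda: float(random.random() < 0.4)]
print(private_ucb(arms, 2000, epsilon=1.0))
```

Even with the noise, the sketch concentrates its pulls on the better arm; the Nash-regret objective the paper optimizes would additionally penalize runs that starve any round of reward, which plain UCB does not address.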
Related papers
- Locally Differentially Private Thresholding Bandits [3.8916312075738273]
This work investigates the impact of ensuring local differential privacy in the thresholding bandit problem. We propose methods that utilize private responses, obtained through a Bernoulli-based differentially private mechanism, to identify arms with expected rewards exceeding a predefined threshold.
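A common Bernoulli-based $\epsilon$-LDP mechanism of the kind this summary mentions is randomized response; the sketch below (all names illustrative, not the paper's method) privatizes binary rewards and debiases the aggregate so that arm means can still be compared against a threshold.

```python
import math
import random

def randomized_response(bit, epsilon):
    """Report the true bit with probability e^eps / (1 + e^eps), flip it
    otherwise. This classic mechanism satisfies epsilon-LDP for binary data."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if random.random() < p_keep else 1 - bit

def debiased_mean(reports, epsilon):
    """Unbiased estimate of the true mean from privatized bits:
    E[report] = (2p - 1) * mu + (1 - p), so invert that affine map."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return (sum(reports) / len(reports) - (1 - p)) / (2 * p - 1)

random.seed(1)
true_bits = [1 if random.random() < 0.7 else 0 for _ in range(20000)]
reports = [randomized_response(b, epsilon=1.0) for b in true_bits]
print(round(debiased_mean(reports, epsilon=1.0), 2))
```

With enough samples the debiased estimate recovers the true mean (here 0.7) to within sampling error, which is what lets a thresholding algorithm decide from privatized responses alone.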
arXiv Detail & Related papers (2025-07-30T20:08:30Z)
- Breaking the Gaussian Barrier: Residual-PAC Privacy for Automatic Privatization [25.387857775660855]
We introduce Residual PAC Privacy, an f-divergence-based measure that quantifies the privacy remaining after adversarial inference. We also propose Stackelberg Residual-PAC (SR-PAC) privatization mechanisms for RPAC Privacy, a game-theoretic framework that selects optimal noise distributions.
arXiv Detail & Related papers (2025-06-06T20:52:47Z)
- KL-regularization Itself is Differentially Private in Bandits and RLHF [19.463863037999054]
Differential Privacy (DP) provides a rigorous framework for privacy, ensuring the outputs of data-driven algorithms remain statistically indistinguishable across datasets that differ in a single entry. While guaranteeing DP generally requires explicitly injecting noise either into the algorithm itself or into its outputs, the intrinsic randomness of existing algorithms presents an opportunity to achieve DP "for free".
arXiv Detail & Related papers (2025-05-23T22:22:02Z)
- Convergent Differential Privacy Analysis for General Federated Learning: the $f$-DP Perspective [57.35402286842029]
Federated learning (FL) is an efficient collaborative training paradigm with a focus on local privacy.
Differential privacy (DP) is a classical approach to formalizing and guaranteeing privacy protection.
arXiv Detail & Related papers (2024-08-28T08:22:21Z)
- TernaryVote: Differentially Private, Communication Efficient, and Byzantine Resilient Distributed Optimization on Heterogeneous Data [50.797729676285876]
We propose TernaryVote, which combines a ternary compressor and the majority vote mechanism to realize differential privacy, gradient compression, and Byzantine resilience simultaneously.
We theoretically quantify the privacy guarantee through the lens of the emerging f-differential privacy (DP) and the Byzantine resilience of the proposed algorithm.
arXiv Detail & Related papers (2024-02-16T16:41:14Z)
- Concentrated Differential Privacy for Bandits [11.086440815804227]
This paper contributes to the understanding of Differential Privacy (DP) in bandits with a trusted centralised decision-maker.
We propose three private algorithms, namely AdaC-UCB, AdaC-GOPE and AdaC-OFUL, for three bandit settings.
arXiv Detail & Related papers (2023-09-01T16:08:00Z)
- Theoretically Principled Federated Learning for Balancing Privacy and Utility [61.03993520243198]
We propose a general learning framework for protection mechanisms that preserve privacy by distorting model parameters.
It can achieve personalized utility-privacy trade-off for each model parameter, on each client, at each communication round in federated learning.
arXiv Detail & Related papers (2023-05-24T13:44:02Z)
- Differentially Private Stochastic Gradient Descent with Low-Noise [49.981789906200035]
Modern machine learning algorithms aim to extract fine-grained information from data to provide accurate predictions, which often conflicts with the goal of privacy protection.
This paper addresses the practical and theoretical importance of developing privacy-preserving machine learning algorithms that ensure good performance while preserving privacy.
arXiv Detail & Related papers (2022-09-09T08:54:13Z)
- Decentralized Stochastic Optimization with Inherent Privacy Protection [103.62463469366557]
Decentralized optimization is the basic building block of modern collaborative machine learning, distributed estimation and control, and large-scale sensing.
Since the data involved are often sensitive, privacy protection has become an increasingly pressing need in the implementation of decentralized optimization algorithms.
arXiv Detail & Related papers (2022-05-08T14:38:23Z)
- Privacy Amplification via Shuffling for Linear Contextual Bandits [51.94904361874446]
We study the contextual linear bandit problem with differential privacy (DP). We show that it is possible to achieve a privacy/utility trade-off between joint differential privacy (JDP) and local differential privacy (LDP) by leveraging the shuffle model of privacy while preserving local privacy.
arXiv Detail & Related papers (2021-12-11T15:23:28Z)
- Generalized Linear Bandits with Local Differential Privacy [4.922800530841394]
Many applications such as personalized medicine and online advertising require the utilization of individual-specific information for effective learning.
This motivates the introduction of local differential privacy (LDP), a stringent notion in privacy, to contextual bandits.
In this paper, we design LDP algorithms for generalized linear bandits that achieve the same regret bound as in the non-private setting.
arXiv Detail & Related papers (2021-06-07T06:42:00Z)
- Private Reinforcement Learning with PAC and Regret Guarantees [69.4202374491817]
We design privacy-preserving exploration policies for episodic reinforcement learning (RL). We first provide a meaningful privacy formulation using the notion of joint differential privacy (JDP).
We then develop a private optimism-based learning algorithm that simultaneously achieves strong PAC and regret bounds, and enjoys a JDP guarantee.
arXiv Detail & Related papers (2020-09-18T20:18:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.