When Privacy Meets Partial Information: A Refined Analysis of
Differentially Private Bandits
- URL: http://arxiv.org/abs/2209.02570v1
- Date: Tue, 6 Sep 2022 15:26:24 GMT
- Title: When Privacy Meets Partial Information: A Refined Analysis of
Differentially Private Bandits
- Authors: Achraf Azize and Debabrota Basu
- Abstract summary: We study the problem of multi-armed bandits with $\epsilon$-global Differential Privacy (DP).
We instantiate $\epsilon$-global DP extensions of UCB and KL-UCB algorithms, namely AdaP-UCB and AdaP-KLUCB.
AdaP-KLUCB is the first algorithm that both satisfies $\epsilon$-global DP and yields a regret upper bound that matches the problem-dependent lower bound up to multiplicative constants.
- Score: 4.964737844687583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of multi-armed bandits with $\epsilon$-global
Differential Privacy (DP). First, we prove the minimax and problem-dependent
regret lower bounds for stochastic and linear bandits that quantify the
hardness of bandits with $\epsilon$-global DP. These bounds suggest the
existence of two hardness regimes depending on the privacy budget $\epsilon$.
In the high-privacy regime (small $\epsilon$), the hardness depends on a
coupled effect of privacy and partial information about the reward
distributions. In the low-privacy regime (large $\epsilon$), bandits with
$\epsilon$-global DP are not harder than the bandits without privacy. For
stochastic bandits, we further propose a generic framework to design a
near-optimal $\epsilon$-global DP extension of an index-based optimistic bandit
algorithm. The framework consists of three ingredients: the Laplace mechanism,
arm-dependent adaptive episodes, and usage of only the rewards collected in the
last episode for computing private statistics. Specifically, we instantiate
$\epsilon$-global DP extensions of UCB and KL-UCB algorithms, namely AdaP-UCB
and AdaP-KLUCB. AdaP-KLUCB is the first algorithm that both satisfies
$\epsilon$-global DP and yields a regret upper bound that matches the
problem-dependent lower bound up to multiplicative constants.
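The three ingredients of the framework can be made concrete in a short sketch. The code below is a minimal, illustrative rendering (Laplace mechanism, arm-dependent doubling episodes, private means computed only from each arm's last episode), not the paper's exact AdaP-UCB specification: the function name `adap_ucb_sketch`, the sampler `bandit_sample`, and the exploration-bonus constants are assumptions for illustration, and rewards are assumed to lie in [0, 1].

```python
import numpy as np


def adap_ucb_sketch(bandit_sample, n_arms, horizon, epsilon, rng=None):
    """Illustrative epsilon-global DP optimistic bandit in the spirit of the
    framework above: Laplace mechanism + arm-dependent doubling episodes +
    private means computed only from each arm's last episode of rewards.
    Constants are illustrative, not the paper's exact AdaP-UCB choices.
    Assumes rewards lie in [0, 1]; `bandit_sample(a)` returns one reward."""
    rng = rng or np.random.default_rng()
    private_mean = np.zeros(n_arms)            # noisy mean of each arm's last episode
    episode_len = np.ones(n_arms, dtype=int)   # length of that last episode
    t = 0

    # Initialization: one episode of length 1 per arm.
    for a in range(n_arms):
        # Laplace mechanism: the mean of n rewards in [0, 1] has sensitivity 1/n,
        # so noise of scale 1/(n * epsilon) makes this release epsilon-DP (here n = 1).
        private_mean[a] = bandit_sample(a) + rng.laplace(scale=1.0 / epsilon)
        t += 1

    while t < horizon:
        # Optimistic index: private mean + a bonus covering both sampling error
        # and the injected Laplace noise (illustrative form of the bonus).
        bonus = (np.sqrt(2.0 * np.log(t + 1) / episode_len)
                 + 2.0 * np.log(t + 1) / (epsilon * episode_len))
        a = int(np.argmax(private_mean + bonus))

        # Arm-dependent adaptive episode: double this arm's episode length and
        # recompute its private statistic from *only* the new episode's rewards.
        new_len = int(min(2 * episode_len[a], horizon - t))
        rewards = [bandit_sample(a) for _ in range(new_len)]
        private_mean[a] = np.mean(rewards) + rng.laplace(scale=1.0 / (new_len * epsilon))
        episode_len[a] = new_len
        t += new_len

    return private_mean
```

For example, `adap_ucb_sketch(lambda a: float(np.random.binomial(1, [0.4, 0.6][a])), n_arms=2, horizon=10_000, epsilon=1.0)` runs the sketch on a two-armed Bernoulli instance. Intuitively, because each reward enters at most one released Laplace-noised mean (the one for its own episode), the per-episode guarantees combine into a single $\epsilon$-global DP guarantee, while the doubling episodes keep the number of noisy releases logarithmic in the horizon.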
Related papers
- Differentially Private Best-Arm Identification [14.916947598339988]
Best Arm Identification (BAI) problems are progressively used for data-sensitive applications.
Motivated by the data privacy concerns invoked by these applications, we study the problem of BAI with fixed confidence in both the local and central models.
arXiv Detail & Related papers (2024-06-10T16:02:48Z)
- Fixed-Budget Differentially Private Best Arm Identification [62.36929749450298]
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints.
We derive a minimax lower bound on the error probability, and demonstrate that the lower and the upper bounds decay exponentially in $T$.
arXiv Detail & Related papers (2024-01-17T09:23:25Z) - Combinatorial Stochastic-Greedy Bandit [79.1700188160944]
We propose a novel stochastic-greedy bandit (SGB) algorithm for multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time $t \in [T]$ is observed.
SGB adopts an optimized-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms.
arXiv Detail & Related papers (2023-12-13T11:08:25Z)
- On the Complexity of Differentially Private Best-Arm Identification with Fixed Confidence [16.295693624977563]
We study the problem of Best Arm Identification with fixed confidence under $\epsilon$-global Differential Privacy.
Our lower bound suggests the existence of two privacy regimes depending on the privacy budget.
We propose AdaP-TT, an $\epsilon$-global DP variant of the Top Two algorithm.
arXiv Detail & Related papers (2023-09-05T13:07:25Z)
- Federated Linear Contextual Bandits with User-level Differential Privacy [8.396384861175799]
This paper studies federated linear contextual bandits under the notion of user-level differential privacy (DP).
We first introduce a unified federated bandits framework that can accommodate various definitions of DP in the sequential decision-making setting.
We then formally introduce user-level central DP (CDP) and local DP (LDP) in the federated bandits framework, and investigate the fundamental trade-offs between the learning regrets and the corresponding DP guarantees.
arXiv Detail & Related papers (2023-06-08T15:21:47Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Stochastic Linear Bandits Robust to Adversarial Attacks [117.665995707568]
We provide two variants of a Robust Phased Elimination algorithm, one that knows $C$ and one that does not.
We show that both variants attain near-optimal regret in the non-corrupted case $C = 0$, while each incurs an additional additive corruption-dependent term.
In a contextual setting, we show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing $C$.
arXiv Detail & Related papers (2020-07-07T09:00:57Z)
- Locally Differentially Private (Contextual) Bandits Learning [55.63825598391525]
We study locally differentially private (LDP) bandits learning in this paper.
We propose simple black-box reduction frameworks that can solve a large family of context-free bandits learning problems with LDP guarantee.
arXiv Detail & Related papers (2020-06-01T04:02:00Z)