When Privacy Meets Partial Information: A Refined Analysis of
Differentially Private Bandits
- URL: http://arxiv.org/abs/2209.02570v1
- Date: Tue, 6 Sep 2022 15:26:24 GMT
- Title: When Privacy Meets Partial Information: A Refined Analysis of
Differentially Private Bandits
- Authors: Achraf Azize and Debabrota Basu
- Abstract summary: We study the problem of multi-armed bandits with $\epsilon$-global Differential Privacy (DP).
We instantiate $\epsilon$-global DP extensions of UCB and KL-UCB algorithms, namely AdaP-UCB and AdaP-KLUCB.
AdaP-KLUCB is the first algorithm that both satisfies $\epsilon$-global DP and yields a regret upper bound that matches the problem-dependent lower bound up to multiplicative constants.
- Score: 4.964737844687583
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the problem of multi-armed bandits with $\epsilon$-global
Differential Privacy (DP). First, we prove the minimax and problem-dependent
regret lower bounds for stochastic and linear bandits that quantify the
hardness of bandits with $\epsilon$-global DP. These bounds suggest the
existence of two hardness regimes depending on the privacy budget $\epsilon$.
In the high-privacy regime (small $\epsilon$), the hardness depends on a
coupled effect of privacy and partial information about the reward
distributions. In the low-privacy regime (large $\epsilon$), bandits with
$\epsilon$-global DP are not harder than the bandits without privacy. For
stochastic bandits, we further propose a generic framework to design a
near-optimal $\epsilon$-global DP extension of an index-based optimistic bandit
algorithm. The framework consists of three ingredients: the Laplace mechanism,
arm-dependent adaptive episodes, and usage of only the rewards collected in the
last episode for computing private statistics. Specifically, we instantiate
$\epsilon$-global DP extensions of UCB and KL-UCB algorithms, namely AdaP-UCB
and AdaP-KLUCB. AdaP-KLUCB is the first algorithm that both satisfies
$\epsilon$-global DP and yields a regret upper bound that matches the
problem-dependent lower bound up to multiplicative constants.
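The three ingredients of the framework can be made concrete in a short sketch. The code below is a minimal, illustrative rendering (Laplace mechanism, arm-dependent doubling episodes, private means computed only from each arm's last episode), not the paper's exact AdaP-UCB specification: the function name `adap_ucb_sketch`, the sampler `bandit_sample`, and the exploration-bonus constants are assumptions for illustration, and rewards are assumed to lie in [0, 1].

```python
import numpy as np


def adap_ucb_sketch(bandit_sample, n_arms, horizon, epsilon, rng=None):
    """Illustrative epsilon-global DP optimistic bandit in the spirit of the
    framework above: Laplace mechanism + arm-dependent doubling episodes +
    private means computed only from each arm's last episode of rewards.
    Constants are illustrative, not the paper's exact AdaP-UCB choices.
    Assumes rewards lie in [0, 1]; `bandit_sample(a)` returns one reward."""
    rng = rng or np.random.default_rng()
    private_mean = np.zeros(n_arms)            # noisy mean of each arm's last episode
    episode_len = np.ones(n_arms, dtype=int)   # length of that last episode
    t = 0

    # Initialization: one episode of length 1 per arm.
    for a in range(n_arms):
        # Laplace mechanism: the mean of n rewards in [0, 1] has sensitivity 1/n,
        # so noise of scale 1/(n * epsilon) makes this release epsilon-DP (here n = 1).
        private_mean[a] = bandit_sample(a) + rng.laplace(scale=1.0 / epsilon)
        t += 1

    while t < horizon:
        # Optimistic index: private mean + a bonus covering both sampling error
        # and the injected Laplace noise (illustrative form of the bonus).
        bonus = (np.sqrt(2.0 * np.log(t + 1) / episode_len)
                 + 2.0 * np.log(t + 1) / (epsilon * episode_len))
        a = int(np.argmax(private_mean + bonus))

        # Arm-dependent adaptive episode: double this arm's episode length and
        # recompute its private statistic from *only* the new episode's rewards.
        new_len = int(min(2 * episode_len[a], horizon - t))
        rewards = [bandit_sample(a) for _ in range(new_len)]
        private_mean[a] = np.mean(rewards) + rng.laplace(scale=1.0 / (new_len * epsilon))
        episode_len[a] = new_len
        t += new_len

    return private_mean
```

For example, `adap_ucb_sketch(lambda a: float(np.random.binomial(1, [0.4, 0.6][a])), n_arms=2, horizon=10_000, epsilon=1.0)` runs the sketch on a two-armed Bernoulli instance. Intuitively, because each reward enters at most one released Laplace-noised mean (the one for its own episode), the per-episode guarantees combine into a single $\epsilon$-global DP guarantee, while the doubling episodes keep the number of noisy releases logarithmic in the horizon.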
Related papers
- Differentially Private Best-Arm Identification [14.916947598339988]
Best Arm Identification (BAI) problems are progressively used for data-sensitive applications.
Motivated by the data privacy concerns invoked by these applications, we study the problem of BAI with fixed confidence in both the local and central models.
arXiv Detail & Related papers (2024-06-10T16:02:48Z)
- Fixed-Budget Differentially Private Best Arm Identification [62.36929749450298]
We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints.
We derive a minimax lower bound on the error probability, and demonstrate that the lower and the upper bounds decay exponentially in $T$.
arXiv Detail & Related papers (2024-01-17T09:23:25Z) - Combinatorial Stochastic-Greedy Bandit [79.1700188160944]
We propose a novel stochastic-greedy bandit (SGB) algorithm for multi-armed bandit problems when no extra information other than the joint reward of the selected set of $n$ arms at each time $t \in [T]$ is observed.
SGB adopts an optimized-explore-then-commit approach and is specifically designed for scenarios with a large set of base arms.
arXiv Detail & Related papers (2023-12-13T11:08:25Z)
- On the Complexity of Differentially Private Best-Arm Identification with Fixed Confidence [16.295693624977563]
We study the problem of Best Arm Identification with fixed confidence under $\epsilon$-global Differential Privacy.
Our lower bound suggests the existence of two privacy regimes depending on the privacy budget.
We propose AdaP-TT, an $\epsilon$-global DP variant of the Top Two algorithm.
arXiv Detail & Related papers (2023-09-05T13:07:25Z)
- Federated Linear Contextual Bandits with User-level Differential Privacy [8.396384861175799]
This paper studies federated linear contextual bandits under the notion of user-level differential privacy (DP).
We first introduce a unified federated bandits framework that can accommodate various definitions of DP in the sequential decision-making setting.
We then formally introduce user-level central DP (CDP) and local DP (LDP) in the federated bandits framework, and investigate the fundamental trade-offs between the learning regrets and the corresponding DP guarantees.
arXiv Detail & Related papers (2023-06-08T15:21:47Z)
- Normalized/Clipped SGD with Perturbation for Differentially Private Non-Convex Optimization [94.06564567766475]
DP-SGD and DP-NSGD mitigate the risk of large models memorizing sensitive training data.
We show that these two algorithms achieve similar best accuracy while DP-NSGD is comparatively easier to tune than DP-SGD.
arXiv Detail & Related papers (2022-06-27T03:45:02Z)
- Learning with User-Level Privacy [61.62978104304273]
We analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints.
Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution.
We derive an algorithm that privately answers a sequence of $K$ adaptively chosen queries with privacy cost proportional to $\tau$, and apply it to solve the learning tasks we consider.
arXiv Detail & Related papers (2021-02-23T18:25:13Z)
- Stochastic Linear Bandits Robust to Adversarial Attacks [117.665995707568]
We provide two variants of a Robust Phased Elimination algorithm, one that knows $C$ and one that does not.
We show that both variants attain near-optimal regret in the non-corrupted case $C = 0$, while each incurs an additional additive corruption-dependent term.
In a contextual setting, we show that a simple greedy algorithm is provably robust with a near-optimal additive regret term, despite performing no explicit exploration and not knowing $C$.
arXiv Detail & Related papers (2020-07-07T09:00:57Z)
- Locally Differentially Private (Contextual) Bandits Learning [55.63825598391525]
We study locally differentially private (LDP) bandits learning in this paper.
We propose simple black-box reduction frameworks that can solve a large family of context-free bandits learning problems with LDP guarantee.
arXiv Detail & Related papers (2020-06-01T04:02:00Z)