Fixed-Budget Differentially Private Best Arm Identification
- URL: http://arxiv.org/abs/2401.09073v1
- Date: Wed, 17 Jan 2024 09:23:25 GMT
- Title: Fixed-Budget Differentially Private Best Arm Identification
- Authors: Zhirui Chen, P. N. Karthik, Yeow Meng Chee, and Vincent Y. F. Tan
- Abstract summary: We study best arm identification (BAI) in linear bandits in the fixed-budget regime under differential privacy constraints.
We derive a minimax lower bound on the error probability, and demonstrate that the lower and the upper bounds decay exponentially in $T$.
- Score: 62.36929749450298
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study best arm identification (BAI) in linear bandits in the fixed-budget
regime under differential privacy constraints, when the arm rewards are
supported on the unit interval. Given a finite budget $T$ and a privacy
parameter $\varepsilon>0$, the goal is to minimise the error probability in
finding the arm with the largest mean after $T$ sampling rounds, subject to the
constraint that the policy of the decision maker satisfies a certain {\em
$\varepsilon$-differential privacy} ($\varepsilon$-DP) constraint. We construct
a policy satisfying the $\varepsilon$-DP constraint (called {\sc DP-BAI}) by
proposing the principle of {\em maximum absolute determinants}, and derive an
upper bound on its error probability. Furthermore, we derive a minimax lower
bound on the error probability, and demonstrate that the lower and the upper
bounds decay exponentially in $T$, with exponents in the two bounds matching
order-wise in (a) the sub-optimality gaps of the arms, (b) $\varepsilon$, and
(c) the problem complexity that is expressible as the sum of two terms, one
characterising the complexity of standard fixed-budget BAI (without privacy
constraints), and the other accounting for the $\varepsilon$-DP constraint.
Additionally, we present some auxiliary results that contribute to the
derivation of the lower bound on the error probability. These results, we
posit, may be of independent interest and could prove instrumental in proving
lower bounds on error probabilities in several other bandit problems. Whereas
prior works provide results for BAI in the fixed-budget regime without privacy
constraints or in the fixed-confidence regime with privacy constraints, our
work fills the gap in the literature by providing the results for BAI in the
fixed-budget regime under the $\varepsilon$-DP constraint.
Related papers
- Best Arm Identification with Minimal Regret [55.831935724659175]
Best arm identification problem elegantly amalgamates regret minimization and BAI.
Agent's goal is to identify the best arm with a prescribed confidence level.
Double KL-UCB algorithm achieves optimality as the confidence level tends to zero.
arXiv Detail & Related papers (2024-09-27T16:46:02Z) - Differentially Private Best-Arm Identification [14.916947598339988]
Best Arm Identification (BAI) problems are progressively used for data-sensitive applications.
Motivated by the data privacy concerns invoked by these applications, we study the problem of BAI with fixed confidence in both the local and central models.
arXiv Detail & Related papers (2024-06-10T16:02:48Z) - On the Complexity of Differentially Private Best-Arm Identification with
Fixed Confidence [16.295693624977563]
We study the problem of Best Arm Identification with fixed confidence under $epsilon$-global Differential Privacy.
Our lower bound suggests the existence of two privacy regimes depending on the privacy budget.
We propose AdaP-TT, an $epsilon$-global DP variant of the Top Two algorithm.
arXiv Detail & Related papers (2023-09-05T13:07:25Z) - Federated Linear Contextual Bandits with User-level Differential Privacy [8.396384861175799]
This paper studies federated linear contextual bandits under the notion of user-level differential privacy (DP)
We first introduce a unified federated bandits framework that can accommodate various definitions of DP in the sequential decision-making setting.
We then formally introduce user-level central DP (CDP) and local DP (LDP) in the federated bandits framework, and investigate the fundamental trade-offs between the learning regrets and the corresponding DP guarantees.
arXiv Detail & Related papers (2023-06-08T15:21:47Z) - When Privacy Meets Partial Information: A Refined Analysis of
Differentially Private Bandits [4.964737844687583]
We study the problem of multi-armed bandits with $epsilon$-global Differential Privacy (DP)
We instantiate $epsilon$-global DP extensions of UCB and KL-UCB algorithms, namely AdaP-UCB and AdaP-KLUCB.
AdaP-KLUCB is the first algorithm that both satisfies $epsilon$-global DP and yields a regret upper bound that matches the problem-dependent lower bound up to multiplicative constants.
arXiv Detail & Related papers (2022-09-06T15:26:24Z) - Batch-Size Independent Regret Bounds for Combinatorial Semi-Bandits with
Probabilistically Triggered Arms or Independent Arms [53.89752069984382]
We study the semi-bandits (CMAB) and focus on reducing the dependency of the batch-size $K$ in the regret bound.
First, for the setting of CMAB with probabilistically triggered arms (CMAB-T), we propose a BCUCB-T algorithm with variance-aware confidence intervals.
Second, for the setting of non-triggering CMAB with independent arms, we propose a SESCB algorithm which leverages on the non-triggering version of the TPVM condition.
arXiv Detail & Related papers (2022-08-31T13:09:39Z) - Connect the Dots: Tighter Discrete Approximations of Privacy Loss
Distributions [49.726408540784334]
Key question in PLD-based accounting is how to approximate any (potentially continuous) PLD with a PLD over any specified discrete support.
We show that our pessimistic estimate is the best possible among all pessimistic estimates.
arXiv Detail & Related papers (2022-07-10T04:25:02Z) - New Lower Bounds for Private Estimation and a Generalized Fingerprinting
Lemma [10.176795938619417]
We prove new lower bounds for statistical estimation tasks under the constraint of $(varepsilon, delta)$-differential privacy.
We show that estimating the Frobenius norm requires $Omega(d2)$ samples, and in spectral norm requires $Omega(d3/2)$ samples, both matching upper bounds up to logarithmic factors.
arXiv Detail & Related papers (2022-05-17T17:55:10Z) - Problem Dependent View on Structured Thresholding Bandit Problems [73.70176003598449]
We investigate the problem dependent regime in the Thresholding Bandit problem (TBP)
The objective of the learner is to output, at the end of a sequential game, the set of arms whose means are above a given threshold.
We provide upper and lower bounds for the probability of error in both the concave and monotone settings.
arXiv Detail & Related papers (2021-06-18T15:01:01Z) - On Lower Bounds for Standard and Robust Gaussian Process Bandit
Optimization [55.937424268654645]
We consider algorithm-independent lower bounds for the problem of black-box optimization of functions having a bounded norm.
We provide a novel proof technique for deriving lower bounds on the regret, with benefits including simplicity, versatility, and an improved dependence on the error probability.
arXiv Detail & Related papers (2020-08-20T03:48:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.