Cascading Bandit under Differential Privacy
- URL: http://arxiv.org/abs/2105.11126v1
- Date: Mon, 24 May 2021 07:19:01 GMT
- Title: Cascading Bandit under Differential Privacy
- Authors: Kun Wang, Jing Dong, Baoxiang Wang, Shuai Li, Shuo Shao
- Abstract summary: We study emphdifferential privacy (DP) and emphlocal differential privacy (LDP) in cascading bandits.
Under DP, we propose an algorithm which guarantees $epsilon$-indistinguishability and a regret of $mathcalO(fraclog3 Tepsilon)$ for an arbitrarily small $xi$.
Under ($epsilon$,$delta$)-LDP, we relax the $K2$ dependence through the tradeoff between privacy budgetepsilon$ and error probability $
- Score: 21.936577816668944
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper studies \emph{differential privacy (DP)} and \emph{local
differential privacy (LDP)} in cascading bandits. Under DP, we propose an
algorithm which guarantees $\epsilon$-indistinguishability and a regret of
$\mathcal{O}((\frac{\log T}{\epsilon})^{1+\xi})$ for an arbitrarily small
$\xi$. This is a significant improvement from the previous work of
$\mathcal{O}(\frac{\log^3 T}{\epsilon})$ regret. Under
($\epsilon$,$\delta$)-LDP, we relax the $K^2$ dependence through the tradeoff
between privacy budget $\epsilon$ and error probability $\delta$, and obtain a
regret of $\mathcal{O}(\frac{K\log (1/\delta) \log T}{\epsilon^2})$, where $K$
is the size of the arm subset. This result holds for both Gaussian mechanism
and Laplace mechanism by analyses on the composition. Our results extend to
combinatorial semi-bandit. We show respective lower bounds for DP and LDP
cascading bandits. Extensive experiments corroborate our theoretic findings.
Related papers
- Fast Rates for Bandit PAC Multiclass Classification [73.17969992976501]
We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted labels are correct.
Our main contribution is in designing a novel learning algorithm for the agnostic $(varepsilon,delta)$PAC version of the problem.
arXiv Detail & Related papers (2024-06-18T08:54:04Z) - Sample-Optimal Locally Private Hypothesis Selection and the Provable
Benefits of Interactivity [8.100854060749212]
We study the problem of hypothesis selection under the constraint of local differential privacy.
We devise an $varepsilon$-locally-differentially-private ($varepsilon$-LDP) algorithm that uses $Thetaleft(fracklog kalpha2min varepsilon2,1 right)$ to guarantee that $d_TV(h,hatf)leq alpha + 9 min_fin mathcalF
arXiv Detail & Related papers (2023-12-09T19:22:10Z) - Near Sample-Optimal Reduction-based Policy Learning for Average Reward
MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP)
We prove an upper bound of $widetilde O(H varepsilon-3 ln frac1delta)$ samples per state-action pair, where $H := sp(h*)$ is the span of bias of any optimal policy, $varepsilon$ is the accuracy and $delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z) - Private Convex Optimization via Exponential Mechanism [16.867534746193833]
We show that modifying the exponential mechanism by adding an $ellcave2$ regularizer to $F(x)$ recovers both the known optimal empirical risk and population loss under $(epsilon,delta)$-DP.
We also show how to implement this mechanism using $widetildeO(n min(d, n)) queries to $f_i(x) for the DP-SCO where $n$ is the number of samples/optimal and $d is the ambient dimension.
arXiv Detail & Related papers (2022-03-01T06:51:03Z) - Faster Rates of Differentially Private Stochastic Convex Optimization [7.93728520583825]
We study the case where the population risk function satisfies the Tysbakov Noise Condition (TNC) with some parameter $theta>1$.
In the second part, we focus on a special case where the population risk function is strongly convex.
arXiv Detail & Related papers (2021-07-31T22:23:39Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors the optimal excess population loss of any $(varepsilon,delta)$-differently private is $sqrtlog(d)/n + sqrtd/varepsilon n.$
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $sqrtlog(d)/n + (log(d)/varepsilon n)2/3.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Fast Dimension Independent Private AdaGrad on Publicly Estimated
Subspaces [24.52697154861819]
We show that noisy AdaGrad achieves a regret comparable to traditional AdaGrad plus a well-controlled term due to noise.
We operate with general convex functions in both constrained and unconstrained minimization.
arXiv Detail & Related papers (2020-08-14T20:46:38Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $gamma in (0,1)$, and an approximation threshold $epsilon > 0$, we provide a model-free algorithm to learn an $epsilon$-optimal policy.
For small enough $epsilon$, we show an improved algorithm with sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - (Locally) Differentially Private Combinatorial Semi-Bandits [26.462686161971803]
We study Combinatorial Semi-Bandits (CSB) that is an extension of classic Multi-Armed Bandits (MAB) under Differential Privacy (DP) and stronger Local Differential Privacy (LDP) setting.
arXiv Detail & Related papers (2020-06-01T04:23:27Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.