Tracking the Best Expert Privately
- URL: http://arxiv.org/abs/2503.09889v1
- Date: Wed, 12 Mar 2025 23:17:03 GMT
- Title: Tracking the Best Expert Privately
- Authors: Aadirupa Saha, Vinod Raman, Hilal Asi
- Abstract summary: We design differentially private algorithms for the problem of prediction with expert advice under dynamic regret, also known as tracking the best expert. Our work addresses three natural types of adversaries, stochastic with shifting distributions, oblivious, and adaptive, and designs algorithms with sub-linear regret for all three cases.
- Score: 31.03746573619407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We design differentially private algorithms for the problem of prediction with expert advice under dynamic regret, also known as tracking the best expert. Our work addresses three natural types of adversaries, stochastic with shifting distributions, oblivious, and adaptive, and designs algorithms with sub-linear regret for all three cases. In particular, under a shifting stochastic adversary where the distribution may shift $S$ times, we provide an $\epsilon$-differentially private algorithm whose expected dynamic regret is at most $O\left( \sqrt{S T \log (NT)} + \frac{S \log (NT)}{\epsilon}\right)$, where $T$ and $N$ are the time horizon and number of experts, respectively. For oblivious adversaries, we give a reduction from dynamic regret minimization to static regret minimization, resulting in an upper bound of $O\left(\sqrt{S T \log(NT)} + \frac{S T^{1/3}\log(T/\delta) \log(NT)}{\epsilon^{2/3}}\right)$ on the expected dynamic regret, where $S$ now denotes the allowable number of switches of the best expert. Finally, similar to static regret, we establish a fundamental separation between oblivious and adaptive adversaries for the dynamic setting: while our algorithms show that sub-linear regret is achievable for oblivious adversaries in the high-privacy regime $\epsilon \le \sqrt{S/T}$, we show that any $(\epsilon, \delta)$-differentially private algorithm must suffer linear dynamic regret under adaptive adversaries for $\epsilon \le \sqrt{S/T}$. To complement this lower bound, we give an $\epsilon$-differentially private algorithm that attains sub-linear dynamic regret under adaptive adversaries whenever $\epsilon \gg \sqrt{S/T}$.
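To make the setup above concrete, here is a minimal, illustrative sketch (not the paper's algorithm) of one standard differentially private primitive such methods can build on: report-noisy-min over cumulative expert losses, restarted on a fixed block schedule of roughly $T/(S+1)$ rounds to cope with distribution shifts. The function names, loss range $[0,1]$, and restart schedule are assumptions for illustration only; the sketch ignores privacy composition across blocks and does not attain the regret bounds stated in the abstract.

```python
import numpy as np


def noisy_min(cum_losses: np.ndarray, eps: float, rng: np.random.Generator) -> int:
    """Report-noisy-min: pick the expert with the smallest Laplace-perturbed
    cumulative loss. With per-round losses in [0, 1], each count has
    sensitivity 1, so Laplace noise of scale 2/eps gives eps-DP per call."""
    noise = rng.laplace(scale=2.0 / eps, size=cum_losses.shape)
    return int(np.argmin(cum_losses + noise))


def dp_track_best_expert(losses: np.ndarray, num_shifts: int, eps: float, seed: int = 0):
    """Toy tracker (illustrative only): reset the cumulative-loss counters
    every ~T/(S+1) rounds and re-select an expert via noisy_min once per block.
    A complete algorithm would also account for composition across blocks.

    losses: (T, N) array of per-round expert losses in [0, 1].
    Returns the sequence of chosen experts and the total loss incurred.
    """
    rng = np.random.default_rng(seed)
    T, N = losses.shape
    block = max(1, T // (num_shifts + 1))  # crude restart schedule
    cum = np.zeros(N)
    chosen, total = [], 0.0
    expert = int(rng.integers(N))  # arbitrary first pick before any data
    for t in range(T):
        if t > 0 and t % block == 0:
            expert = noisy_min(cum, eps, rng)  # one DP selection per block
            cum[:] = 0.0                       # forget the previous phase
        total += losses[t, expert]
        chosen.append(expert)
        cum += losses[t]
    return chosen, total


if __name__ == "__main__":
    # Two phases, each with a different best expert.
    rng = np.random.default_rng(1)
    T, N = 2000, 5
    phase1 = rng.uniform(size=(T // 2, N)); phase1[:, 0] *= 0.2  # expert 0 best
    phase2 = rng.uniform(size=(T // 2, N)); phase2[:, 3] *= 0.2  # expert 3 best
    losses = np.vstack([phase1, phase2])
    picks, total = dp_track_best_expert(losses, num_shifts=1, eps=1.0)
    print("total loss:", round(total, 1),
          "best-in-hindsight per phase ~", round(phase1[:, 0].sum() + phase2[:, 3].sum(), 1))
```

Report-noisy-min is $\epsilon$-DP per selection; over the $S+1$ blocks this toy scheme only gives roughly $(S+1)\epsilon$-DP by basic composition, which is one reason the paper's algorithms require more careful constructions to reach the bounds quoted above.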
Related papers
- Private Online Learning via Lazy Algorithms [52.772510023828744]
We study the problem of private online learning, specifically online prediction from experts (OPE) and online convex optimization (OCO).
We propose a new transformation that transforms lazy online learning algorithms into private algorithms.
arXiv Detail & Related papers (2024-06-05T20:43:05Z) - Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime [74.52487417350221]
We consider online learning problems in the realizable setting, where there is a zero-loss solution.
We propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds.
arXiv Detail & Related papers (2023-02-27T21:19:24Z) - Private Online Prediction from Experts: Separations and Faster Rates [74.52487417350221]
Online prediction from experts is a fundamental problem in machine learning and several works have studied this problem under privacy constraints.
We propose and analyze new algorithms for this problem that improve over the regret bounds of the best existing algorithms for non-adaptive adversaries.
arXiv Detail & Related papers (2022-10-24T18:40:19Z) - Differentially Private Stochastic Linear Bandits: (Almost) for Free [17.711701190680742]
In the central model, we achieve almost the same regret as the optimal non-private algorithms, which means we get privacy for free.
In the shuffled model, we also achieve regret of $\tilde{O}(\sqrt{T}+\frac{1}{\epsilon})$ for small $\epsilon$, as in the central case, while the best previously known algorithm suffers a regret of $\tilde{O}(\frac{1}{\epsilon}T^{3/5})$.
arXiv Detail & Related papers (2022-07-07T17:20:57Z) - Optimal Dynamic Regret in Proper Online Learning with Strongly Convex Losses and Beyond [23.91519151164528]
We show that in a proper learning setup, Strongly Adaptive algorithms can achieve the near optimal dynamic regret.
We also derive near optimal dynamic regret rates for the special case of proper online learning with exp-concave losses.
arXiv Detail & Related papers (2022-01-21T22:08:07Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Near-Optimal Algorithms for Differentially Private Online Learning in a Stochastic Environment [7.4288915613206505]
We study differentially private online learning problems in a stochastic environment under both bandit and full information feedback.
For differentially private bandits, we propose both UCB and Thompson Sampling-based algorithms that are anytime and achieve the optimal $O\left(\sum_{j:\Delta_j>0} \frac{\ln(T)}{\min\left\{\Delta_j, \epsilon\right\}}\right)$ regret, matching the minimax lower bound.
For the same differentially private full information setting, we also present an $\epsilon$-differentially private algorithm.
arXiv Detail & Related papers (2021-02-16T02:48:16Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e., the learner has access only to a zeroth-order oracle) where the cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Taking a hint: How to leverage loss predictors in contextual bandits? [63.546913998407405]
We study learning in contextual bandits with the help of loss predictors.
We show that the optimal regret is $\mathcal{O}(\min\{\sqrt{T}, \sqrt{\mathcal{E}}\,T^{\frac{1}{3}}\})$ when $\mathcal{E}$ is known.
arXiv Detail & Related papers (2020-03-04T07:36:38Z)