Smoothed Analysis with Adaptive Adversaries
- URL: http://arxiv.org/abs/2102.08446v1
- Date: Tue, 16 Feb 2021 20:54:49 GMT
- Title: Smoothed Analysis with Adaptive Adversaries
- Authors: Nika Haghtalab, Tim Roughgarden, Abhishek Shetty
- Abstract summary: We prove novel algorithmic guarantees for several online problems in the smoothed analysis model.
In this model, at each time an adversary chooses an input distribution with density function bounded above by $\tfrac{1}{\sigma}$ times that of the uniform distribution.
Our results hold for *adaptive* adversaries that can choose an input distribution based on the decisions of the algorithm and the realizations of the inputs in the previous time steps.
- Score: 28.940767013349408
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We prove novel algorithmic guarantees for several online problems in the
smoothed analysis model. In this model, at each time an adversary chooses an
input distribution with density function bounded above by $\tfrac{1}{\sigma}$
times that of the uniform distribution; nature then samples an input from this
distribution. Crucially, our results hold for {\em adaptive} adversaries that
can choose an input distribution based on the decisions of the algorithm and
the realizations of the inputs in the previous time steps.
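To make the smoothness condition concrete, here is a minimal sketch (not from the paper) of the most concentrated $\sigma$-smooth distribution on $[0,1]$: uniform on an interval of length $\sigma$, whose density is exactly $1/\sigma$ times the uniform density. The adaptive adversary may pick the interval based on past realizations; the choice of centering it on the previous sample is an illustrative assumption.

```python
import random

def sigma_smooth_sample(sigma: float, center: float, rng: random.Random) -> float:
    """Sample from the most concentrated sigma-smooth distribution on [0, 1]:
    uniform on an interval of length sigma, so its density is 1/sigma times
    the uniform density -- exactly the smoothness bound."""
    # Clamp the interval so it stays inside [0, 1].
    lo = min(max(center - sigma / 2, 0.0), 1.0 - sigma)
    return lo + sigma * rng.random()

rng = random.Random(0)
sigma = 0.1
history = []
for t in range(5):
    # An *adaptive* adversary may choose the interval from past realizations.
    center = history[-1] if history else 0.5
    history.append(sigma_smooth_sample(sigma, center, rng))
```

Because every such distribution is absolutely continuous with bounded density, the adversary can concentrate mass but never place point masses, which is what the paper's reduction to oblivious adversaries exploits.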
This paper presents a general technique for proving smoothed algorithmic
guarantees against adaptive adversaries, in effect reducing the setting of
adaptive adversaries to the simpler case of oblivious adversaries. We apply
this technique to prove strong smoothed guarantees for three problems:
-Online learning: We consider the online prediction problem, where instances
are generated from an adaptive sequence of $\sigma$-smooth distributions and
the hypothesis class has VC dimension $d$. We bound the regret by
$\tilde{O}\big(\sqrt{T d\ln(1/\sigma)} + d\sqrt{\ln(T/\sigma)}\big)$. This
answers open questions of [RST11,Hag18].
-Online discrepancy minimization: We consider the online Koml\'os problem,
where the input is generated from an adaptive sequence of $\sigma$-smooth and
isotropic distributions on the $\ell_2$ unit ball. We bound the $\ell_\infty$
norm of the discrepancy vector by $\tilde{O}\big(\ln^2\!\big(
\frac{nT}{\sigma}\big) \big)$.
-Dispersion in online optimization: We consider online optimization of
piecewise Lipschitz functions where functions with $\ell$ discontinuities are
chosen by a smoothed adaptive adversary and show that the resulting sequence is
$\big( {\sigma}/{\sqrt{T\ell}}, \tilde O\big(\sqrt{T\ell}
\big)\big)$-dispersed. This matches the parameters of [BDV18] for oblivious
adversaries, up to log factors.
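As a toy illustration of the online Komlós setting above, the following sketch runs a simple greedy sign rule (a baseline for exposition, not the paper's algorithm) on unit vectors in the $\ell_2$ ball, tracking the $\ell_\infty$ norm of the discrepancy vector. The Gaussian-then-normalize input is a stand-in for a $\sigma$-smooth isotropic source.

```python
import random

def greedy_sign(disc, v):
    """Choose the sign s in {+1, -1} minimizing the l_inf norm of disc + s*v.
    A simple greedy baseline, not the algorithm from the paper."""
    plus = max(abs(d + x) for d, x in zip(disc, v))
    minus = max(abs(d - x) for d, x in zip(disc, v))
    return 1 if plus <= minus else -1

rng = random.Random(1)
n, T = 4, 200
disc = [0.0] * n  # running discrepancy vector
for _ in range(T):
    # Stand-in for a smooth isotropic input: random direction on the l2 unit sphere.
    v = [rng.gauss(0, 1) for _ in range(n)]
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    s = greedy_sign(disc, v)
    disc = [d + s * x for d, x in zip(disc, v)]
max_disc = max(abs(d) for d in disc)
```

Since each input has $\ell_2$ norm 1, each step moves every coordinate by at most 1, so the discrepancy is trivially at most $T$; the paper's result is that against smoothed adaptive adversaries it can be kept polylogarithmic in $nT/\sigma$.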
Related papers
- Mirror Descent Algorithms with Nearly Dimension-Independent Rates for
Differentially-Private Stochastic Saddle-Point Problems [6.431793114484429]
We propose mirror descent algorithms for differentially-private stochastic saddle-point problems in the polyhedral setting.
We show that our algorithms attain a rate of $\sqrt{\log(d)}/\sqrt{n} + \log(d)/[n\varepsilon]^{2/5}$ with constant success probability.
arXiv Detail & Related papers (2024-03-05T12:28:00Z) - Asymptotically Optimal Pure Exploration for Infinite-Armed Bandits [4.811176167998627]
We study pure exploration with infinitely many bandit arms generated i.i.d. from an unknown distribution.
Our goal is to efficiently select a single high-quality arm whose average reward is, with probability $1-\delta$, within $\varepsilon$ of being among the top $\eta$-fraction of arms.
arXiv Detail & Related papers (2023-06-03T04:00:47Z) - Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs [72.40181882916089]
We show that our algorithm achieves an $\tilde{O}\big((d+\log(|\mathcal{S}|^2|\mathcal{A}|))\sqrt{K}\big)$ regret with full-information feedback, where $d$ is the dimension of a known feature mapping linearly parametrizing the unknown transition kernel of the MDP, $K$ is the number of episodes, and $|\mathcal{S}|$ and $|\mathcal{A}|$ are the cardinalities of the state and action spaces.
arXiv Detail & Related papers (2023-05-15T05:37:32Z) - Optimistic Online Mirror Descent for Bridging Stochastic and Adversarial Online Convex Optimization [40.19608203064051]
We investigate theoretical guarantees of optimistic online mirror descent (OMD) for the SEA model.
For strongly convex and smooth functions, we establish an $\mathcal{O}\big(\min\{\log(\sigma_{1:T}^2+\Sigma_{1:T}^2),\,(\sigma_{\max}^2+\Sigma_{\max}^2)\log T\}\big)$ bound, better than the previous $\mathcal{O}\big((\sigma_{\max}^2+\Sigma_{\max}^2)\log T\big)$ bound.
We establish the first dynamic regret guarantee for the model with convex and smooth functions, which is more favorable than static regret bounds in non-stationary scenarios.
arXiv Detail & Related papers (2023-02-09T10:42:11Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed-confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - (Nearly) Optimal Private Linear Regression via Adaptive Clipping [22.639650869444395]
We study the problem of differentially private linear regression where each data point is sampled from a fixed sub-Gaussian style distribution.
We propose and analyze a one-pass mini-batch gradient descent method (DP-AMBSSGD) where points in each iteration are sampled without replacement.
arXiv Detail & Related papers (2022-07-11T08:04:46Z) - Computationally Efficient Horizon-Free Reinforcement Learning for Linear
Mixture MDPs [111.75736569611159]
We propose the first computationally efficient horizon-free algorithm for linear mixture MDPs.
Our algorithm adapts a weighted least square estimator for the unknown transitional dynamic.
This also improves upon the best-known algorithms in this setting when the $\sigma_k^2$'s are known.
arXiv Detail & Related papers (2022-05-23T17:59:18Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e., the learner has access only to a zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Variance-Aware Confidence Set: Variance-Dependent Bound for Linear
Bandits and Horizon-Free Bound for Linear Mixture MDP [76.94328400919836]
We show how to construct variance-aware confidence sets for linear bandits and linear mixture Markov Decision Processes (MDPs).
For linear bandits, we obtain an $\widetilde{O}\big(\mathrm{poly}(d)\sqrt{1+\sum_{i=1}^{K}\sigma_i^2}\big)$ regret bound, where $d$ is the feature dimension.
For linear mixture MDPs, we obtain an $\widetilde{O}\big(\mathrm{poly}(d)\sqrt{K}\big)$ regret bound.
arXiv Detail & Related papers (2021-01-29T18:57:52Z) - Explicit Best Arm Identification in Linear Bandits Using No-Regret
Learners [17.224805430291177]
We study the problem of best arm identification in linearly parameterised multi-armed bandits.
We propose an explicitly implementable and provably order-optimal sample-complexity algorithm to solve this problem.
arXiv Detail & Related papers (2020-06-13T05:00:01Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.