On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls
- URL: http://arxiv.org/abs/2506.19752v1
- Date: Tue, 24 Jun 2025 16:06:56 GMT
- Title: On the necessity of adaptive regularisation: Optimal anytime online learning on $\boldsymbol{\ell_p}$-balls
- Authors: Emmeran Johnson, David Martínez-Rubio, Ciara Pike-Burke, Patrick Rebeschini
- Abstract summary: We study online convex optimization on $\ell_p$-balls in $\mathbb{R}^d$ for $p > 2$. While always sub-linear, the optimal regret exhibits a shift between the high-dimensional setting ($d > T$) and the low-dimensional setting ($d \leq T$).
- Score: 17.599348264171862
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study online convex optimization on $\ell_p$-balls in $\mathbb{R}^d$ for $p > 2$. While always sub-linear, the optimal regret exhibits a shift between the high-dimensional setting ($d > T$), where the dimension $d$ exceeds the time horizon $T$, and the low-dimensional setting ($d \leq T$). We show that Follow-the-Regularised-Leader (FTRL) with time-varying regularisation that is adaptive to the dimension regime is anytime optimal for all dimension regimes. Motivated by this, we ask whether it is possible to obtain anytime optimality of FTRL with fixed, non-adaptive regularisation. Our main result establishes that for separable regularisers, adaptivity in the regulariser is necessary: any fixed regulariser will be sub-optimal in one of the two dimension regimes. Finally, we provide lower bounds which rule out sub-linear regret bounds for the linear bandit problem in sufficiently high dimension for all $\ell_p$-balls with $p \geq 1$.
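To make the FTRL scheme discussed in the abstract concrete, here is a minimal sketch of FTRL with linear losses and a time-varying quadratic regulariser, run on the $\ell_2$-ball (where Euclidean projection is a simple rescaling). This is an illustration only: the paper's setting concerns $\ell_p$-balls with $p > 2$ and regularisers adaptive to the dimension regime, neither of which this sketch implements; the $1/\sqrt{t}$ schedule is a standard generic choice, not the paper's construction.

```python
import numpy as np

def ftrl_l2_ball(gradients, radius=1.0):
    """Sketch of FTRL for online linear optimization on the l2-ball.

    Losses are linear, loss_t(x) = <g_t, x>, and the regulariser is
    (1 / eta_t) * ||x||^2 / 2 with the time-varying schedule
    eta_t = 1 / sqrt(t). Returns the list of iterates x_1, ..., x_T.
    """
    d = gradients[0].shape[0]
    G = np.zeros(d)                 # running sum of observed gradients
    iterates = []
    for t, g in enumerate(gradients, start=1):
        eta_t = 1.0 / np.sqrt(t)    # time-varying regularisation strength
        x = -eta_t * G              # unconstrained FTRL minimiser
        norm = np.linalg.norm(x)
        if norm > radius:           # Euclidean projection onto the l2-ball
            x = x * (radius / norm)
        iterates.append(x)
        G += g                      # reveal g_t after playing x_t
    return iterates
```

By construction every iterate stays inside the ball, and the shrinking `eta_t` trades off responsiveness to new gradients against stability of the iterates.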
Related papers
- Regularized Online RLHF with Generalized Bilinear Preferences [68.44113000390544]
We consider the problem of contextual online RLHF with general preferences. We adopt the Generalized Bilinear Preference Model to capture preferences via low-rank, skew-symmetric matrices. We prove that the dual gap of the greedy policy is bounded by the square of the estimation error.
arXiv Detail & Related papers (2026-02-26T15:27:53Z) - Minimax Adaptive Online Nonparametric Regression over Besov Spaces [10.138723409205497]
We study online adversarial regression with convex losses against a rich class of continuous yet highly irregular prediction rules. We introduce an adaptive wavelet-based algorithm that performs sequential prediction without prior knowledge of $(s,p,q)$. We also design a locally adaptive extension capable of dynamically tracking spatially inhomogeneous smoothness.
arXiv Detail & Related papers (2025-05-26T09:23:11Z) - No-Regret Linear Bandits under Gap-Adjusted Misspecification [38.592043705502725]
Existing work on linear bandits usually relies on a uniform misspecification parameter $\epsilon$ that measures the sup-norm error of the best linear approximation. We propose a more natural model of misspecification which only requires the approximation error at each input $x$ to be proportional to the suboptimality gap at $x$. We show that the classical LinUCB algorithm -- designed for the realizable case -- is automatically robust against such $\rho$-gap-adjusted misspecification.
arXiv Detail & Related papers (2025-01-09T16:44:53Z) - Non-Euclidean High-Order Smooth Convex Optimization [9.940728137241214]
We develop algorithms for the optimization of convex objectives that have Hölder continuous $q$-th derivatives. Our algorithms work for general norms under mild conditions, including the $\ell_p$ settings for $1 \leq p \leq \infty$.
arXiv Detail & Related papers (2024-11-13T19:22:34Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Improved Dynamic Regret for Online Frank-Wolfe [54.690867216880356]
We investigate the dynamic regret of online Frank-Wolfe (OFW), which is an efficient projection-free algorithm for online convex optimization.
In this paper, we derive improved dynamic regret bounds for OFW by extending the fast convergence rates of FW from offline optimization to online optimization.
arXiv Detail & Related papers (2023-02-11T07:19:51Z) - Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell_2$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes (MDPs) in the fixed confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Private Convex Optimization in General Norms [23.166642097170545]
We propose a new framework for differentially private optimization of convex functions which are Lipschitz in an arbitrary norm $\|\cdot\|$.
We show that this mechanism satisfies differential privacy and solves both DP-ERM (empirical risk minimization) and DP-SCO (stochastic convex optimization).
Our framework is the first to apply to private convex optimization in general normed spaces, and directly recovers non-private SCO rates achieved by mirror descent.
arXiv Detail & Related papers (2022-07-18T02:02:22Z) - A Unified Analysis Method for Online Optimization in Normed Vector Space [3.615256443481602]
We present a unified analysis method that relies on the generalized cosine rule for online optimization in normed vector space.
We extend online convex optimization to online monotone optimization by replacing the losses of the online game with monotone operators.
arXiv Detail & Related papers (2021-12-22T18:48:19Z) - Non-Euclidean Differentially Private Stochastic Convex Optimization [15.302167005107135]
We show that noisy stochastic gradient descent (SGD) algorithms attain the optimal excess risk in low-dimensional regimes.
Our work draws upon concepts from the geometry of normed spaces, such as the notions of regularity, uniform convexity, and uniform smoothness.
arXiv Detail & Related papers (2021-03-01T19:48:44Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is $L_4$-$L_2$ hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Naive Exploration is Optimal for Online LQR [49.681825576239355]
We show that the optimal regret scales as $\widetilde{\Theta}(\sqrt{d_{\mathbf{u}}^2 d_{\mathbf{x}} T})$, where $T$ is the number of time steps, $d_{\mathbf{u}}$ is the dimension of the input space, and $d_{\mathbf{x}}$ is the dimension of the system state.
Our lower bounds rule out the possibility of a $\mathrm{poly}(\log T)$-regret algorithm, which had been
arXiv Detail & Related papers (2020-01-27T03:44:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.