Logistic Regression Regret: What's the Catch?
- URL: http://arxiv.org/abs/2002.02950v2
- Date: Wed, 19 Feb 2020 15:27:53 GMT
- Title: Logistic Regression Regret: What's the Catch?
- Authors: Gil I. Shamir
- Abstract summary: We derive lower bounds with logarithmic regret under $L_\infty$ constraints on the parameters.
For $L_2$ constraints, it is shown that for large enough $d$, the regret remains linear in $d$ but no longer logarithmic in $T$.
- Score: 3.7311680121118345
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We address the problem of the achievable regret rates with online logistic
regression. We derive lower bounds with logarithmic regret under $L_1$, $L_2$,
and $L_\infty$ constraints on the parameter values. The bounds are dominated by
$d/2 \log T$, where $T$ is the horizon and $d$ is the dimensionality of the
parameter space. We show their achievability for $d=o(T^{1/3})$ in all these
cases with Bayesian methods, which achieve them up to a $d/2 \log d$ term.
Interesting different behaviors are shown for larger dimensionality.
Specifically, on the negative side, if $d = \Omega(\sqrt{T})$, any algorithm is
guaranteed regret of $\Omega(d \log T)$ (greater than $\Omega(\sqrt{T})$) under
$L_\infty$ constraints on the parameters (and the example features). On the
positive side, under $L_1$ constraints on the parameters, there exist
algorithms that can achieve regret that is sub-linear in $d$ for the
asymptotically larger values of $d$. For $L_2$ constraints, it is shown that
for large enough $d$, the regret remains linear in $d$ but no longer
logarithmic in $T$. Adapting the redundancy-capacity theorem from information
theory, we demonstrate a principled methodology based on grids of parameters to
derive lower bounds. Grids are also utilized to derive some upper bounds. Our
results strengthen results by Kakade and Ng (2005) and Foster et al. (2018) for
upper bounds for this problem, introduce novel lower bounds, and adapt a
methodology that can be used to obtain such bounds for other related problems.
They also give a novel characterization of the asymptotic behavior when the
dimension of the parameter space is allowed to grow with $T$. They additionally
establish connections to the information theory literature, demonstrating that
the actual regret for logistic regression depends on the richness of the
parameter class, where even within this problem, richer classes lead to greater
regret.
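To make the setting concrete, below is a minimal illustrative sketch (not the paper's construction) of the quantity being bounded: the sequential log-loss regret of an online logistic regression predictor, here a simple Bayesian mixture over a finite grid of parameters, measured against the best fixed parameter in hindsight and compared to the $d/2 \log T$ benchmark. The dimension ($d = 1$), the grid resolution, the feature distribution, and the $L_\infty$ bound are all hypothetical choices made only for this demo.

```python
# Minimal sketch of online logistic regression regret (illustrative only,
# not the paper's algorithm): a Bayesian mixture over a finite parameter
# grid predicts each label sequentially, and its cumulative log loss is
# compared with the best fixed grid parameter in hindsight.
import numpy as np

rng = np.random.default_rng(0)
T, B = 1000, 1.0                      # horizon and (hypothetical) L_infty bound
grid = np.linspace(-B, B, 41)         # finite grid of candidate parameters (d = 1)
log_weights = np.full(len(grid), -np.log(len(grid)))  # uniform prior over the grid

x = rng.uniform(-1.0, 1.0, size=T)    # bounded example features
theta_star = 0.7                      # data-generating parameter for the demo
y = (rng.uniform(size=T) < 1.0 / (1.0 + np.exp(-theta_star * x))).astype(float)

def log_loss(theta, xt, yt):
    """Logistic log loss -log p(yt | xt, theta) for a label yt in {0, 1}."""
    z = theta * xt
    return np.log1p(np.exp(-z)) if yt == 1.0 else np.log1p(np.exp(z))

mixture_loss = 0.0
for t in range(T):
    # Posterior over the grid, computed from the weights before seeing y_t.
    post = np.exp(log_weights - np.max(log_weights))
    post /= post.sum()
    # Predictive probability of y_t = 1 under the mixture.
    p1 = float(np.sum(post * np.array([np.exp(-log_loss(th, x[t], 1.0)) for th in grid])))
    mixture_loss += -np.log(p1) if y[t] == 1.0 else -np.log(1.0 - p1)
    # Bayesian update: multiply each grid point's weight by its likelihood on (x_t, y_t).
    log_weights = log_weights - np.array([log_loss(th, x[t], y[t]) for th in grid])

# Comparator: best fixed parameter in hindsight, restricted to the grid for simplicity.
best_loss = min(sum(log_loss(th, x[t], y[t]) for t in range(T)) for th in grid)

regret = mixture_loss - best_loss
print(f"regret = {regret:.2f}   vs   (d/2) log T = {0.5 * np.log(T):.2f}")
```

Against the grid comparator, the mixture's regret is at most the log of the grid size by the standard Bayes mixture bound; the paper's analysis quantifies the cost of competing with the full constrained parameter set, which is where the $d/2 \log T$ behavior and the grid-based lower bounds enter.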
Related papers
- Reinforcement Learning and Regret Bounds for Admission Control [0.08192907805418585]
The expected regret of any reinforcement learning algorithm is lower bounded by $\Omega\left(\sqrt{DXAT}\right)$ for undiscounted returns.
We propose an algorithm inspired by UCRL2, and use the structure of the problem to upper bound the expected total regret by $O(S\log T + \sqrt{mT \log T})$ in the finite server case.
arXiv Detail & Related papers (2024-06-07T09:09:14Z)
- Batched Stochastic Bandit for Nondegenerate Functions [8.015503209312786]
This paper studies batched bandit learning problems for nondegenerate functions.
We introduce an algorithm that solves the batched bandit problem for nondegenerate functions near-optimally.
arXiv Detail & Related papers (2024-05-09T12:50:16Z)
- LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits [38.41164102066483]
This study considers the linear contextual bandit problem with independent and identically distributed contexts.
Our proposed algorithm is based on Follow-The-Regularized-Leader with the Tsallis entropy and is referred to as the $\alpha$-Linear-Contextual (LC)-Tsallis-INF.
arXiv Detail & Related papers (2024-03-05T18:59:47Z)
- Optimal Sketching Bounds for Sparse Linear Regression [116.30196615349226]
We study oblivious sketching for $k$-sparse linear regression under various loss functions such as an $\ell_p$ norm, or from a broad class of hinge-like loss functions.
We show that for sparse $\ell_2$ regression, there is an oblivious distribution over sketches with $\Theta(k\log(d/k)/\varepsilon^2)$ rows, which is tight up to a constant factor.
We also show that sketching $O(\mu^2 k\log(\mu n d/\varepsilon)/\varepsilon^2)$ ...
arXiv Detail & Related papers (2023-04-05T07:24:19Z)
- Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes [62.90204655228324]
We study regret minimization for reinforcement learning (RL) in Latent Markov Decision Processes (LMDPs) with context in hindsight.
We design a novel model-based algorithmic framework which can be instantiated with both a model-optimistic and a value-optimistic solver.
arXiv Detail & Related papers (2022-10-20T21:32:01Z)
- Minimal Expected Regret in Linear Quadratic Control [79.81807680370677]
We devise an online learning algorithm and provide guarantees on its expected regret.
This regret at time $T$ is upper bounded (i) by $\widetilde{O}((d_u+d_x)\sqrt{d_x T})$ when $A$ and $B$ are unknown.
arXiv Detail & Related papers (2021-09-29T14:07:21Z)
- Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/\varepsilon n$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/\varepsilon n)^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z)
- $Q$-learning with Logarithmic Regret [60.24952657636464]
We prove that an optimistic $Q$-learning enjoys a $\mathcal{O}\left(\frac{SA\cdot \mathrm{poly}\left(H\right)}{\Delta_{\min}}\log\left(SAT\right)\right)$ cumulative regret bound, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $T$ is the total number of steps, and $\Delta_{\min}$ is the minimum sub-optimality gap.
arXiv Detail & Related papers (2020-06-16T13:01:33Z)
- Naive Exploration is Optimal for Online LQR [49.681825576239355]
We show that the optimal regret scales as $\widetilde{\Theta}(\sqrt{d_{\mathbf{u}}^2 d_{\mathbf{x}} T})$, where $T$ is the number of time steps, $d_{\mathbf{u}}$ is the dimension of the input space, and $d_{\mathbf{x}}$ is the dimension of the system state.
Our lower bounds rule out the possibility of a $\mathrm{poly}(\log T)$-regret algorithm, which had been ...
arXiv Detail & Related papers (2020-01-27T03:44:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.