Exponential Weights Algorithms for Selective Learning
- URL: http://arxiv.org/abs/2106.15662v1
- Date: Tue, 29 Jun 2021 18:14:01 GMT
- Title: Exponential Weights Algorithms for Selective Learning
- Authors: Mingda Qiao, Gregory Valiant
- Abstract summary: We study the selective learning problem introduced by Qiao and Valiant (2019), in which the learner observes $n$ labeled data points one at a time.
At a time of its choosing, the learner selects a window length $w$ and a model $\hat\ell$ from the model class $\mathcal{L}$, and labels the next $w$ data points using $\hat\ell$. The excess risk incurred by the learner is defined as the difference between the average loss of $\hat\ell$ over those $w$ data points and the smallest possible average loss among all models in $\mathcal{L}$ over the same points.
We also study a more restrictive family of learning algorithms that are bounded-recall in the sense that when a prediction window of length $w$ is chosen, the learner's decision only depends on the most recent $w$ data points.
- Score: 39.334633118161285
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the selective learning problem introduced by Qiao and Valiant
(2019), in which the learner observes $n$ labeled data points one at a time. At
a time of its choosing, the learner selects a window length $w$ and a model
$\hat\ell$ from the model class $\mathcal{L}$, and then labels the next $w$
data points using $\hat\ell$. The excess risk incurred by the learner is
defined as the difference between the average loss of $\hat\ell$ over those $w$
data points and the smallest possible average loss among all models in
$\mathcal{L}$ over those $w$ data points.
We give an improved algorithm, termed the hybrid exponential weights
algorithm, that achieves an expected excess risk of $O((\log\log|\mathcal{L}| +
\log\log n)/\log n)$. This result gives a doubly exponential improvement in the
dependence on $|\mathcal{L}|$ over the best known bound of
$O(\sqrt{|\mathcal{L}|/\log n})$. We complement the positive result with an
almost matching lower bound, which suggests the worst-case optimality of the
algorithm.
We also study a more restrictive family of learning algorithms that are
bounded-recall in the sense that when a prediction window of length $w$ is
chosen, the learner's decision only depends on the most recent $w$ data points.
We analyze an exponential weights variant of the ERM algorithm in Qiao and
Valiant (2019). This new algorithm achieves an expected excess risk of
$O(\sqrt{\log |\mathcal{L}|/\log n})$, which is shown to be nearly optimal
among all bounded-recall learners. Our analysis builds on a generalized version
of the selective mean prediction problem in Drucker (2013); Qiao and Valiant
(2019), which may be of independent interest.
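To make the protocol concrete, below is a minimal Python sketch of a selective learner that picks its model by exponential weights over the class $\mathcal{L}$. This is an illustration under simplifying assumptions (0/1 loss, window lengths of the form $n/2^k$, and hypothetical names such as `exp_weights_selective_learner`); it mirrors the bounded-recall, exponential-weights flavor described in the abstract, not the hybrid exponential weights algorithm itself.

```python
import math
import random

def average_loss(model, points):
    """Mean 0/1 loss of a predictor over (x, y) pairs."""
    return sum(model(x) != y for x, y in points) / len(points)

def exp_weights_selective_learner(stream, model_class, eta=1.0):
    """Toy bounded-recall selective learner (illustrative only).

    Draws a window length w at a random dyadic scale, scores every model in
    the class by its average loss on the w most recent points, samples a
    model with probability proportional to exp(-eta * w * loss), and labels
    the next w points with it. Returns the realized excess risk.
    """
    n = len(stream)
    k = random.randrange(1, max(2, int(math.log2(n))))
    w = n // 2 ** k                        # window length n / 2^k
    t = random.randrange(w, n - w + 1)     # time at which the learner commits
    recent = stream[t - w:t]               # bounded recall: only the last w points
    losses = [w * average_loss(m, recent) for m in model_class]
    base = min(losses)                     # shift exponents for numerical stability
    weights = [math.exp(-eta * (lv - base)) for lv in losses]
    chosen = random.choices(model_class, weights=weights)[0]
    window = stream[t:t + w]               # the next w points, labeled by `chosen`
    best = min(average_loss(m, window) for m in model_class)
    return average_loss(chosen, window) - best

# Example: threshold classifiers on 1-D data (a hypothetical toy model class).
models = [lambda x, c=c: x > c for c in (0.25, 0.5, 0.75)]
data = [(random.random(), random.random() > 0.5) for _ in range(1024)]
print(exp_weights_selective_learner(data, models))
```

Sampling a model with probability proportional to $\exp(-\eta \cdot \text{loss})$, rather than deterministically returning the empirical minimizer, is what distinguishes this sketch from a pure ERM learner; the paper's hybrid algorithm, which attains the $O((\log\log|\mathcal{L}| + \log\log n)/\log n)$ bound, is more involved than this bounded-recall variant.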
Related papers
- Iterative thresholding for non-linear learning in the strong $\varepsilon$-contamination model [3.309767076331365]
We derive approximation bounds for learning single neuron models using thresholded gradient descent.
We also study the linear regression problem, where $\sigma(\mathbf{x}) = \mathbf{x}$.
arXiv Detail & Related papers (2024-09-05T16:59:56Z)
- Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method [10.742859956268655]
This paper aims to recover the intrinsic model parameters given the leverage scores gradient.
We specifically scrutinize the inversion of the leverage score gradient, denoted as $g(x)$.
arXiv Detail & Related papers (2024-08-21T01:39:42Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Nearly Minimax Optimal Reinforcement Learning with Linear Function Approximation [25.60689712525918]
We study reinforcement learning with linear function approximation where the transition probability and reward functions are linear.
We propose a novel, computationally efficient algorithm, LSVI-UCB$^+$, which achieves an $\widetilde{O}(Hd\sqrt{T})$ regret bound, where $H$ is the episode length, $d$ is the feature dimension, and $T$ is the number of steps.
arXiv Detail & Related papers (2022-06-23T06:04:21Z)
- Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z)
- Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes [91.38793800392108]
We study reinforcement learning with linear function approximation where the underlying transition probability kernel of the Markov decision process (MDP) is a linear mixture model.
We propose a new, computationally efficient algorithm with linear function approximation named $\text{UCRL-VTR}^{+}$ for the aforementioned linear mixture MDPs.
To the best of our knowledge, these are the first computationally efficient, nearly minimax optimal algorithms for RL with linear function approximation.
arXiv Detail & Related papers (2020-12-15T18:56:46Z)
- Active Local Learning [22.823171570933397]
We consider active local learning: given a query point $x$, and active access to an unlabeled training set $S$, output the prediction $h(x)$ of a near-optimal $h \in H$.
In particular, the number of label queries should be independent of the complexity of $H$, and the function $h$ should be well-defined, independent of $x$.
This immediately also implies an algorithm for distance estimation: estimating the value $\mathrm{opt}(H)$ from many fewer labels than needed to actually learn a near-optimal $h \in H$.
arXiv Detail & Related papers (2020-08-31T05:39:35Z)
- Adapting $k$-means algorithms for outliers [1.9290392443571387]
We show how to adapt several simple sampling-based algorithms for the $k$-means problem to the setting with outliers.
Our algorithms output $(1+\varepsilon)z$ outliers while achieving an $O(1/\varepsilon)$-approximation to the objective function.
arXiv Detail & Related papers (2020-07-02T14:14:33Z)
- $Q$-learning with Logarithmic Regret [60.24952657636464]
We prove that an optimistic $Q$-learning algorithm enjoys a $\mathcal{O}\left(\frac{SA \cdot \mathrm{poly}(H)}{\Delta_{\min}} \log(SAT)\right)$ cumulative regret bound, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $T$ is the total number of steps, and $\Delta_{\min}$ is the minimum sub-optimality gap.
arXiv Detail & Related papers (2020-06-16T13:01:33Z)
- Agnostic Q-learning with Function Approximation in Deterministic Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems.
We show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O(\dim_E)$ trajectories.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)