Implicit Parameter-free Online Learning with Truncated Linear Models
- URL: http://arxiv.org/abs/2203.10327v1
- Date: Sat, 19 Mar 2022 13:39:49 GMT
- Title: Implicit Parameter-free Online Learning with Truncated Linear Models
- Authors: Keyi Chen and Ashok Cutkosky and Francesco Orabona
- Abstract summary: Parameter-free algorithms are online learning algorithms that do not require setting learning rates.
We propose new parameter-free algorithms that can take advantage of truncated linear models through a new update that has an "implicit" flavor.
Based on a novel decomposition of the regret, the new update is efficient, requires only one gradient at each step, never overshoots the minimum of the truncated model, and retains the favorable parameter-free properties.
- Score: 51.71216912089413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parameter-free algorithms are online learning algorithms that do not require
setting learning rates. They achieve optimal regret with respect to the
distance between the initial point and any competitor. Yet, parameter-free
algorithms do not take into account the geometry of the losses. Recently, in
the stochastic optimization literature, it has been proposed to instead use
truncated linear lower bounds, which produce better performance by more closely
modeling the losses. In particular, truncated linear models greatly reduce the
problem of overshooting the minimum of the loss function. Unfortunately,
truncated linear models cannot be used with parameter-free algorithms because
the updates become very expensive to compute. In this paper, we propose new
parameter-free algorithms that can take advantage of truncated linear models
through a new update that has an "implicit" flavor. Based on a novel
decomposition of the regret, the new update is efficient, requires only one
gradient at each step, never overshoots the minimum of the truncated model, and
retains the favorable parameter-free properties. We also conduct an empirical
study demonstrating the practical utility of our algorithms.
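As a rough illustration of the truncated-linear-model idea described in the abstract (a sketch only, not the paper's actual implicit parameter-free update; all names are illustrative), a step on the truncated model $h(y) = \max(f(x) + \langle g, y - x\rangle,\, B)$, where $B$ is a known lower bound on the loss, can cap the step length so the iterate never passes the point where the linear model reaches $B$, which is what prevents overshooting:

```python
import numpy as np

def truncated_model_step(x, f_val, grad, lr, lower_bound=0.0):
    # Step on the truncated linear model
    #   h(y) = max(f(x) + <grad, y - x>, lower_bound).
    # The step length is capped at (f(x) - lower_bound) / ||grad||^2,
    # the distance at which the linear model hits the truncation level,
    # so the update never overshoots the minimum of the model.
    g_sq = float(np.dot(grad, grad))
    if g_sq == 0.0:
        return x
    step = min(lr, (f_val - lower_bound) / g_sq)
    return x - step * grad

# f(x) = |x| has minimum value 0 at x = 0.
# With lr=10.0, a plain gradient step from x=5 would jump to x=-5;
# the truncated step stops exactly at the minimizer.
x = 5.0
for _ in range(10):
    x = truncated_model_step(np.array([x]), abs(x), np.array([np.sign(x)]), lr=10.0)[0]
print(x)  # → 0.0
```

Note that the capped step coincides with the classical Polyak step size when the minimal loss value is known, which is the stochastic-optimization connection the abstract alludes to.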
Related papers
- Zeroth-Order Adaptive Neuron Alignment Based Pruning without Re-Training [3.195234044113248]
We exploit functional information from dense pre-trained models to obtain sparse models that maximize the alignment of activations with those of the corresponding dense model.
We propose NeuroAl, a "top-up" algorithm that modifies the block-wise and row-wise sparsity ratios to maximize the neuron alignment among activations.
We test our method on 4 different LLM families and 3 different sparsity ratios, showing how it consistently outperforms the latest state-of-the-art techniques.
arXiv Detail & Related papers (2024-11-11T15:30:16Z) - Low-rank extended Kalman filtering for online learning of neural
networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
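The low-rank plus diagonal decomposition mentioned above can be illustrated with a small sketch (variable names are illustrative, not the paper's API): representing a posterior covariance as $\Sigma \approx \mathrm{diag}(d) + W W^\top$ stores $O(nr)$ numbers instead of $O(n^2)$, and matrix-vector products never need the dense matrix:

```python
import numpy as np

# Low-rank (rank r) plus diagonal approximation of an n x n covariance:
#   Sigma ≈ diag(d) + W @ W.T
# Storage is O(n*r) instead of O(n^2), and Sigma @ v costs O(n*r).
n, r = 1000, 5
rng = np.random.default_rng(0)
d = rng.uniform(0.1, 1.0, size=n)   # diagonal part (kept positive)
W = rng.normal(size=(n, r))         # low-rank factor

def sigma_matvec(v):
    # Compute Sigma @ v without ever forming the n x n matrix.
    return d * v + W @ (W.T @ v)

# Sanity check against the explicitly formed dense matrix.
v = rng.normal(size=n)
dense = (np.diag(d) + W @ W.T) @ v
assert np.allclose(sigma_matvec(v), dense)
```

This structure is what makes an extended-Kalman-filter-style update tractable for large parameter vectors, since the filter only ever needs products with the (approximate) covariance.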
arXiv Detail & Related papers (2023-05-31T03:48:49Z) - Regret-Optimal Model-Free Reinforcement Learning for Discounted MDPs
with Short Burn-In Time [13.545356254920584]
We introduce a model-free algorithm that employs variance reduction and a novel technique that switches the execution policy in a slow-yet-adaptive manner.
This is the first regret-optimal model-free algorithm in the discounted setting, with the additional benefit of a low burn-in time.
arXiv Detail & Related papers (2023-05-24T20:22:43Z) - Online Learning Under A Separable Stochastic Approximation Framework [20.26530917721778]
We propose an online learning algorithm for a class of machine learning models under a separable stochastic approximation framework.
We show that the proposed algorithm produces more robust and better test performance when compared to other popular learning algorithms.
arXiv Detail & Related papers (2023-05-12T13:53:03Z) - Statistical Complexity and Optimal Algorithms for Non-linear Ridge
Bandits [39.391636494257284]
We consider the sequential decision-making problem where the mean outcome is a non-linear function of the chosen action.
In particular, a two-stage algorithm that first finds a good initial action and then treats the problem as locally linear is statistically optimal.
arXiv Detail & Related papers (2023-02-12T23:20:41Z) - Smoothed Online Learning is as Easy as Statistical Learning [77.00766067963195]
We provide the first oracle-efficient, no-regret algorithms in this setting.
We show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits.
arXiv Detail & Related papers (2022-02-09T19:22:34Z) - Adapting to Misspecification in Contextual Bandits [82.55565343668246]
We introduce a new family of oracle-efficient algorithms for $\varepsilon$-misspecified contextual bandits.
We obtain the first algorithm that achieves the optimal $O(d\sqrt{T} + \varepsilon\sqrt{d}\,T)$ regret bound for unknown misspecification level.
arXiv Detail & Related papers (2021-07-12T21:30:41Z) - Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms
for Stochastic Shortest Path [29.289190242826688]
We introduce a generic template for developing regret algorithms in the Shortest Path (SSP) model.
We develop two new algorithms: the first is model-free and minimax optimal under strictly positive costs.
Both algorithms admit highly sparse updates, making them computationally more efficient than all existing algorithms.
arXiv Detail & Related papers (2021-06-15T19:15:17Z) - LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
arXiv Detail & Related papers (2020-12-21T06:40:20Z) - Online Model Selection for Reinforcement Learning with Function
Approximation [50.008542459050155]
We present a meta-algorithm that adapts to the optimal complexity with $\tilde{O}(L^{5/6} T^{2/3})$ regret.
We also show that the meta-algorithm automatically admits significantly improved instance-dependent regret bounds.
arXiv Detail & Related papers (2020-11-19T10:00:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.