Stochastic Online Linear Regression: the Forward Algorithm to Replace
Ridge
- URL: http://arxiv.org/abs/2111.01602v1
- Date: Tue, 2 Nov 2021 13:57:53 GMT
- Title: Stochastic Online Linear Regression: the Forward Algorithm to Replace
Ridge
- Authors: Reda Ouhamma, Odalric Maillard, Vianney Perchet
- Abstract summary: We derive high probability regret bounds for online ridge regression and the forward algorithm.
This enables us to compare online regression algorithms more accurately and eliminate assumptions of bounded observations and predictions.
- Score: 24.880035784304834
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We consider the problem of online linear regression in the stochastic
setting. We derive high probability regret bounds for online ridge regression
and the forward algorithm. This enables us to compare online regression
algorithms more accurately and eliminate assumptions of bounded observations
and predictions. Our study advocates for the use of the forward algorithm in
lieu of ridge due to its enhanced bounds and robustness to the regularization
parameter. Moreover, we explain how to integrate it in algorithms involving
linear function approximation to remove a boundedness assumption without
deteriorating theoretical bounds. We showcase this modification in linear
bandit settings where it yields improved regret bounds. Last, we provide
numerical experiments to illustrate our results and endorse our intuitions.
Related papers
- Refined Risk Bounds for Unbounded Losses via Transductive Priors [58.967816314671296]
We revisit the sequential variants of linear regression with the squared loss, classification problems with hinge loss, and logistic regression.
Our key tools are based on the exponential weights algorithm with carefully chosen transductive priors.
arXiv Detail & Related papers (2024-10-29T00:01:04Z) - Online and Offline Robust Multivariate Linear Regression [0.3277163122167433]
We introduce two methods each considered contrast: (i) online gradient descent algorithms and their averaged versions and (ii) offline fix-point algorithms.
Because the variance matrix of the noise is usually unknown, we propose to plug a robust estimate of it in the Mahalanobis-based gradient descent algorithms.
arXiv Detail & Related papers (2024-04-30T12:30:48Z) - Stable Nonconvex-Nonconcave Training via Linear Interpolation [51.668052890249726]
This paper presents a theoretical analysis of linearahead as a principled method for stabilizing (large-scale) neural network training.
We argue that instabilities in the optimization process are often caused by the nonmonotonicity of the loss landscape and show how linear can help by leveraging the theory of nonexpansive operators.
arXiv Detail & Related papers (2023-10-20T12:45:12Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonIBS optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - Implicit Regularization for Group Sparsity [33.487964460794764]
We show that gradient descent over the squared regression loss, without any explicit regularization, biases towards solutions with a group sparsity structure.
We analyze the gradient dynamics of the corresponding regression problem in the general noise setting and obtain minimax-optimal error rates.
In the degenerate case of size-one groups, our approach gives rise to a new algorithm for sparse linear regression.
arXiv Detail & Related papers (2023-01-29T20:54:03Z) - Shuffled linear regression through graduated convex relaxation [12.614901374282868]
The shuffled linear regression problem aims to recover linear relationships in datasets where the correspondence between input and output is unknown.
This problem arises in a wide range of applications including survey data.
We propose a novel optimization algorithm for shuffled linear regression based on a posterior-maximizing objective function.
arXiv Detail & Related papers (2022-09-30T17:33:48Z) - Dynamic Regret for Strongly Adaptive Methods and Optimality of Online
KRR [13.165557713537389]
We show that Strongly Adaptive (SA) algorithms can be viewed as a principled way of controlling dynamic regret.
We derive a new lower bound on a certain penalized regret which establishes the near minimax optimality of online Kernel Ridge Regression (KRR)
arXiv Detail & Related papers (2021-11-22T21:52:47Z) - Optimal Rates for Random Order Online Optimization [60.011653053877126]
We study the citetgarber 2020online, where the loss functions may be chosen by an adversary, but are then presented online in a uniformly random order.
We show that citetgarber 2020online algorithms achieve the optimal bounds and significantly improve their stability.
arXiv Detail & Related papers (2021-06-29T09:48:46Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_pbeta(mathcalX)$, $pgeq 2, beta>fracdp$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $beta> fracd2$ or $p=infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered $L_1$ (OWL) regularized regression is a new regression analysis for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution with the unknown order structure.
arXiv Detail & Related papers (2020-06-29T23:35:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.