Convergence analysis of online algorithms for vector-valued kernel regression
- URL: http://arxiv.org/abs/2309.07779v2
- Date: Mon, 29 Apr 2024 07:20:27 GMT
- Title: Convergence analysis of online algorithms for vector-valued kernel regression
- Authors: Michael Griebel, Peter Oswald,
- Abstract summary: We consider the problem of approximating the regression function from noisy vector-valued data by an online learning algorithm.
We show that the expected squared error in the RKHS norm can be bounded by $C2 (m+1)-s/(2+s)$, where $m$ is the current number of processed data.
- Score: 0.42970700836450487
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of approximating the regression function from noisy vector-valued data by an online learning algorithm using an appropriate reproducing kernel Hilbert space (RKHS) as prior. In an online algorithm, i.i.d. samples become available one by one by a random process and are successively processed to build approximations to the regression function. We are interested in the asymptotic performance of such online approximation algorithms and show that the expected squared error in the RKHS norm can be bounded by $C^2 (m+1)^{-s/(2+s)}$, where $m$ is the current number of processed data, the parameter $0<s\leq 1$ expresses an additional smoothness assumption on the regression function and the constant $C$ depends on the variance of the input noise, the smoothness of the regression function and further parameters of the algorithm.
Related papers
- Stochastic Optimization Algorithms for Instrumental Variable Regression with Streaming Data [17.657917523817243]
We develop and analyze algorithms for instrumental variable regression by viewing the problem as a conditional optimization problem.
In the context of least-squares instrumental variable regression, our algorithms neither require matrix inversions nor mini-batches.
We derive rates of convergence in expectation, that are of order $mathcalO(log T/T)$ and $mathcalO (1/T1-iota)$ for any $iota>0$.
arXiv Detail & Related papers (2024-05-29T19:21:55Z) - Online and Offline Robust Multivariate Linear Regression [0.3277163122167433]
We introduce two methods each considered contrast: (i) online gradient descent algorithms and their averaged versions and (ii) offline fix-point algorithms.
Because the variance matrix of the noise is usually unknown, we propose to plug a robust estimate of it in the Mahalanobis-based gradient descent algorithms.
arXiv Detail & Related papers (2024-04-30T12:30:48Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - On the Linear Convergence of Policy Gradient under Hadamard
Parameterization [4.182089296199263]
We study the convergence of deterministic policy gradient under the Hadamard parameterization.
We show that the error decreases at an $O(frac1k)$ rate for all the iterations.
arXiv Detail & Related papers (2023-05-31T05:51:15Z) - Refined Regret for Adversarial MDPs with Linear Function Approximation [50.00022394876222]
We consider learning in an adversarial Decision Process (MDP) where the loss functions can change arbitrarily over $K$ episodes.
This paper provides two algorithms that improve the regret to $tildemathcal O(K2/3)$ in the same setting.
arXiv Detail & Related papers (2023-01-30T14:37:21Z) - A gradient estimator via L1-randomization for online zero-order
optimization with two point feedback [93.57603470949266]
We present a novel gradient estimator based on two function evaluation and randomization.
We consider two types of assumptions on the noise of the zero-order oracle: canceling noise and adversarial noise.
We provide an anytime and completely data-driven algorithm, which is adaptive to all parameters of the problem.
arXiv Detail & Related papers (2022-05-27T11:23:57Z) - Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming [53.63469275932989]
We consider online statistical inference of constrained nonlinear optimization problems.
We apply the Sequential Quadratic Programming (StoSQP) method to solve these problems.
arXiv Detail & Related papers (2022-05-27T00:34:03Z) - Fast and Robust Online Inference with Stochastic Gradient Descent via
Random Scaling [0.9806910643086042]
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Rtupper averaging procedure of gradient descent algorithms.
Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem.
arXiv Detail & Related papers (2021-06-06T15:38:37Z) - Online nonparametric regression with Sobolev kernels [99.12817345416846]
We derive the regret upper bounds on the classes of Sobolev spaces $W_pbeta(mathcalX)$, $pgeq 2, beta>fracdp$.
The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $beta> fracd2$ or $p=infty$ these rates are (essentially) optimal.
arXiv Detail & Related papers (2021-02-06T15:05:14Z) - Tight Nonparametric Convergence Rates for Stochastic Gradient Descent
under the Noiseless Linear Model [0.0]
We analyze the convergence of single-pass, fixed step-size gradient descent on the least-square risk under this model.
As a special case, we analyze an online algorithm for estimating a real function on the unit interval from the noiseless observation of its value at randomly sampled points.
arXiv Detail & Related papers (2020-06-15T08:25:50Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.