Online Regularized Statistical Learning in Reproducing Kernel Hilbert Space With Non-Stationary Data
- URL: http://arxiv.org/abs/2404.03211v6
- Date: Thu, 23 Oct 2025 08:32:49 GMT
- Title: Online Regularized Statistical Learning in Reproducing Kernel Hilbert Space With Non-Stationary Data
- Authors: Xiwei Zhang, Yan Chen, Tao Li,
- Abstract summary: We study regularized learning algorithms in the reproducing kernel Hilbert space (RKHS) with non-stationary online data streams.<n>We show that the tracking error vanishes in mean square if the regularization path is slowly time-varying.
- Score: 6.688386258466601
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study recursive regularized learning algorithms in the reproducing kernel Hilbert space (RKHS) with non-stationary online data streams. We introduce the concept of random Tikhonov regularization path and decompose the tracking error of the algorithm's output for the regularization path into random difference equations in RKHS. We show that the tracking error vanishes in mean square if the regularization path is slowly time-varying. Then, leveraging the monotonicity of inverse operators and the spectral decomposition of compact operators, and introducing the RKHS persistence of excitation condition, we develop a dominated convergence method to prove the mean square consistency between the regularization path and the unknown function to be learned. Especially, for independent and non-identically distributed data streams, the mean square consistency between the algorithm's output and the unknown function is achieved if the input data's marginal probability measures are slowly time-varying and the average measure over each fixed-length time period has a uniformly strictly positive lower bound.
Related papers
- Online Inference for Quantiles by Constant Learning-Rate Stochastic Gradient Descent [4.2694059987063655]
This paper proposes an online inference method of the gradient descent (SGD) with a constant learning rate for quantile loss functions with theoretical guarantees.<n> Numerical studies demonstrate strong finite-sample performance of our proposed quantile estimator and inference method.
arXiv Detail & Related papers (2025-03-04T01:37:42Z) - Asymptotic Time-Uniform Inference for Parameters in Averaged Stochastic Approximation [23.89036529638614]
We study time-uniform statistical inference for parameters in approximation (SA)
We analyze the almost-sure convergence rates of the averaged iterates to a scaled sum of Gaussians in both linear and nonlinear SA problems.
arXiv Detail & Related papers (2024-10-19T10:27:26Z) - Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Symmetric Mean-field Langevin Dynamics for Distributional Minimax
Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Non-Parametric Learning of Stochastic Differential Equations with Non-asymptotic Fast Rates of Convergence [65.63201894457404]
We propose a novel non-parametric learning paradigm for the identification of drift and diffusion coefficients of non-linear differential equations.<n>The key idea essentially consists of fitting a RKHS-based approximation of the corresponding Fokker-Planck equation to such observations.
arXiv Detail & Related papers (2023-05-24T20:43:47Z) - Decentralized Online Learning for Random Inverse Problems Over Graphs [6.423798607256407]
We develop the convergence of the stability of the algorithm in Hilbert spaces with $_$-bounded martingale difference terms.
We show that if the network graph is connected and the sequence of forward operators satisfies the infinite-dimensional-temporal persistence of excitation condition, then the estimates of all nodes are mean square.
We propose a decentralized online learning algorithm in RKHS based on non-stationary online data.
arXiv Detail & Related papers (2023-03-20T08:37:08Z) - Decentralized Online Regularized Learning Over Random Time-Varying Graphs [4.065547532041163]
We develop the online regularized linear regression algorithm over random time-varying graphs.
We prove that the regret upper bound is $O(T1-tauln T)$, where $tauin (0.5,1)$ is a constant depending on the algorithm gains.
In addition, we prove that the regret upper bound is $O(T1-tauln T)$, where $tauin (0.5,1)$ is a constant depending on the algorithm gains.
arXiv Detail & Related papers (2022-06-07T12:55:08Z) - Online Time Series Anomaly Detection with State Space Gaussian Processes [12.483273106706623]
R-ssGPFA is an unsupervised online anomaly detection model for uni- and multivariate time series.
For high-dimensional time series, we propose an extension of Gaussian process factor analysis to identify the common latent processes of the time series.
Our model's robustness is improved by using a simple to skip Kalman updates when encountering anomalous observations.
arXiv Detail & Related papers (2022-01-18T06:43:32Z) - Spatio-Temporal Variational Gaussian Processes [26.60276485130467]
We introduce a scalable approach to Gaussian process inference that combinestemporal-temporal filtering with natural variational inference.
We derive a sparse approximation that constructs a state-space model over a reduced set of inducing points.
We show that for separable Markov kernels the full sparse cases recover exactly the standard variational GP.
arXiv Detail & Related papers (2021-11-02T16:53:31Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - The Connection between Discrete- and Continuous-Time Descriptions of
Gaussian Continuous Processes [60.35125735474386]
We show that discretizations yielding consistent estimators have the property of invariance under coarse-graining'
This result explains why combining differencing schemes for derivatives reconstruction and local-in-time inference approaches does not work for time series analysis of second or higher order differential equations.
arXiv Detail & Related papers (2021-01-16T17:11:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.