Convergence Conditions of Online Regularized Statistical Learning in Reproducing Kernel Hilbert Space With Non-Stationary Data
- URL: http://arxiv.org/abs/2404.03211v4
- Date: Sun, 9 Jun 2024 13:11:36 GMT
- Title: Convergence Conditions of Online Regularized Statistical Learning in Reproducing Kernel Hilbert Space With Non-Stationary Data
- Authors: Xiwei Zhang, Tao Li,
- Abstract summary: We study the convergence of regularized learning algorithms in the reproducing kernel HilbertRKHS with dependent and non-stationary online data streams.
For independent and non-identically distributed data streams, the algorithm achieves the mean square consistency.
- Score: 4.5692679976952215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the convergence of recursive regularized learning algorithms in the reproducing kernel Hilbert space (RKHS) with dependent and non-stationary online data streams. Firstly, we study the mean square asymptotic stability of a class of random difference equations in RKHS, whose non-homogeneous terms are martingale difference sequences dependent on the homogeneous ones. Secondly, we introduce the concept of random Tikhonov regularization path, and show that if the regularization path is slowly time-varying in some sense, then the output of the algorithm is consistent with the regularization path in mean square. Furthermore, if the data streams also satisfy the RKHS persistence of excitation condition, i.e. there exists a fixed length of time period, such that the conditional expectation of the operators induced by the input data accumulated over every time period has a uniformly strictly positive compact lower bound in the sense of the operator order with respect to time, then the output of the algorithm is consistent with the unknown function in mean square. Finally, for the case with independent and non-identically distributed data streams, the algorithm achieves the mean square consistency provided the marginal probability measures induced by the input data are slowly time-varying and the average measure over each fixed-length time period has a uniformly strictly positive lower bound.
Related papers
- Asymptotic Time-Uniform Inference for Parameters in Averaged Stochastic Approximation [23.89036529638614]
We study time-uniform statistical inference for parameters in approximation (SA)
We analyze the almost-sure convergence rates of the averaged iterates to a scaled sum of Gaussians in both linear and nonlinear SA problems.
arXiv Detail & Related papers (2024-10-19T10:27:26Z) - Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Symmetric Mean-field Langevin Dynamics for Distributional Minimax
Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Decentralized Online Learning for Random Inverse Problems Over Graphs [6.423798607256407]
We develop the convergence of the stability of the algorithm in Hilbert spaces with $_$-bounded martingale difference terms.
We show that if the network graph is connected and the sequence of forward operators satisfies the infinite-dimensional-temporal persistence of excitation condition, then the estimates of all nodes are mean square.
We propose a decentralized online learning algorithm in RKHS based on non-stationary online data.
arXiv Detail & Related papers (2023-03-20T08:37:08Z) - Decentralized Online Regularized Learning Over Random Time-Varying Graphs [4.065547532041163]
We develop the online regularized linear regression algorithm over random time-varying graphs.
We prove that the regret upper bound is $O(T1-tauln T)$, where $tauin (0.5,1)$ is a constant depending on the algorithm gains.
In addition, we prove that the regret upper bound is $O(T1-tauln T)$, where $tauin (0.5,1)$ is a constant depending on the algorithm gains.
arXiv Detail & Related papers (2022-06-07T12:55:08Z) - Spatio-Temporal Variational Gaussian Processes [26.60276485130467]
We introduce a scalable approach to Gaussian process inference that combinestemporal-temporal filtering with natural variational inference.
We derive a sparse approximation that constructs a state-space model over a reduced set of inducing points.
We show that for separable Markov kernels the full sparse cases recover exactly the standard variational GP.
arXiv Detail & Related papers (2021-11-02T16:53:31Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - The Connection between Discrete- and Continuous-Time Descriptions of
Gaussian Continuous Processes [60.35125735474386]
We show that discretizations yielding consistent estimators have the property of invariance under coarse-graining'
This result explains why combining differencing schemes for derivatives reconstruction and local-in-time inference approaches does not work for time series analysis of second or higher order differential equations.
arXiv Detail & Related papers (2021-01-16T17:11:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.