Fast and Robust Online Inference with Stochastic Gradient Descent via
Random Scaling
- URL: http://arxiv.org/abs/2106.03156v1
- Date: Sun, 6 Jun 2021 15:38:37 GMT
- Title: Fast and Robust Online Inference with Stochastic Gradient Descent via
Random Scaling
- Authors: Sokbae Lee, Yuan Liao, Myung Hwan Seo, Youngki Shin
- Abstract summary: We develop a new method of online inference for a vector of parameters estimated by the Polyak-Ruppert averaging procedure of stochastic gradient descent (SGD) algorithms.
Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem.
- Score: 0.9806910643086042
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We develop a new method of online inference for a vector of parameters
estimated by the Polyak-Ruppert averaging procedure of stochastic gradient
descent (SGD) algorithms. We leverage insights from time series regression in
econometrics and construct asymptotically pivotal statistics via random
scaling. Our approach is fully operational with online data and is rigorously
underpinned by a functional central limit theorem. Our proposed inference
method has a couple of key advantages over the existing methods. First, the
test statistic is computed in an online fashion with only SGD iterates and the
critical values can be obtained without any resampling methods, thereby
allowing for efficient implementation suitable for massive online data. Second,
there is no need to estimate the asymptotic variance and our inference method
is shown to be robust to changes in the tuning parameters for SGD algorithms in
simulation experiments with synthetic data.
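The random-scaling construction lends itself to a simple recursive implementation. Below is a minimal Python sketch, not the authors' code: the function name, the gradient oracle `grad(theta, rng)`, and the step-size schedule `lr0 * t**(-gamma)` are all illustrative assumptions. It maintains the Polyak-Ruppert average together with the running sums needed for the random-scaling matrix $V_t = t^{-2}\sum_{s=1}^{t} s^2(\bar\beta_s - \bar\beta_t)(\bar\beta_s - \bar\beta_t)'$, so everything is updated online from SGD iterates alone.
```python
import numpy as np

def asgd_random_scaling(grad, theta0, n_steps, lr0=0.5, gamma=0.505, rng=None):
    """SGD with Polyak-Ruppert averaging and a random-scaling matrix.

    grad(theta, rng) is a user-supplied stochastic gradient oracle;
    lr0 * t**(-gamma) is an illustrative step-size schedule. Neither is
    taken from the authors' code.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.array(theta0, dtype=float)
    d = theta.size
    bar = np.zeros(d)      # Polyak-Ruppert average: bar_t = (1/t) sum_{s<=t} theta_s
    A = np.zeros((d, d))   # running sum_s s^2 * bar_s bar_s'
    b = np.zeros(d)        # running sum_s s^2 * bar_s
    c = 0.0                # running sum_s s^2
    for t in range(1, n_steps + 1):
        theta -= lr0 * t ** (-gamma) * grad(theta, rng)  # SGD step
        bar += (theta - bar) / t                         # online average update
        A += t ** 2 * np.outer(bar, bar)
        b += t ** 2 * bar
        c += t ** 2
    # V_t = t^{-2} sum_s s^2 (bar_s - bar_t)(bar_s - bar_t)', in expanded form:
    V = (A - np.outer(bar, b) - np.outer(b, bar) + c * np.outer(bar, bar)) / n_steps ** 2
    return bar, V

# Example: online mean estimation from a two-dimensional Gaussian stream.
grad = lambda theta, rng: theta - (np.array([1.0, -1.0]) + rng.standard_normal(2))
bar, V = asgd_random_scaling(grad, theta0=np.zeros(2), n_steps=100_000)
```
For the $j$-th coordinate, the statistic $\sqrt{t}\,(\bar\beta_{t,j} - \beta_{0,j})/\sqrt{V_{t,jj}}$ is asymptotically pivotal, so confidence intervals use fixed tabulated critical values (roughly 6.75 for a two-sided 5% test in the random-scaling literature) with no variance estimation or resampling, as the abstract emphasizes.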
Related papers
- Statistical Inference for Temporal Difference Learning with Linear Function Approximation [62.69448336714418]
Temporal Difference (TD) learning, arguably the most widely used algorithm for policy evaluation, serves as a natural framework for statistical inference in reinforcement learning.
In this paper, we study the consistency properties of TD learning with Polyak-Ruppert averaging and linear function approximation, and obtain three significant improvements over existing results. A generic sketch of TD(0) with Polyak-Ruppert averaging appears after this list.
arXiv Detail & Related papers (2024-10-21T15:34:44Z) - Adaptive debiased SGD in high-dimensional GLMs with streaming data [4.704144189806667]
We introduce a novel approach to online inference in high-dimensional generalized linear models.
Our method operates in a single-pass mode, significantly reducing both time and space complexity.
We demonstrate that our method, termed the Approximated Debiased Lasso (ADL), not only mitigates the need for the bounded individual probability condition but also significantly improves numerical performance.
arXiv Detail & Related papers (2024-05-28T15:36:48Z) - Low-rank extended Kalman filtering for online learning of neural
networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior precision matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
arXiv Detail & Related papers (2023-05-31T03:48:49Z) - Online Learning Under A Separable Stochastic Approximation Framework [20.26530917721778]
We propose an online learning algorithm for a class of machine learning models under a separable stochastic approximation framework.
We show that the proposed algorithm produces more robust training and better test performance when compared to other popular learning algorithms.
arXiv Detail & Related papers (2023-05-12T13:53:03Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are used through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming [53.63469275932989]
We consider online statistical inference of constrained nonlinear optimization problems.
We apply a Stochastic Sequential Quadratic Programming (StoSQP) method to solve these problems.
arXiv Detail & Related papers (2022-05-27T00:34:03Z) - Learning Sampling Policy for Faster Derivative Free Optimization [100.27518340593284]
We propose a new reinforcement-learning-based ZO algorithm (ZO-RL) that learns the sampling policy for generating the perturbations in ZO optimization instead of using random sampling.
Our results show that our ZO-RL algorithm can effectively reduce the variance of the ZO gradient by learning a sampling policy, and converge faster than existing ZO algorithms in different scenarios.
arXiv Detail & Related papers (2021-04-09T14:50:59Z) - High-dimensional, multiscale online changepoint detection [7.502070498889449]
We introduce a new method for high-dimensional, online changepoint detection in settings where a $p$-variate Gaussian data stream may undergo a change in mean.
The algorithm is online in the sense that both its storage requirements and worst-case computational complexity per new observation are independent of the number of previous observations.
Simulations confirm the practical effectiveness of our proposal, which is implemented in the R package 'ocd', and we also demonstrate its utility on a seismology data set. A generic constant-memory detector illustrating this online property is sketched after the list.
arXiv Detail & Related papers (2020-03-07T21:54:09Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z) - Online Covariance Matrix Estimation in Stochastic Gradient Descent [10.153224593032677]
Stochastic gradient descent (SGD) is widely used for parameter estimation, especially for huge data sets and online learning.
This paper aims at providing statistical inference for SGD-based estimates in an online setting.
arXiv Detail & Related papers (2020-02-10T17:46:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.