Online Covariance Matrix Estimation in Stochastic Gradient Descent
- URL: http://arxiv.org/abs/2002.03979v3
- Date: Tue, 22 Jun 2021 15:16:16 GMT
- Title: Online Covariance Matrix Estimation in Stochastic Gradient Descent
- Authors: Wanrong Zhu, Xi Chen, Wei Biao Wu
- Abstract summary: Stochastic gradient descent (SGD) is widely used for parameter estimation, especially for huge data sets and online learning.
This paper aims at conducting statistical inference for SGD-based estimates in an online setting.
- Score: 10.153224593032677
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The stochastic gradient descent (SGD) algorithm is widely used for parameter
estimation, especially for huge data sets and online learning. While this
recursive algorithm is popular for its computational and memory efficiency,
quantifying the variability and randomness of the solutions has rarely been
studied. This paper aims at conducting statistical inference of SGD-based
estimates in an online setting. In particular, we propose a fully online
estimator for the covariance matrix of averaged SGD iterates (ASGD) only using
the iterates from SGD. We formally establish our online estimator's consistency
and show that the convergence rate is comparable to offline counterparts. Based
on the classic asymptotic normality results of ASGD, we construct
asymptotically valid confidence intervals for model parameters. Upon receiving
new observations, we can quickly update the covariance matrix estimate and the
confidence intervals. This approach fits in an online setting and takes full
advantage of SGD: efficiency in computation and memory.
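To make the recipe above concrete, here is a minimal Python sketch of the general idea: run SGD with a decaying step size, maintain the averaged iterate, and build a covariance estimate from the iterates alone via a recursive batch-means construction, from which confidence intervals follow by the asymptotic normality of ASGD. The class name, step-size constants, and block-growth rule are illustrative assumptions, not the paper's exact estimator, which uses a carefully chosen block schedule with proven convergence rates.

```python
import numpy as np

class OnlineASGDInference:
    """Sketch: averaged SGD plus a recursive batch-means estimate of the ASGD covariance.

    Illustrative only: the paper analyzes a specific online estimator with a
    particular block-size schedule; the growth rule below is a stand-in.
    """

    def __init__(self, dim, eta0=0.5, alpha=0.505):
        self.dim = dim
        self.eta0, self.alpha = eta0, alpha       # step size eta_t = eta0 * t^(-alpha)
        self.theta = np.zeros(dim)                # current SGD iterate
        self.theta_bar = np.zeros(dim)            # averaged SGD iterate (ASGD)
        self.t = 0                                # observations processed so far
        self.block_sum = np.zeros(dim)            # sum of iterates in the open block
        self.block_len = 0
        self.block_target = 1                     # target length of the open block (grows)
        self.S = np.zeros((dim, dim))             # accumulated scaled between-block outer products
        self.n_blocks = 0

    def update(self, grad_fn):
        """One step: grad_fn(theta) returns a stochastic gradient at theta."""
        self.t += 1
        eta = self.eta0 * self.t ** (-self.alpha)
        self.theta = self.theta - eta * grad_fn(self.theta)
        self.theta_bar += (self.theta - self.theta_bar) / self.t

        # Batch-means bookkeeping: group iterates into blocks of increasing length
        # and accumulate the scaled deviation of each block mean from the running average.
        self.block_sum += self.theta
        self.block_len += 1
        if self.block_len == self.block_target:
            d = self.block_sum / self.block_len - self.theta_bar
            self.S += self.block_len * np.outer(d, d)
            self.n_blocks += 1
            self.block_sum[:] = 0.0
            self.block_len = 0
            self.block_target += 1                # illustrative growth rule

    def confidence_intervals(self, z=1.96):
        """Approximate 95% coordinate-wise CIs from the current covariance estimate."""
        cov = self.S / max(self.n_blocks, 1)      # estimates Cov of sqrt(t) * (theta_bar - theta*)
        se = np.sqrt(np.diag(cov) / max(self.t, 1))
        return self.theta_bar - z * se, self.theta_bar + z * se
```

For streaming linear regression with observation (x_t, y_t) and squared loss, grad_fn would be theta -> x_t * (x_t @ theta - y_t); each new observation triggers one update call, and intervals can be read off at any time without revisiting past data.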
Related papers
- Adaptive debiased SGD in high-dimensional GLMs with streaming data [4.704144189806667]
We introduce a novel approach to online inference in high-dimensional generalized linear models.
Our method operates in a single-pass mode, significantly reducing both time and space complexity.
We demonstrate that our method, termed the Approximated Debiased Lasso (ADL), not only mitigates the need for the bounded individual probability condition but also significantly improves numerical performance.
arXiv Detail & Related papers (2024-05-28T15:36:48Z) - Online and Offline Robust Multivariate Linear Regression [0.3277163122167433]
We introduce two methods: (i) online gradient descent algorithms and their averaged versions, and (ii) offline fixed-point algorithms.
Because the variance matrix of the noise is usually unknown, we propose to plug a robust estimate of it into the Mahalanobis-based gradient descent algorithms.
arXiv Detail & Related papers (2024-04-30T12:30:48Z) - Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - Statistical Inference for Linear Functionals of Online SGD in High-dimensional Linear Regression [14.521929085104441]
We establish a high-dimensional Central Limit Theorem (CLT) for linear functionals of online stochastic gradient descent (SGD).
We develop an online approach for estimating the expectation and the variance terms appearing in the CLT, and establish high-probability bounds for the developed online estimator.
We propose a two-step fully online bias-correction methodology which, together with the CLT result and the variance estimation result, provides a fully online and data-driven way to numerically construct confidence intervals.
arXiv Detail & Related papers (2023-02-20T02:38:36Z) - Sparse high-dimensional linear regression with a partitioned empirical
Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are imposed through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner.
We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation.
We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z) - Fast and Robust Online Inference with Stochastic Gradient Descent via
Random Scaling [0.9806910643086042]
We develop a new method of online inference for a vector of parameters estimated by the Polyak-Ruppert averaging procedure of stochastic gradient descent algorithms.
Our approach is fully operational with online data and is rigorously underpinned by a functional central limit theorem.
arXiv Detail & Related papers (2021-06-06T15:38:37Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD and that of ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Statistical Inference for Model Parameters in Stochastic Gradient
Descent [45.29532403359099]
Stochastic gradient descent (SGD) has been widely used in statistical estimation for large-scale data due to its computational and memory efficiency.
We investigate the problem of statistical inference of true model parameters based on SGD when the population loss function is strongly convex and satisfies certain conditions.
arXiv Detail & Related papers (2016-10-27T07:04:21Z)
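The last entry above is a useful point of contrast with the iterate-only estimator of the main paper: a standard alternative in this literature is a plug-in "sandwich" estimate A^{-1} S A^{-1} of the Polyak-Juditsky asymptotic covariance of ASGD, where A is the expected Hessian of the loss and S the covariance of the gradient noise. A hedged sketch for streaming linear regression follows; the function name and step-size constants are illustrative assumptions and do not correspond to any paper's API.

```python
import numpy as np

def asgd_plugin_inference(data_stream, dim, eta0=0.5, alpha=0.505, z=1.96):
    """Sketch of plug-in ("sandwich") inference for ASGD on linear regression.

    Estimates A (expected Hessian) and S (gradient-noise covariance) by running
    averages along the SGD path and plugs them into A^{-1} S A^{-1}.
    Illustrative only; constants and names are assumptions.
    """
    theta = np.zeros(dim)
    theta_bar = np.zeros(dim)
    A_hat = np.zeros((dim, dim))                  # running mean of per-sample Hessians x x^T
    S_hat = np.zeros((dim, dim))                  # running mean of gradient outer products
    n = 0
    for x, y in data_stream:                      # x: feature vector, y: scalar response
        n += 1
        g = x * (x @ theta - y)                   # gradient of 0.5 * (x'theta - y)^2
        theta = theta - eta0 * n ** (-alpha) * g  # SGD step with decaying step size
        theta_bar += (theta - theta_bar) / n      # averaged iterate
        A_hat += (np.outer(x, x) - A_hat) / n
        S_hat += (np.outer(g, g) - S_hat) / n
    A_inv = np.linalg.pinv(A_hat)
    sigma_hat = A_inv @ S_hat @ A_inv             # plug-in sandwich covariance estimate
    se = np.sqrt(np.diag(sigma_hat) / n)
    return theta_bar, (theta_bar - z * se, theta_bar + z * se)
```

Unlike the iterate-only construction sketched earlier, this route needs per-sample curvature information (here the outer products x x^T), which is exactly the overhead a fully online, SGD-iterate-based covariance estimator avoids.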
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.