Heavy-tailed Streaming Statistical Estimation
- URL: http://arxiv.org/abs/2108.11483v1
- Date: Wed, 25 Aug 2021 21:30:27 GMT
- Title: Heavy-tailed Streaming Statistical Estimation
- Authors: Che-Ping Tsai, Adarsh Prasad, Sivaraman Balakrishnan, Pradeep
Ravikumar
- Abstract summary: We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped stochastic gradient descent algorithm and provide an improved analysis under a more nuanced condition on the noise of the stochastic gradients.
- Score: 58.70341336199497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the task of heavy-tailed statistical estimation given streaming
$p$-dimensional samples. This could also be viewed as stochastic optimization
under heavy-tailed distributions, with an additional $O(p)$ space complexity
constraint. We design a clipped stochastic gradient descent algorithm and
provide an improved analysis, under a more nuanced condition on the noise of
the stochastic gradients, which we show is critical when analyzing stochastic
optimization problems arising from general statistical estimation problems. Our
results guarantee convergence not just in expectation but with exponential
concentration, and moreover do so using an $O(1)$ batch size. We provide
consequences of our results for mean estimation and linear regression. Finally,
we provide empirical corroboration of our results and algorithms via synthetic
experiments for mean estimation and linear regression.
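The abstract gives no pseudocode, so below is a minimal sketch of a clipped-SGD update of the kind described above, specialized to streaming mean estimation. The clipping threshold, step-size schedule, and the Student-t data are illustrative assumptions, not the paper's prescribed choices.
```python
import numpy as np

def clipped_sgd_mean(stream, p, step_size=0.1, clip_threshold=5.0):
    """Streaming mean estimation with clipped stochastic gradients:
    one sample per update (O(1) batch size), and only the current
    iterate is stored (O(p) space)."""
    theta = np.zeros(p)
    for t, x in enumerate(stream, start=1):
        grad = theta - x                           # gradient of 0.5 * ||theta - x||^2
        norm = np.linalg.norm(grad)
        if norm > clip_threshold:                  # clip heavy-tailed gradients
            grad *= clip_threshold / norm
        theta -= (step_size / np.sqrt(t)) * grad   # decaying step size
    return theta

# Illustrative usage: heavy-tailed (Student-t) samples centred at 3.
rng = np.random.default_rng(0)
samples = 3.0 + rng.standard_t(df=2.5, size=(10_000, 4))
print(clipped_sgd_mean(iter(samples), p=4))
```
The same loop applies to linear regression by replacing the per-sample gradient with that of the squared loss.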
Related papers
- Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider stochastic optimization problems with heavy-tailed noise that has a structured density.
We show that it is possible to obtain faster rates of convergence than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$ when the stochastic gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z)
- Sampling from Gaussian Process Posteriors using Stochastic Gradient Descent [43.097493761380186]
Stochastic gradient algorithms are an efficient method of approximately solving linear systems.
We show that gradient descent produces accurate predictions, even in cases where it does not converge quickly to the optimum.
Experimentally, gradient descent achieves state-of-the-art performance on sufficiently large-scale or ill-conditioned regression tasks.
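As a rough illustration of the idea in the entry above, the sketch below replaces a direct solve of the GP regression system with plain gradient iterations; the RBF kernel, noise level, and step count are assumptions for illustration, and the paper itself relies on stochastic gradient estimates and further techniques not shown here.
```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between row sets A and B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / lengthscale ** 2)

def gp_mean_by_gradient_descent(X, y, X_test, noise=0.1, steps=500):
    """Approximate the GP posterior mean by minimizing the quadratic
    0.5 * v^T (K + noise*I) v - y^T v with gradient descent, instead of
    solving (K + noise*I) v = y directly."""
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    lr = 1.0 / np.linalg.norm(K, 2)        # conservative step size
    v = np.zeros(len(X))
    for _ in range(steps):
        v -= lr * (K @ v - y)              # gradient of the quadratic
    return rbf_kernel(X_test, X) @ v       # approximate posterior mean

# Illustrative usage on synthetic 1-D data.
X = np.linspace(0, 1, 50)[:, None]
y = np.sin(6 * X[:, 0])
print(gp_mean_by_gradient_descent(X, y, X[:5]))
```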
arXiv Detail & Related papers (2023-06-20T15:07:37Z)
- Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the finite-sample regime and in the asymptotic regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- SLOE: A Faster Method for Statistical Inference in High-Dimensional Logistic Regression [68.66245730450915]
We develop an improved method for debiasing predictions and estimating frequentist uncertainty for practical datasets.
Our main contribution is SLOE, an estimator of the signal strength with convergence guarantees that reduces the computation time of estimation and inference by orders of magnitude.
arXiv Detail & Related papers (2021-03-23T17:48:56Z)
- A Nonconvex Framework for Structured Dynamic Covariance Recovery [24.471814126358556]
We propose a flexible yet interpretable model for high-dimensional data with time-varying second order statistics.
Motivated by the literature, we exploit a structured factorization together with temporal smoothness.
We show that our approach outperforms existing baselines.
arXiv Detail & Related papers (2020-11-11T07:09:44Z)
- Nearest Neighbour Based Estimates of Gradients: Sharp Nonasymptotic Bounds and Applications [0.6445605125467573]
Gradient estimation is of crucial importance in statistics and learning theory.
We consider here the classic regression setup, where a real-valued, square-integrable random variable $Y$ is to be predicted.
We prove nonasymptotic bounds improving upon those obtained for alternative estimation methods.
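For context, one standard way to build a nearest-neighbour gradient estimate in that regression setup is to fit a local linear model on the k nearest neighbours of the query point and read off the slope. The sketch below is illustrative only; the paper's exact estimator and constants may differ.
```python
import numpy as np

def knn_gradient_estimate(X, Y, x0, k=20):
    """Estimate the gradient of m(x) = E[Y | X = x] at x0 by a local
    linear least-squares fit over the k nearest neighbours of x0."""
    dists = np.linalg.norm(X - x0, axis=1)
    idx = np.argsort(dists)[:k]                    # k nearest neighbours
    Z = np.hstack([np.ones((k, 1)), X[idx] - x0])  # intercept + centred features
    coef, *_ = np.linalg.lstsq(Z, Y[idx], rcond=None)
    return coef[1:]                                # slope = gradient estimate
```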
arXiv Detail & Related papers (2020-06-26T15:19:43Z)
- Instability, Computational Efficiency and Statistical Accuracy [101.32305022521024]
We develop a framework that yields statistical accuracy guarantees based on the interplay between the deterministic convergence rate of the algorithm at the population level and its degree of (in)stability when applied to an empirical object based on $n$ samples.
We provide applications of our general results to several concrete classes of models, including Gaussian mixture estimation, non-linear regression models, and informative non-response models.
arXiv Detail & Related papers (2020-05-22T22:30:52Z)
- An Optimal Statistical and Computational Framework for Generalized Tensor Estimation [10.899518267165666]
This paper describes a flexible framework for low-rank tensor estimation problems.
It includes many important instances from applications in computational imaging, genomics, and network analysis.
arXiv Detail & Related papers (2020-02-26T01:54:35Z)
- Maximum likelihood estimation and uncertainty quantification for Gaussian process approximation of deterministic functions [10.319367855067476]
This article provides one of the first theoretical analyses in the context of Gaussian process regression with a noiseless dataset.
We show that the maximum likelihood estimation of the scale parameter alone provides significant adaptation against misspecification of the Gaussian process model.
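As a concrete instance of the scale-parameter estimate mentioned above: if the noiseless observations are modelled as y ~ N(0, sigma^2 * K0) for a fixed correlation kernel matrix K0, the maximum likelihood estimate of sigma^2 has a closed form. A minimal sketch under that assumed model (not taken from the paper itself):
```python
import numpy as np

def mle_scale(K0, y):
    """Closed-form MLE of sigma^2 when y ~ N(0, sigma^2 * K0):
    maximizing the Gaussian log-likelihood over sigma^2 alone gives
    sigma^2_hat = y^T K0^{-1} y / n."""
    alpha = np.linalg.solve(K0, y)
    return float(y @ alpha) / len(y)
```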
arXiv Detail & Related papers (2020-01-29T17:20:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.