Generalized Random Direction Newton Algorithms for Stochastic Optimization
- URL: http://arxiv.org/abs/2602.19893v1
- Date: Mon, 23 Feb 2026 14:33:39 GMT
- Title: Generalized Random Direction Newton Algorithms for Stochastic Optimization
- Authors: Soumen Pachal, Prashanth L. A., Shalabh Bhatnagar, Avinash Achar,
- Abstract summary: We present a family of generalized Hessian estimators of the objective using random direction of approximation (RDSA)<n>We show that estimators with more function measurements exhibit lower-order estimation bias.<n>We also perform numerical and non-asymptotic convergence analyses for Newton methods that incorporate our generalized Hessian estimators.
- Score: 10.325784284122124
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a family of generalized Hessian estimators of the objective using random direction stochastic approximation (RDSA) by utilizing only noisy function measurements. The form of each estimator and the order of the bias depend on the number of function measurements. In particular, we demonstrate that estimators with more function measurements exhibit lower-order estimation bias. We show the asymptotic unbiasedness of the estimators. We also perform asymptotic and non-asymptotic convergence analyses for stochastic Newton methods that incorporate our generalized Hessian estimators. Finally, we perform numerical experiments to validate our theoretical findings.
Related papers
- Efficient Covariance Estimation for Sparsified Functional Data [51.69796254617083]
proposed Random-knots (Random-knots-Spatial) and B-spline (Bspline-Spatial) estimators of the covariance function are computationally efficient.<n>Asymptotic pointwise of the covariance are obtained for sparsified individual trajectories under some regularity conditions.
arXiv Detail & Related papers (2025-11-23T00:50:33Z) - On the Optimal Construction of Unbiased Gradient Estimators for Zeroth-Order Optimization [57.179679246370114]
A potential limitation of existing methods is the bias inherent in most perturbation estimators unless a stepsize is proposed.<n>We propose a novel family of unbiased gradient scaling estimators that eliminate bias while maintaining favorable construction.
arXiv Detail & Related papers (2025-10-22T18:25:43Z) - Statistical Inference for Online Algorithms [2.7624021966289605]
We propose a new type of inference based on the output of online algorithms em without estimating the variance.<n>We focus on gradient descent with Polyak averaging to understand the practical performance of the proposed method.
arXiv Detail & Related papers (2025-05-22T21:31:49Z) - A weak convergence approach to large deviations for stochastic approximations [0.9374652839580183]
We prove a large deviation principle for general approximations with state-dependent Markovian noise and decreasing step size.<n> Examples of learning algorithms that are covered include gradient descent, persistent contrastive divergence and the Wang-Landau algorithm.
arXiv Detail & Related papers (2025-02-04T17:50:30Z) - Kernel-based off-policy estimation without overlap: Instance optimality
beyond semiparametric efficiency [53.90687548731265]
We study optimal procedures for estimating a linear functional based on observational data.
For any convex and symmetric function class $mathcalF$, we derive a non-asymptotic local minimax bound on the mean-squared error.
arXiv Detail & Related papers (2023-01-16T02:57:37Z) - Generalized Simultaneous Perturbation-based Gradient Search with Reduced
Estimator Bias [7.372983005764439]
We present a family of generalized simultaneous perturbation-based search (GSPGS) estimators that use noisy function measurements.
The number of function measurements required by each estimator is guided by the desired level of accuracy.
arXiv Detail & Related papers (2022-12-20T17:50:36Z) - Tuning Stochastic Gradient Algorithms for Statistical Inference via
Large-Sample Asymptotics [18.93569692490218]
tuning of gradient algorithms often based on trial-and-error rather than generalizable theory.
We show that averaging with a large fixed step size is robust to the choice of tuning parameters.
We lay the foundation for a systematic analysis of other gradient Monte Carlo algorithms.
arXiv Detail & Related papers (2022-07-25T17:58:09Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the emphStochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - A Stochastic Newton Algorithm for Distributed Convex Optimization [62.20732134991661]
We analyze a Newton algorithm for homogeneous distributed convex optimization, where each machine can calculate gradients of the same population objective.
We show that our method can reduce the number, and frequency, of required communication rounds compared to existing methods without hurting performance.
arXiv Detail & Related papers (2021-10-07T17:51:10Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$ samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Nearest Neighbour Based Estimates of Gradients: Sharp Nonasymptotic
Bounds and Applications [0.6445605125467573]
gradient estimation is of crucial importance in statistics and learning theory.
We consider here the classic regression setup, where a real valued square integrable r.v. $Y$ is to be predicted.
We prove nonasymptotic bounds improving upon those obtained for alternative estimation methods.
arXiv Detail & Related papers (2020-06-26T15:19:43Z) - Asymptotic Analysis of Sampling Estimators for Randomized Numerical
Linear Algebra Algorithms [43.134933182911766]
We develop an analysis to derive the distribution of RandNLA sampling estimators for the least-squares problem.
We identify optimal sampling probabilities based on the Asymptotic Mean Squared Error (AMSE) and the Expected Asymptotic Mean Squared Error (EAMSE)
Our theoretical results clarify the role of leverage in the sampling process, and our empirical results demonstrate improvements over existing methods.
arXiv Detail & Related papers (2020-02-24T20:34:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.