Improved SVRG for quadratic functions
- URL: http://arxiv.org/abs/2006.01017v2
- Date: Tue, 15 Jun 2021 07:36:41 GMT
- Title: Improved SVRG for quadratic functions
- Authors: Nabil Kahale
- Abstract summary: We analyse an iterative algorithm to minimize quadratic functions whose Hessian matrix $H$ is the expectation of a random symmetric $d\times d$ matrix.
In several applications, including least-squares regressions, ridge regressions, linear discriminant analysis and regularized linear discriminant analysis, the running time of each iteration is proportional to $d$.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We analyse an iterative algorithm to minimize quadratic functions whose
Hessian matrix $H$ is the expectation of a random symmetric $d\times d$ matrix.
The algorithm is a variant of the stochastic variance reduced gradient (SVRG).
In several applications, including least-squares regressions, ridge
regressions, linear discriminant analysis and regularized linear discriminant
analysis, the running time of each iteration is proportional to $d$. Under
smoothness and convexity conditions, the algorithm has linear convergence. When
applied to quadratic functions, our analysis improves the state-of-the-art
performance of SVRG up to a logarithmic factor. Furthermore, for
well-conditioned quadratic problems, our analysis improves the state-of-the-art
running times of accelerated SVRG, and is better than the known matching lower
bound, by a logarithmic factor. Our theoretical results are backed with
numerical experiments.
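To make the setting concrete: for least-squares regression, the Hessian $A^\top A/n$ is the expectation, over a uniformly random row index $i$, of the random symmetric matrix $a_i a_i^\top$, and each stochastic step touches one row at cost $O(d)$. The sketch below is the *standard* SVRG loop on that problem, not the paper's improved variant; the step size, epoch count, and inner-loop length are illustrative choices, not values from the paper.

```python
import numpy as np

def svrg_least_squares(A, b, step=0.02, epochs=40, inner=None, seed=0):
    """Minimize f(x) = (1/2n) * ||A x - b||^2 with plain SVRG.

    Standard SVRG (Johnson & Zhang style), shown only to illustrate the
    O(d) per-iteration cost; not the improved variant analysed above.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    m = inner if inner is not None else 2 * n
    x = np.zeros(d)
    for _ in range(epochs):
        snap = x.copy()
        # Exact gradient at the snapshot, recomputed once per epoch.
        full_grad = A.T @ (A @ snap - b) / n
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced gradient: component-i gradient at x,
            # recentred by its value at the snapshot.
            g = (A[i] * (A[i] @ x - b[i])
                 - A[i] * (A[i] @ snap - b[i])
                 + full_grad)
            x -= step * g
    return x
```

Because the correction term vanishes as both `x` and the snapshot approach the minimizer, the gradient estimate's variance decays and a constant step size gives linear convergence, which is the behaviour the paper's analysis sharpens for quadratics.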
Related papers
- Optimal Rates for Vector-Valued Spectral Regularization Learning Algorithms [28.046728466038022]
We study theoretical properties of a broad class of regularized algorithms with vector-valued output.
We rigorously confirm the so-called saturation effect for ridge regression with vector-valued output.
We present an upper bound on the finite sample risk of general vector-valued spectral algorithms.
arXiv Detail & Related papers (2024-05-23T16:45:52Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) methods and adaptive regularization using cubics (ARC) have proven to have very appealing theoretical properties.
We show that TR and ARC methods can simultaneously accommodate inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Convergence analysis of online algorithms for vector-valued kernel regression [0.42970700836450487]
We consider the problem of approximating the regression function from noisy vector-valued data by an online learning algorithm.
We show that the expected squared error in the RKHS norm can be bounded by $C^2 (m+1)^{-s/(2+s)}$, where $m$ is the current number of processed data.
arXiv Detail & Related papers (2023-09-14T15:10:47Z) - Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming [53.63469275932989]
We consider online statistical inference of constrained nonlinear optimization problems.
We apply the Stochastic Sequential Quadratic Programming (StoSQP) method to solve these problems.
arXiv Detail & Related papers (2022-05-27T00:34:03Z) - Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with
Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic gradient Langevin dynamics is one of the most fundamental algorithms for solving non-convex optimization problems.
In this paper, we study two variants of this algorithm, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics.
arXiv Detail & Related papers (2022-03-30T11:39:00Z) - Sample Complexity Bounds for Two Timescale Value-based Reinforcement
Learning Algorithms [65.09383385484007]
Two timescale stochastic approximation (SA) has been widely used in value-based reinforcement learning algorithms.
We study the non-asymptotic convergence rate of two timescale linear and nonlinear TDC and Greedy-GQ algorithms.
arXiv Detail & Related papers (2020-11-10T11:36:30Z) - Provably Convergent Working Set Algorithm for Non-Convex Regularized
Regression [0.0]
This paper proposes a working set algorithm for non-convex regularizers with convergence guarantees.
Our results demonstrate large gains compared to the full problem solver, for both block-coordinate and gradient solvers.
arXiv Detail & Related papers (2020-06-24T07:40:31Z) - Gradient Free Minimax Optimization: Variance Reduction and Faster
Convergence [120.9336529957224]
In this paper, we consider the gradient-free minimax optimization problem in the nonconvex-strongly-concave setting.
We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z) - Efficient algorithms for multivariate shape-constrained convex
regression problems [9.281671380673306]
We prove that the least squares estimator is computable via solving a constrained convex quadratic programming (QP) problem with $(n+1)d$ variables and at least $n(n-1)$ linear inequality constraints.
For solving the generally very large-scale convex QP, we design two efficient algorithms: one is the symmetric Gauss-Seidel based alternating direction method of multipliers (sGS-ADMM), and the other is the proximal augmented Lagrangian method (pALM) with the subproblems solved by the semismooth Newton method.
arXiv Detail & Related papers (2020-02-26T11:18:43Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
We study a class of randomized algorithms that encompasses several of the fastest known solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
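The last entry concerns randomized embeddings for least squares. A minimal, generic illustration of that idea is classical "sketch-and-solve" with a Gaussian projection (one of the two embeddings that entry names); the sketch size `m` below is an illustrative assumption, and this is not the specific accelerated algorithm analysed in that paper.

```python
import numpy as np

def sketch_and_solve(A, b, m, seed=0):
    """Approximate min_x ||A x - b||_2 via a Gaussian sketch.

    Compresses the n-row problem to an m-row one (m << n) and solves
    that instead; generic sketch-and-solve, shown only for illustration.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # m x n Gaussian embedding, scaled so E[S^T S] = I.
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    # Solve the much smaller m x d least-squares problem.
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x
```

When `m` is a modest multiple of the column dimension, the sketched solution's residual is within a small factor of the optimal residual with high probability, which is why such embeddings underlie fast randomized least-squares solvers.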
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.