An Adaptive Incremental Gradient Method With Support for Non-Euclidean Norms
- URL: http://arxiv.org/abs/2205.02273v1
- Date: Thu, 28 Apr 2022 09:43:07 GMT
- Title: An Adaptive Incremental Gradient Method With Support for Non-Euclidean Norms
- Authors: Binghui Xie, Chenhan Jin, Kaiwen Zhou, James Cheng, Wei Meng
- Abstract summary: We propose and analyze several novel adaptive variants of the popular SAGA algorithm.
We establish its convergence guarantees under general settings.
We improve the analysis of SAGA to support non-Euclidean norms.
- Score: 19.41328109094503
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic variance reduced methods have shown strong performance in solving
finite-sum problems. However, these methods usually require the users to
manually tune the step-size, which is time-consuming or even infeasible for
some large-scale optimization tasks. To overcome the problem, we propose and
analyze several novel adaptive variants of the popular SAGA algorithm.
Finally, we design a variant of the Barzilai-Borwein step-size that is tailored
for the incremental gradient method to ensure memory efficiency and fast
convergence. We establish its convergence guarantees under general settings
that allow non-Euclidean norms in the definition of smoothness and the
composite objectives, which cover a broad range of applications in machine
learning. We improve the analysis of SAGA to support non-Euclidean norms, which
fills the void of existing work. Numerical experiments on standard datasets
demonstrate a competitive performance of the proposed algorithm compared with
existing variance-reduced methods and their adaptive variants.
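To make the recipe above concrete, the sketch below combines SAGA's stored gradient table with an epoch-wise Barzilai-Borwein (BB) step-size for a plain finite-sum problem min_x (1/n) * sum_i f_i(x). It is only an illustrative reconstruction under Euclidean geometry: the particular BB rule, its positivity safeguard, and all names here (saga_bb, grad_i, eta0) are assumptions rather than the authors' exact algorithm, and the composite term and non-Euclidean machinery analyzed in the paper are omitted.

    import numpy as np

    def saga_bb(grad_i, x0, n, n_epochs=20, eta0=1e-2, rng=None):
        """SAGA with an epoch-wise Barzilai-Borwein step-size (illustrative sketch).

        grad_i(i, x) must return the gradient of the i-th component function at x.
        """
        rng = np.random.default_rng() if rng is None else rng
        x = np.asarray(x0, dtype=float).copy()
        table = np.array([grad_i(i, x) for i in range(n)])   # stored component gradients
        g_avg = table.mean(axis=0)                           # running average of the table
        eta = eta0
        x_prev, g_prev = x.copy(), g_avg.copy()
        for _ in range(n_epochs):
            for _ in range(n):
                i = rng.integers(n)
                g_i = grad_i(i, x)
                v = g_i - table[i] + g_avg                   # SAGA variance-reduced estimate
                g_avg = g_avg + (g_i - table[i]) / n         # keep the average in sync
                table[i] = g_i
                x = x - eta * v                              # plain (Euclidean) step
            # BB1 rule computed from epoch snapshots of the iterate and the
            # gradient-table average, so no full-gradient pass is needed.
            s, y = x - x_prev, g_avg - g_prev
            if s @ y > 1e-12:                                # safeguard: keep eta positive
                eta = (s @ s) / (s @ y)
            x_prev, g_prev = x.copy(), g_avg.copy()
        return x

For example, for a least-squares objective one could pass grad_i = lambda i, x: A[i] * (A[i] @ x - b[i]) with n = len(b); the epoch-level snapshots keep the extra memory needed for the BB rule at two d-dimensional vectors.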
Related papers
- Adaptive multi-gradient methods for quasiconvex vector optimization and applications to multi-task learning [1.03590082373586]
We present an adaptive step-size method, which does not include line-search techniques, for solving a wide class of quasiconvex multiobjective programming problems.
We prove convergence of the method under modest assumptions.
We apply the proposed technique to several multi-task learning experiments to show its efficacy for large-scale challenges.
arXiv Detail & Related papers (2024-02-09T07:20:14Z)
- Adaptive Stochastic Optimisation of Nonconvex Composite Objectives [2.1700203922407493]
We propose and analyse a family of generalised composite mirror descent algorithms.
With adaptive step sizes, the proposed algorithms converge without requiring prior knowledge of the problem.
We exploit the low-dimensional structure of the decision sets for high-dimensional problems.
arXiv Detail & Related papers (2022-11-21T18:31:43Z)
- An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel accelerated doubly stochastic gradient descent (ADSGD) method for sparsity regularized loss minimization problems.
We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
arXiv Detail & Related papers (2022-08-11T22:27:22Z)
- Adaptive Zeroth-Order Optimisation of Nonconvex Composite Objectives [1.7640556247739623]
We analyze algorithms for zeroth-order optimisation of nonconvex composite objectives, focusing on the dependence on dimensionality.
This is achieved by exploiting the low-dimensional structure of the decision set using the mirror descent method with an entropy-like function.
To improve the gradient estimation, we replace the classic Gaussian smoothing with a sampling method based on the Rademacher distribution and show that the mini-batch method copes with non-Euclidean geometry; a toy two-point Rademacher estimator is sketched after this list.
arXiv Detail & Related papers (2022-08-09T07:36:25Z)
- SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients [99.13839450032408]
It is desirable to design a universal framework for adaptive algorithms to solve general problems.
In particular, our novel framework provides convergence analysis support for adaptive methods under the nonconvex setting.
arXiv Detail & Related papers (2021-06-15T15:16:28Z)
- AI-SARAH: Adaptive and Implicit Stochastic Recursive Gradient Methods [7.486132958737807]
We present an adaptive variance reduced method with an implicit approach for adaptivity.
We provide convergence guarantees for finite-sum minimization problems and show that faster convergence than SARAH can be achieved if the local geometry permits.
This algorithm implicitly computes step-size and efficiently estimates local Lipschitz smoothness of functions.
arXiv Detail & Related papers (2021-02-19T01:17:15Z)
- Balancing Rates and Variance via Adaptive Batch-Size for Stochastic Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that an attenuating step-size is required for exact convergence against the fact that a constant step-size learns faster, but only up to an error neighborhood.
Rather than fixing the minibatch size and the step-size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z)
- Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with a mini-batch size of $1$, constant first- and second-order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- The Strength of Nesterov's Extrapolation in the Individual Convergence of Nonsmooth Optimization [0.0]
We prove that Nesterov's extrapolation has the strength to make the individual convergence of gradient descent methods optimal for nonsmooth problems.
We give an extension of the derived algorithms to solve regularized learning tasks with nonsmooth losses in stochastic settings.
Our method is applicable as an efficient tool for solving large-scale $\ell_1$-regularized hinge-loss learning problems.
arXiv Detail & Related papers (2020-06-08T03:35:41Z)
- Adaptivity of Stochastic Gradient Methods for Nonconvex Optimization [71.03797261151605]
Adaptivity is an important yet under-studied property in modern optimization theory.
Our algorithm is proved to simultaneously achieve the best-available convergence for non-PL objectives and outperform existing algorithms for PL objectives.
arXiv Detail & Related papers (2020-02-13T05:42:27Z)
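The zeroth-order entry above mentions replacing Gaussian smoothing with Rademacher directions in a mini-batch gradient estimate. Purely as an illustration of that kind of estimator (the function name, batch size, and smoothing parameter below are assumptions, not that paper's exact construction), a two-point Rademacher estimator can be written as:

    import numpy as np

    def rademacher_two_point_grad(f, x, mu=1e-4, batch=16, rng=None):
        """Mini-batch two-point zeroth-order gradient estimate using Rademacher directions."""
        rng = np.random.default_rng() if rng is None else rng
        d = x.shape[0]
        g = np.zeros(d)
        for _ in range(batch):
            u = rng.integers(0, 2, size=d) * 2.0 - 1.0   # random sign vector in {-1, +1}^d
            g += (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u
        return g / batch

Because each direction u has unit infinity-norm, such an estimate fits naturally into mirror descent updates over non-Euclidean decision sets, which is the setting that entry discusses.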
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.