Accelerated nonlinear primal-dual hybrid gradient algorithms with
applications to machine learning
- URL: http://arxiv.org/abs/2109.12222v1
- Date: Fri, 24 Sep 2021 22:37:10 GMT
- Title: Accelerated nonlinear primal-dual hybrid gradient algorithms with
applications to machine learning
- Authors: J\'er\^ome Darbon and Gabriel Provencher Langlois
- Abstract summary: primal-dual hybrid gradient (PDHG) is a first-order method that splits convex optimization problems with saddle-point structure into smaller subproblems.
PDHG requires stepsize parameters fine-tuned for the problem at hand.
We introduce accelerated nonlinear variants of the PDHG algorithm that can achieve, for a broad class of optimization problems relevant to machine learning.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The primal-dual hybrid gradient (PDHG) algorithm is a first-order method that
splits convex optimization problems with saddle-point structure into smaller
subproblems. Those subproblems, unlike those obtained from most other splitting
methods, can generally be solved efficiently because they involve simple
operations such as matrix-vector multiplications or proximal mappings that are
easy to evaluate. In order to work fast, however, the PDHG algorithm requires
stepsize parameters fine-tuned for the problem at hand. Unfortunately, the
stepsize parameters must often be estimated from quantities that are
prohibitively expensive to compute for large-scale optimization problems, such
as those in machine learning. In this paper, we introduce accelerated nonlinear
variants of the PDHG algorithm that can achieve, for a broad class of
optimization problems relevant to machine learning, an optimal rate of
convergence with stepsize parameters that are simple to compute. We prove
rigorous convergence results, including for problems posed on
infinite-dimensional reflexive Banach spaces. We also provide practical
implementations of accelerated nonlinear PDHG algorithms for solving several
regression tasks in machine learning, including support vector machines without
offset, kernel ridge regression, elastic net regularized linear regression, and
the least absolute shrinkage selection operator.
Related papers
- Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - A simple uniformly optimal method without line search for convex optimization [9.280355951055865]
We show that line search is superfluous in attaining the optimal rate of convergence for solving a convex optimization problem whose parameters are not given a priori.
We present a novel accelerated gradient descent type algorithm called AC-FGM that can achieve an optimal $mathcalO (1/k2)$ rate of convergence for smooth convex optimization.
arXiv Detail & Related papers (2023-10-16T05:26:03Z) - Fast Screening Rules for Optimal Design via Quadratic Lasso
Reformulation [0.135975510645475]
In this work, we derive safe screening rules that can be used to discard inessential samples.
The new tests are much faster to compute, especially for problems involving a parameter space of high dimension.
We show how an existing homotopy algorithm to compute the regularization path of the lasso method can be reparametrized with respect to the squared $ell_$-penalty.
arXiv Detail & Related papers (2023-10-13T08:10:46Z) - Accelerating Cutting-Plane Algorithms via Reinforcement Learning
Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonIBS optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints [9.301728976515255]
This article provides new practical and theoretical developments for the landing algorithm.
First, the method is extended to the Stiefel manifold.
We also consider variance reduction algorithms when the cost function is an average of many functions.
arXiv Detail & Related papers (2023-03-29T07:36:54Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Continuation Path with Linear Convergence Rate [18.405645120971496]
Path-following algorithms are frequently used in composite optimization problems where a series of subproblems are solved sequentially.
We present a primal dual analysis of the path-following algorithm as well as determining how accurately each subproblem should be solved to guarantee a linear convergence rate on a target problem.
Considering optimization with a sparsity-inducing penalty, we analyze the change of the active sets with respect to the regularization parameter.
arXiv Detail & Related papers (2021-12-09T18:42:13Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that attenuating step-size is required for exact convergence with the fact that constant step-size learns faster in time up to an error.
Rather than fixing the minibatch the step-size at the outset, we propose to allow parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z) - Solution Path Algorithm for Twin Multi-class Support Vector Machine [6.97711662470035]
The paper is devoted to the fast regularization parameter tuning algorithm for the twin multi-class support vector machine.
A new sample dataset division method is adopted and the Lagrangian multipliers are proved to be piecewise linear.
The proposed method can achieve good classification performance with reducing the computational cost of grid search method from exponential level to the constant level.
arXiv Detail & Related papers (2020-05-30T14:05:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.