Faster Rates for the Frank-Wolfe Algorithm Using Jacobi Polynomials
- URL: http://arxiv.org/abs/2110.09738v1
- Date: Tue, 19 Oct 2021 05:11:23 GMT
- Title: Faster Rates for the Frank-Wolfe Algorithm Using Jacobi Polynomials
- Authors: Robin Francis and Sundeep Prabhakar Chepuri
- Abstract summary: The Frank-Wolfe algorithm (FW) is a popular projection-free alternative for solving large-scale constrained optimization problems.
FW suffers from a sublinear convergence rate when minimizing a smooth convex function over a compact convex set.
- Score: 14.112444998191698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Frank-Wolfe algorithm (FW) is a popular projection-free alternative for
solving large-scale constrained optimization problems. However, the FW
algorithm suffers from a sublinear convergence rate when minimizing a smooth
convex function over a compact convex set. Thus, exploring techniques that
yield a faster convergence rate becomes crucial. A classic approach to obtain
faster rates is to combine previous iterates to obtain the next iterate. In
this work, we extend this approach to the FW setting and show that the optimal
way to combine the past iterates is using a set of orthogonal Jacobi
polynomials. We also propose a polynomial-based acceleration technique, referred
to as Jacobi polynomial accelerated FW, which combines the current iterate with
the past iterates using combining weights related to the Jacobi recursion. By
carefully choosing the parameters of the Jacobi polynomials, we obtain a faster
sublinear convergence rate. We provide numerical experiments on real datasets
to demonstrate the efficacy of the proposed algorithm.
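For intuition, the sketch below shows vanilla Frank-Wolfe with a momentum-style convex combination of the current FW step and the previous iterate. It is a minimal illustration, not the paper's method: the constant mixing weight `mu` is a placeholder for the combining weights the paper derives from the three-term Jacobi recursion (the abstract does not specify them), and the least-squares objective over the unit simplex is an assumed example problem.

```python
# Minimal sketch: Frank-Wolfe with a convex combination of the current
# FW step and the previous iterate. The constant weight `mu` is a
# placeholder for the Jacobi-recursion combining weights of the paper.
import numpy as np

def lmo_simplex(grad):
    """Linear minimization oracle over the unit simplex:
    argmin_{s in simplex} <grad, s> is the vertex with the most
    negative gradient coordinate."""
    s = np.zeros_like(grad)
    s[np.argmin(grad)] = 1.0
    return s

def accelerated_fw(grad_f, x0, n_iters=500, mu=0.2):
    x, x_prev = x0.copy(), x0.copy()
    for t in range(n_iters):
        gamma = 2.0 / (t + 2)               # standard FW step size
        s = lmo_simplex(grad_f(x))          # one LMO call, no projection
        x_fw = (1 - gamma) * x + gamma * s  # vanilla FW update
        # Mix in the past iterate; a convex combination of feasible
        # points stays feasible, so no projection is needed here either.
        x, x_prev = (1 - mu) * x_fw + mu * x_prev, x
    return x

# Assumed example: least squares over the simplex, min 0.5*||Ax - b||^2.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 5)), rng.standard_normal(20)
x_hat = accelerated_fw(lambda x: A.T @ (A @ x - b), np.ones(5) / 5)
print(x_hat, x_hat.sum())  # feasible: nonnegative, sums to 1
```

Because each update is a convex combination of feasible points, the iterates remain in the constraint set, preserving the projection-free character of FW.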
Related papers
- Closing the Computational-Query Depth Gap in Parallel Stochastic Convex Optimization [26.36906884097317]
We develop a new parallel algorithm for minimizing Lipschitz, convex functions with a subgradient oracle.
Our result closes a gap between the best-known query depth and the best-known computational depth of parallel algorithms.
arXiv Detail & Related papers (2024-06-11T15:41:48Z)
- An Efficient Algorithm for Clustered Multi-Task Compressive Sensing [60.70532293880842]
Clustered multi-task compressive sensing is a hierarchical model that solves multiple compressive sensing tasks.
The existing inference algorithm for this model is computationally expensive and does not scale well in high dimensions.
We propose a new algorithm that substantially accelerates model inference by avoiding the explicit computation of the covariance matrices involved.
arXiv Detail & Related papers (2023-09-30T15:57:14Z)
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- Ordering for Non-Replacement SGD [7.11967773739707]
We seek to find an ordering that can improve the convergence rates for the non-replacement form of the algorithm.
We develop optimal orderings for constant and decreasing step sizes for strongly convex and convex functions.
In addition, we are able to combine the ordering with mini-batching and further apply it to more complex neural networks.
arXiv Detail & Related papers (2023-06-28T00:46:58Z)
- Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features [79.39965431772626]
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints.
We present two new variants of the algorithm for finite-sum minimization.
arXiv Detail & Related papers (2023-04-23T20:05:09Z)
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth components.
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
- The Curse of Unrolling: Rate of Differentiating Through Optimization [35.327233435055305]
Unrolled differentiation approximates the solution using an iterative solver and differentiates it through the computational path.
We show that we can either 1) choose a large learning rate, leading to fast convergence but accepting that the algorithm may have an arbitrarily long burn-in phase, or 2) choose a smaller learning rate, leading to an immediate but slower convergence.
arXiv Detail & Related papers (2022-09-27T09:27:29Z)
- SHINE: SHaring the INverse Estimate from the forward pass for bi-level optimization and implicit models [15.541264326378366]
In recent years, implicit deep learning has emerged as a method to increase the depth of deep neural networks.
The training is performed as a bi-level problem, and its computational complexity is partially driven by the iterative inversion of a huge Jacobian matrix.
We propose a novel strategy to tackle this computational bottleneck from which many bi-level problems suffer.
arXiv Detail & Related papers (2021-06-01T15:07:34Z)
- Combinatorial Black-Box Optimization with Expert Advice [28.45580493454314]
We consider the problem of black-box function optimization over the hypercube.
We propose a computationally efficient model learning algorithm based on multilinear polynomials and exponential weight updates.
arXiv Detail & Related papers (2020-06-06T20:26:19Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best known complexity for solving least-squares problems, with no dependence on the condition number.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.