$k$FW: A Frank-Wolfe style algorithm with stronger subproblem oracles
- URL: http://arxiv.org/abs/2006.16142v2
- Date: Mon, 15 Nov 2021 19:34:49 GMT
- Title: $k$FW: A Frank-Wolfe style algorithm with stronger subproblem oracles
- Authors: Lijun Ding, Jicong Fan, and Madeleine Udell
- Abstract summary: This paper proposes a new variant of Frank-Wolfe (FW) called $k$FW.
Standard FW suffers from slow convergence: iterates often zig-zag as update directions oscillate around extreme points of a constraint set.
The new variant, $k$FW, overcomes this problem by using two stronger subproblem oracles in each iteration.
- Score: 34.53447188447357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a new variant of Frank-Wolfe (FW), called $k$FW. Standard
FW suffers from slow convergence: iterates often zig-zag as update directions
oscillate around extreme points of the constraint set. The new variant, $k$FW,
overcomes this problem by using two stronger subproblem oracles in each
iteration. The first is a $k$ linear optimization oracle ($k$LOO) that computes
the $k$ best update directions (rather than just one). The second is a $k$
direction search ($k$DS) that minimizes the objective over a constraint set
represented by the $k$ best update directions and the previous iterate. When
the problem solution admits a sparse representation, both oracles are easy to
compute, and $k$FW converges quickly for smooth convex objectives and several
interesting constraint sets: $k$FW achieves finite
$\frac{4L_f^3D^4}{\gamma\delta^2}$ convergence on polytopes and group norm
balls, and linear convergence on spectrahedra and nuclear norm balls. Numerical
experiments validate the effectiveness of $k$FW and demonstrate an
order-of-magnitude speedup over existing approaches.
Related papers
- Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
We propose a novel class of Nesterov's accelerated forward-reflected-based methods with variance reduction to solve root-finding problems.
Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems.
arXiv Detail & Related papers (2024-06-04T15:23:29Z) - Extending the Design Space of Graph Neural Networks by Rethinking
Folklore Weisfeiler-Lehman [66.23316415757456]
Message passing neural networks (MPNNs) have emerged as the most popular framework of graph neural networks (GNNs) in recent years.
However, their expressive power is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test.
We propose an extension, $(k,t)$-FWL, which considers any equivariant set as neighbors instead of all nodes.
N$2$-GNN achieves record-breaking results on ZINC-Subset (0.059), outperforming previous SOTA results by 10.6%.
arXiv Detail & Related papers (2023-06-05T21:35:32Z) - Improved Dynamic Regret for Online Frank-Wolfe [54.690867216880356]
We investigate the dynamic regret of online Frank-Wolfe (OFW), which is an efficient projection-free algorithm for online convex optimization.
In this paper, we derive improved dynamic regret bounds for OFW by extending the fast convergence rates of FW from offline optimization to online optimization.
arXiv Detail & Related papers (2023-02-11T07:19:51Z) - Extra-Newton: A First Approach to Noise-Adaptive Accelerated
Second-Order Methods [57.050204432302195]
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions.
Our algorithm achieves $O(sigma / sqrtT)$ convergence when the oracle feedback is with variance $sigma2$, and improves its convergence to $O( 1 / T3)$ with deterministic oracles.
arXiv Detail & Related papers (2022-11-03T14:12:51Z) - Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization [7.197233473373693]
We consider the problem of minimizing the sum of two convex functions.
One has Lipschitz-continuous gradients, and can be accessed via oracles, whereas the other is "simple"
We show that one can achieve an $epsilon$ primaldual gap (in expectation) in $tildeO (1/ sqrtepsilon)$ iterations.
arXiv Detail & Related papers (2022-05-25T13:01:09Z) - A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
We consider the problem of finding a saddle point for the convex-concave objective $min_x max_y f(x) + langle Ax, yrangle - g*(y)$, where $f$ is a convex function with locally Lipschitz gradient and $g$ is convex and possibly non-smooth.
We propose an adaptive version of the Condat-Vu algorithm, which alternates between primal gradient steps and dual steps.
arXiv Detail & Related papers (2021-10-28T14:19:30Z) - Approximate Frank-Wolfe Algorithms over Graph-structured Support Sets [27.913569257545554]
Two popular approximation assumptions (textitadditive and textitmultiplicative gap errors) are not valid for our problem.
A new textitapproximate dual oracle (DMO) is proposed, which approximates the inner product rather than the gap.
Our empirical results suggest that even these improved bounds are pessimistic, with significant improvement in real-world images with graph-structured sparsity.
arXiv Detail & Related papers (2021-06-29T19:39:43Z) - Projection Efficient Subgradient Method and Optimal Nonsmooth
Frank-Wolfe Method [54.93433440034386]
We find a feasible $epsilon$-suboptimal solution using only $O(epsilon-1)$ PO calls and optimal $O(epsilon-2)$ FO calls.
Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.
arXiv Detail & Related papers (2020-10-05T08:16:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.