$k$FW: A Frank-Wolfe style algorithm with stronger subproblem oracles
- URL: http://arxiv.org/abs/2006.16142v2
- Date: Mon, 15 Nov 2021 19:34:49 GMT
- Title: $k$FW: A Frank-Wolfe style algorithm with stronger subproblem oracles
- Authors: Lijun Ding, Jicong Fan, and Madeleine Udell
- Abstract summary: This paper proposes a new variant of Frank-Wolfe (FW) called $k$FW.
Standard FW suffers from slow convergence: iterates often zig-zag as update directions oscillate around extreme points of a constraint set.
The new variant, $k$FW, overcomes this problem by using two stronger subproblem oracles in each iteration.
- Score: 34.53447188447357
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes a new variant of Frank-Wolfe (FW), called $k$FW. Standard
FW suffers from slow convergence: iterates often zig-zag as update directions
oscillate around extreme points of the constraint set. The new variant, $k$FW,
overcomes this problem by using two stronger subproblem oracles in each
iteration. The first is a $k$ linear optimization oracle ($k$LOO) that computes
the $k$ best update directions (rather than just one). The second is a $k$
direction search ($k$DS) that minimizes the objective over a constraint set
represented by the $k$ best update directions and the previous iterate. When
the problem solution admits a sparse representation, both oracles are easy to
compute, and $k$FW converges quickly for smooth convex objectives and several
interesting constraint sets: $k$FW converges in at most
$\frac{4L_f^3D^4}{\gamma\delta^2}$ iterations on polytopes and group norm
balls, and converges linearly on spectrahedra and nuclear norm balls. Numerical
experiments validate the effectiveness of $k$FW and demonstrate an
order-of-magnitude speedup over existing approaches.
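To make the two oracles concrete, here is a minimal Python sketch of a $k$FW-style loop for a least-squares objective over the probability simplex. The oracle implementations are illustrative assumptions, not the authors' code: on the simplex the $k$LOO reduces to a top-$k$ sort of the gradient, and the $k$DS is delegated to a generic SLSQP solve over convex weights.

```python
# Minimal kFW-style sketch on the probability simplex, assuming
# f(x) = 0.5 * ||Ax - b||^2. Oracle choices below are illustrative.
import numpy as np
from scipy.optimize import minimize

def k_loo(grad, k):
    """kLOO on the simplex: indices of the k vertices e_i minimizing <grad, e_i>."""
    return np.argsort(grad)[:k]

def k_ds(f, atoms):
    """kDS: minimize f over the convex hull of the given atoms (columns)."""
    m = atoms.shape[1]
    res = minimize(lambda lam: f(atoms @ lam),
                   np.full(m, 1.0 / m),                      # uniform start
                   bounds=[(0.0, 1.0)] * m,
                   constraints=({'type': 'eq',
                                 'fun': lambda lam: lam.sum() - 1.0},),
                   method='SLSQP')
    return atoms @ res.x

def kfw(f, grad_f, n, k=5, iters=50):
    x = np.full(n, 1.0 / n)                                  # barycenter start
    eye = np.eye(n)
    for _ in range(iters):
        idx = k_loo(grad_f(x), k)                            # k best update directions
        atoms = np.column_stack([eye[:, idx], x])            # plus the previous iterate
        x = k_ds(f, atoms)
    return x

# Usage on a random least-squares instance.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 50)), rng.standard_normal(20)
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad_f = lambda x: A.T @ (A @ x - b)
x_hat = kfw(f, grad_f, n=50)
```

Because the $k$DS minimizes over a hull that contains the previous iterate, each iteration is monotone, which is what suppresses the zig-zagging of plain FW.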
Related papers
- Accelerated Variance-Reduced Forward-Reflected Methods for Root-Finding Problems [8.0153031008486]
We propose a novel class of Nesterov's accelerated forward-reflected-based methods with variance reduction to solve root-finding problems.
Our algorithm is single-loop and leverages a new family of unbiased variance-reduced estimators specifically designed for root-finding problems.
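For context, the sketch below shows the classical forward-reflected step of Malitsky and Tam for a root-finding problem $F(z)=0$ with a monotone, $L$-Lipschitz operator; the paper layers Nesterov acceleration and variance-reduced estimators on top of this template, which this sketch does not attempt.

```python
# Plain forward-reflected iteration z_{t+1} = z_t - eta * (2 F(z_t) - F(z_{t-1}))
# for monotone root-finding F(z) = 0; a baseline sketch only, not the paper's
# accelerated variance-reduced method.
import numpy as np

def forward_reflected(F, z0, eta, iters=1000):
    z, F_prev = z0.copy(), F(z0)          # initialize with z_{-1} = z_0
    for _ in range(iters):
        F_curr = F(z)
        z = z - eta * (2.0 * F_curr - F_prev)   # reflected operator evaluation
        F_prev = F_curr
    return z

# Usage: F(z) = Mz - c with M = PSD + skew part, hence monotone.
rng = np.random.default_rng(1)
B = rng.standard_normal((10, 10))
M = B @ B.T / 10 + (B - B.T)
c = rng.standard_normal(10)
F = lambda z: M @ z - c
L = np.linalg.norm(M, 2)                  # Lipschitz constant of F
z_star = forward_reflected(F, np.zeros(10), eta=1.0 / (3.0 * L))
```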
arXiv Detail & Related papers (2024-06-04T15:23:29Z)
- Extending the Design Space of Graph Neural Networks by Rethinking Folklore Weisfeiler-Lehman [66.23316415757456]
Message passing neural networks (MPNNs) have emerged as the most popular framework of graph neural networks (GNNs) in recent years.
However, their expressive power is limited by the 1-dimensional Weisfeiler-Lehman (1-WL) test.
We propose an extension, $(k,t)$-FWL, which considers any equivariant set as neighbors instead of all nodes.
$N^2$-GNN achieves record-breaking results on ZINC-Subset (0.059), outperforming previous SOTA results by 10.6%.
arXiv Detail & Related papers (2023-06-05T21:35:32Z)
- Improved Dynamic Regret for Online Frank-Wolfe [54.690867216880356]
We investigate the dynamic regret of online Frank-Wolfe (OFW), which is an efficient projection-free algorithm for online convex optimization.
In this paper, we derive improved dynamic regret bounds for OFW by extending the fast convergence rates of FW from offline optimization to online optimization.
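As a reference point, here is the basic online Frank-Wolfe update this line of work builds on: at round $t$ the learner takes one linear-optimization step against the gradient of the just-revealed loss. The simplex domain and the $\sigma_t = 1/\sqrt{t}$ step size are illustrative choices, not the paper's tuned schedule.

```python
# One-LOO-call-per-round online Frank-Wolfe over the probability simplex;
# the step-size schedule here is an illustrative choice, not the paper's.
import numpy as np

def ofw_round(x, grad, t):
    v = np.eye(len(x))[np.argmin(grad)]    # LOO: best simplex vertex for this gradient
    sigma = 1.0 / np.sqrt(t)               # illustrative step size
    return (1.0 - sigma) * x + sigma * v   # convex combination stays feasible

# Usage: play T rounds against linear losses <c_t, x>.
rng = np.random.default_rng(2)
n, T = 10, 100
x = np.full(n, 1.0 / n)
for t in range(1, T + 1):
    c_t = rng.standard_normal(n)           # adversary reveals the loss <c_t, .>
    x = ofw_round(x, c_t, t)               # gradient of a linear loss is c_t
```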
arXiv Detail & Related papers (2023-02-11T07:19:51Z)
- Extra-Newton: A First Approach to Noise-Adaptive Accelerated Second-Order Methods [57.050204432302195]
This work proposes a universal and adaptive second-order method for minimizing second-order smooth, convex functions.
Our algorithm achieves $O(\sigma/\sqrt{T})$ convergence when the oracle feedback is stochastic with variance $\sigma^2$, and improves its convergence to $O(1/T^3)$ with deterministic oracles.
arXiv Detail & Related papers (2022-11-03T14:12:51Z)
- Fast Stochastic Composite Minimization and an Accelerated Frank-Wolfe Algorithm under Parallelization [9.166498349629157]
We consider the problem of minimizing the sum of two convex functions.
One has Lipschitz-continuous gradients and can be accessed via oracles, whereas the other is "simple".
We show that one can achieve an $\epsilon$ primal-dual gap (in expectation) in $\tilde{O}(1/\sqrt{\epsilon})$ iterations.
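To fix the composite template, the sketch below shows the standard proximal-gradient (ISTA) baseline for $\min_x h(x) + r(x)$ with $h$ smooth and $r$ "simple" (prox-friendly), here $r = \lambda\|\cdot\|_1$; the paper's accelerated, parallelized stochastic method is more elaborate than this baseline.

```python
# Standard proximal-gradient baseline for min_x h(x) + r(x), with h smooth and
# r "simple"; here h(x) = 0.5*||Ax - b||^2 and r = lam*||.||_1. Sketch only.
import numpy as np

def soft_threshold(v, tau):
    """Prox of tau*||.||_1: the 'simple' part is handled in closed form."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_gradient(A, b, lam, iters=500):
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # 1/L for h(x) = 0.5*||Ax - b||^2
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        grad = A.T @ (A @ x - b)                # oracle access to the smooth part
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(3)
A, b = rng.standard_normal((30, 60)), rng.standard_normal(30)
x_hat = prox_gradient(A, b, lam=0.1)
```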
arXiv Detail & Related papers (2022-05-25T13:01:09Z)
- A first-order primal-dual method with adaptivity to local smoothness [64.62056765216386]
We consider the problem of finding a saddle point for the convex-concave objective $\min_x \max_y f(x) + \langle Ax, y\rangle - g^*(y)$, where $f$ is a convex function with locally Lipschitz gradient and $g$ is convex and possibly non-smooth.
We propose an adaptive version of the Condat-Vu algorithm, which alternates between primal gradient steps and dual steps.
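To ground the template, here is the basic (non-adaptive) Condat-Vu iteration for this saddle-point problem, with the illustrative choice $g = \|\cdot\|_1$ so that $\mathrm{prox}_{\sigma g^*}$ is projection onto the $\ell_\infty$ ball; the paper's adaptive step-size rule is what this sketch omits.

```python
# Basic (non-adaptive) Condat-Vu for min_x max_y f(x) + <Ax,y> - g*(y), with
# the illustrative choices f(x) = 0.5*||x - c||^2 and g = ||.||_1, so that
# prox_{sigma g*} is projection onto the l_inf ball (a clip). Sketch only.
import numpy as np

def condat_vu(grad_f, A, x0, y0, tau, sigma, iters=500):
    x, y = x0.copy(), y0.copy()
    for _ in range(iters):
        x_new = x - tau * (grad_f(x) + A.T @ y)                    # primal gradient step
        y = np.clip(y + sigma * (A @ (2 * x_new - x)), -1.0, 1.0)  # dual prox step
        x = x_new
    return x, y

# Usage: solves min_x 0.5*||x - c||^2 + ||Ax||_1 via its saddle-point form.
rng = np.random.default_rng(4)
A = rng.standard_normal((15, 25))
c = rng.standard_normal(25)
grad_f = lambda x: x - c
Lnorm = np.linalg.norm(A, 2)
x_sol, y_sol = condat_vu(grad_f, A, np.zeros(25), np.zeros(15),
                         tau=0.5 / Lnorm, sigma=0.5 / Lnorm)
```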
arXiv Detail & Related papers (2021-10-28T14:19:30Z)
- Approximate Frank-Wolfe Algorithms over Graph-structured Support Sets [27.913569257545554]
Two popular approximation assumptions (additive and multiplicative gap errors) are not valid for our problem.
A new approximate dual maximization oracle (DMO) is proposed, which approximates the inner product rather than the gap.
Our empirical results suggest that even these improved bounds are pessimistic, with significant improvement in real-world images with graph-structured sparsity.
arXiv Detail & Related papers (2021-06-29T19:39:43Z)
- Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method [54.93433440034386]
We find a feasible $\epsilon$-suboptimal solution using only $O(\epsilon^{-1})$ PO calls and optimal $O(\epsilon^{-2})$ FO calls.
Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.
arXiv Detail & Related papers (2020-10-05T08:16:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.