How Does Momentum Help Frank Wolfe?
- URL: http://arxiv.org/abs/2006.11116v1
- Date: Fri, 19 Jun 2020 13:06:00 GMT
- Title: How Does Momentum Help Frank Wolfe?
- Authors: Bingcong Li, Mario Coutino, Georgios B. Giannakis, Geert Leus
- Abstract summary: We unveil the connections between Frank Wolfe (FW) type algorithms and the momentum in Accelerated Gradient Methods (AGM).
On the negative side, these connections illustrate why momentum is unlikely to be effective for FW type algorithms.
The encouraging message behind this link, on the other hand, is that momentum is useful for FW on a class of problems.
- Score: 81.95857252228537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We unveil the connections between Frank Wolfe (FW) type algorithms and the
momentum in Accelerated Gradient Methods (AGM). On the negative side, these
connections illustrate why momentum is unlikely to be effective for FW type
algorithms. The encouraging message behind this link, on the other hand, is
that momentum is useful for FW on a class of problems. In particular, we prove
that a momentum variant of FW, that we term accelerated Frank Wolfe (AFW),
converges at a faster rate $\tilde{\cal O}(\frac{1}{k^2})$ on certain
constraint sets, despite the same ${\cal O}(\frac{1}{k})$ rate as FW in the
general case. Given the possible acceleration of AFW at almost no extra cost, it is
thus a competitive alternative to FW. Numerical experiments on benchmarked
machine learning tasks further validate our theoretical findings.
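Since the abstract presents AFW as FW plus momentum at almost no extra cost, a minimal sketch may help fix ideas. The code below assumes the common momentum-guided FW template, in which the linear minimization oracle (LMO) receives a running average of past gradients instead of the current gradient alone; the $\ell_1$-ball oracle, the $\frac{2}{k+2}$ step size, and the names `afw`/`lmo_l1` are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def lmo_l1(g, radius=1.0):
    """LMO over the l1 ball: argmin_{||v||_1 <= radius} <g, v> puts all
    mass on the largest-magnitude coordinate of g, with opposite sign."""
    i = np.argmax(np.abs(g))
    v = np.zeros_like(g)
    v[i] = -radius * np.sign(g[i])
    return v

def afw(grad, x0, lmo, num_iters=100):
    """Momentum-guided FW in the spirit of AFW: the LMO acts on a
    running average of past gradients rather than the latest one."""
    x = x0.astype(float)
    g = grad(x)
    for k in range(num_iters):
        delta = 2.0 / (k + 2)                  # vanilla FW step size
        g = (1 - delta) * g + delta * grad(x)  # momentum: gradient averaging
        v = lmo(g)                             # FW vertex from averaged gradient
        x = (1 - delta) * x + delta * v        # convex combination stays feasible
    return x

# Toy usage: l1-constrained least squares, min_x 0.5 * ||A x - b||^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
b = A @ rng.standard_normal(20)
x_hat = afw(lambda x: A.T @ (A @ x - b), np.zeros(20), lmo_l1)
```

Note the only change relative to vanilla FW is the gradient-averaging line, which is why the per-iteration overhead is negligible.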
Related papers
- Federated Frank-Wolfe Algorithm [7.124736158080938]
We propose a Federated Frank-Wolfe Algorithm (FedFW) for constrained machine learning problems.
FedFW features data privacy, low per-iteration cost, and sparse communication.
We show that FedFW finds a solution within $O(\varepsilon^{-3})$ iterations in the convex setting.
arXiv Detail & Related papers (2024-08-19T15:31:06Z)
- Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features [65.64276393443346]
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints.
We present two new variants of the FW algorithm for stochastic finite-sum minimization.
arXiv Detail & Related papers (2023-04-23T20:05:09Z)
- Efficient $\Phi$-Regret Minimization in Extensive-Form Games via Online Mirror Descent [49.93548413166884]
$\Phi$-Hedge is a generic algorithm capable of learning a large class of equilibria for Normal-Form Games (NFGs).
We show that $\Phi$-Hedge can be directly used to learn Nash Equilibria (zero-sum settings), Normal-Form Coarse Correlated Equilibria (NFCCE), and Extensive-Form Correlated Equilibria (EFCE) in EFGs.
We prove that, in those settings, the $\Phi$-Hedge algorithms are equivalent to standard Online Mirror Descent (OMD) algorithms.
arXiv Detail & Related papers (2022-05-30T17:58:06Z)
- Heavy Ball Momentum for Conditional Gradient [67.58002120548566]
Conditional gradient, a.k.a. Frank Wolfe (FW), algorithms have well-documented merits in machine learning and signal processing applications.
This work deals with heavy ball momentum and its impact on FW.
Numerical results demonstrate the usefulness of heavy ball momentum in FW.
arXiv Detail & Related papers (2021-10-08T16:50:00Z)
- Enhancing Parameter-Free Frank Wolfe with an Extra Subproblem [75.83472151564185]
This work introduces and analyzes a variant of the Frank Wolfe (FW) algorithm termed ExtraFW.
The distinct feature of ExtraFW is the pair of gradients leveraged per iteration, thanks to which the decision variable is updated in a prediction-correction (PC) format.
Compared with other parameter-free FW variants that attain faster rates on the same problems, ExtraFW offers further improved rates and a fine-grained analysis thanks to its PC update (a generic PC step is sketched below).
arXiv Detail & Related papers (2020-12-09T19:47:24Z)
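For intuition, one generic way to realize a prediction-correction step is an extragradient-style FW update: take a tentative step using the current gradient, then redo the step using the gradient at the look-ahead point. The sketch below, with the hypothetical helper name `pc_fw_step` and an assumed step size `eta`, illustrates this idea only; it is not ExtraFW's exact parameter-free update.

```python
def pc_fw_step(x, grad, lmo, eta):
    """One generic prediction-correction (extragradient-style) FW step,
    using two gradient evaluations and two LMO calls per iteration."""
    # Prediction: tentative FW step driven by the gradient at x.
    v_pred = lmo(grad(x))
    x_pred = (1 - eta) * x + eta * v_pred
    # Correction: repeat the step with the gradient at the predicted point.
    v_corr = lmo(grad(x_pred))
    return (1 - eta) * x + eta * v_corr
```

For example, `pc_fw_step(x, grad, lmo_l1, 2.0 / (k + 2))` reuses the `lmo_l1` oracle from the AFW sketch above.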
- Accelerated Stochastic Gradient-free and Projection-free Methods [37.15461647823691]
We propose an accelerated stochastic zeroth-order Frank-Wolfe method (Acc-SZOFW) based on the variance reduction technique of SPIDER.
To relax the large batches required in Acc-SZOFW, we further propose a novel accelerated stochastic zeroth-order Frank-Wolfe method (Acc-SZOFW*) based on a new variance reduction technique of STORM.
arXiv Detail & Related papers (2020-07-16T20:50:15Z)
- $k$FW: A Frank-Wolfe style algorithm with stronger subproblem oracles [34.53447188447357]
This paper proposes a new variant of Frank-Wolfe (FW) called $k$FW.
Standard FW suffers from slow convergence: iterates often zig-zag as update directions oscillate around extreme points of a constraint set.
The new variant, $k$FW, overcomes this problem by using two stronger subproblem oracles in each iteration.
arXiv Detail & Related papers (2020-06-29T16:02:48Z)
- A Newton Frank-Wolfe Method for Constrained Self-Concordant Minimization [60.90222082871258]
We demonstrate how to scalably solve a class of constrained self-concordant minimization problems using linear minimization oracles (LMO) over the constraint set.
We prove that the number of LMO calls of our method is nearly the same as that of the Frank-Wolfe method in the L-smooth case.
arXiv Detail & Related papers (2020-02-17T15:28:31Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.