Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates
and Practical Features
- URL: http://arxiv.org/abs/2304.11737v1
- Date: Sun, 23 Apr 2023 20:05:09 GMT
- Title: Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates
and Practical Features
- Authors: Aleksandr Beznosikov, David Dobre, Gauthier Gidel
- Abstract summary: The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints.
We present two new variants of the FW algorithm for stochastic finite-sum minimization.
- Score: 79.39965431772626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Frank-Wolfe (FW) method is a popular approach for solving optimization
problems with structured constraints that arise in machine learning
applications. In recent years, stochastic versions of FW have gained
popularity, motivated by large datasets for which the computation of the full
gradient is prohibitively expensive. In this paper, we present two new variants
of the FW algorithms for stochastic finite-sum minimization. Our algorithms
have the best convergence guarantees of existing stochastic FW approaches for
both convex and non-convex objective functions. Our methods do not have the
issue of permanently collecting large batches, which is common to many
stochastic projection-free approaches. Moreover, our second approach does not
require either large batches or full deterministic gradients, which is a
typical weakness of many techniques for finite-sum problems. The faster
theoretical rates of our approaches are confirmed experimentally.
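As a rough illustration of the kind of method the abstract describes, below is a minimal sketch of a variance-reduced (SARAH-style) stochastic Frank-Wolfe loop in Python. The `lmo` oracle, the step-size schedule, and the restart logic are illustrative assumptions, not the authors' exact algorithm.

```python
# Minimal sketch of a SARAH-style stochastic Frank-Wolfe loop (illustrative
# only, not the authors' exact algorithm). grad_i(i, x) is assumed to return
# the gradient of the i-th component f_i at x; lmo(g) is assumed to solve
# the linear minimization oracle min_{s in C} <g, s> over the feasible set C.
import numpy as np

def sarah_fw(grad_i, lmo, x0, n, T, restart_every=None, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    # One full-gradient pass to initialize the estimator (a common choice;
    # the paper's second variant avoids full deterministic gradients).
    v = np.mean([grad_i(i, x) for i in range(n)], axis=0)
    for t in range(1, T + 1):
        s = lmo(v)                 # cheap linear step instead of a projection
        gamma = 2.0 / (t + 2)      # classical open-loop FW step size (assumption)
        x_prev, x = x, x + gamma * (s - x)
        i = int(rng.integers(n))   # a single sample per iteration
        # SARAH recursion: biased but variance-reduced gradient estimate.
        v = grad_i(i, x) - grad_i(i, x_prev) + v
        if restart_every and t % restart_every == 0:
            v = np.mean([grad_i(j, x) for j in range(n)], axis=0)
    return x
```

For an l1-ball of radius r, for example, lmo(g) returns -r * sign(g_j) * e_j for the coordinate j maximizing |g_j|, which is far cheaper than a projection; this is what makes FW attractive for such structured constraints.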
Related papers
- Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation
Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
- Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method
for Empirical Risk Minimization [1.4504054468850665]
In the setting of Empirical Risk Minimization (ERM), we present a novel adaptive step-size approach for which we have computational guarantees.
We show that our methods exhibit very significant speed-ups over existing methods on real-world binary classification datasets.
arXiv Detail & Related papers (2022-08-30T00:08:37Z)
- STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization [74.1615979057429]
We investigate stochastic nonconvex optimization problems where the objective is an expectation over smooth loss functions.
Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
arXiv Detail & Related papers (2021-11-01T15:43:36Z)
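For context, the STORM-style estimator referenced in the entry above blends a fresh stochastic gradient with a recursive correction term. A minimal sketch follows, where the fixed `a` and `eta` stand in for the adaptive schedules that STORM+ actually derives from observed gradients (all names here are illustrative assumptions).

```python
# Sketch of a STORM-style momentum update (illustrative; STORM+ sets the
# weight a_t and the learning rate adaptively rather than as constants).
def storm_step(grad, x, x_prev, d_prev, xi, a=0.1, eta=0.01):
    g_new = grad(x, xi)        # stochastic gradient at the current iterate
    g_old = grad(x_prev, xi)   # same random sample, previous iterate
    d = g_new + (1.0 - a) * (d_prev - g_old)  # recursive, variance-reduced estimator
    return x - eta * d, d      # SGD-style step using the estimator d
```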
- Zeroth and First Order Stochastic Frank-Wolfe Algorithms for Constrained Optimization [13.170519806372075]
Problems of convex optimization with two sets of constraints arise frequently in the context of semidefinite programming.
Since projection onto the first set of constraints is difficult, it becomes necessary to explore projection-free algorithms.
The efficacy of the proposed algorithms is tested on relevant applications of sparse matrix estimation, clustering via semidefinite relaxation, and the uniform sparsest cut problem.
arXiv Detail & Related papers (2021-07-14T08:01:30Z)
- Scalable Projection-Free Optimization [7.170441928038049]
As a projection-free algorithm, Frank-Wolfe (FW) has recently received considerable attention in the machine learning community.
We first propose a scalable projection-free optimization method that requires only one sample per iteration.
We then develop a distributed Frank-Wolfe (FW) framework for both convex and nonconvex objective functions.
arXiv Detail & Related papers (2021-05-07T22:27:18Z)
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient distributed optimization, in which machines can independently compute gradients and cooperate.
Our algorithm is based on a new concentration technique, and we bound its sample complexity.
It is very practical: it improves upon the performance of all prior methods, even in the setting where no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
- Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point per step.
Our results are expressed in the form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
- Fast Objective & Duality Gap Convergence for Non-Convex Strongly-Concave
Min-Max Problems with PL Condition [52.08417569774822]
This paper focuses on methods for solving smooth non-convex strongly-concave min-max problems, which have received increasing attention due to their applications in deep learning (e.g., deep AUC maximization).
arXiv Detail & Related papers (2020-06-12T00:32:21Z)
- Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization [30.980431848476925]
We propose an algorithm for constrained smooth finite-sum minimization with generalized linear prediction/structure.
The proposed method is simple to implement, does not require step-size tuning, and has a constant per-iteration cost independent of the dataset size.
We provide an implementation of all considered methods in an open-source package.
arXiv Detail & Related papers (2020-02-27T00:47:21Z)
- Self-Concordant Analysis of Frank-Wolfe Algorithms [3.3598755777055374]
In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function having unbounded curvature.
We use the theory of SC functions to provide a new adaptive step size for FW methods and prove a global convergence rate of O(1/k) after k iterations.
arXiv Detail & Related papers (2020-02-11T11:30:33Z)
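For reference, the classical FW iteration whose step size the self-concordant analysis above replaces can be stated compactly; the open-loop gamma_k shown is the textbook choice, not the paper's adaptive rule.

```latex
% Classical Frank-Wolfe over a compact convex set C (textbook form):
s_k \in \arg\min_{s \in C} \langle \nabla f(x_k), s \rangle, \qquad
x_{k+1} = x_k + \gamma_k (s_k - x_k), \qquad \gamma_k = \tfrac{2}{k+2},
% which yields f(x_k) - f(x^\star) = O(1/k) for smooth convex f.
```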