Scalable Projection-Free Optimization
- URL: http://arxiv.org/abs/2105.03527v1
- Date: Fri, 7 May 2021 22:27:18 GMT
- Title: Scalable Projection-Free Optimization
- Authors: Mingrui Zhang
- Abstract summary: As a projection-free algorithm, Frank-Wolfe (FW) has recently received considerable attention in the machine learning community.
We first propose a scalable projection-free optimization method that requires only one sample per iteration.
We then develop a distributed Frank-Wolfe (FW) framework for both convex and non-convex objective functions.
- Score: 7.170441928038049
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a projection-free algorithm, the Frank-Wolfe (FW) method, also known as
conditional gradient, has recently received considerable attention in the
machine learning community. In this dissertation, we study several topics on
the FW variants for scalable projection-free optimization.
We first propose 1-SFW, the first projection-free method that requires only
one sample per iteration to update the optimization variable and yet achieves
the best known complexity bounds for convex, non-convex, and monotone
DR-submodular settings. Then we move forward to the distributed setting, and
develop Quantized Frank-Wolfe (QFW), a general communication-efficient
distributed FW framework for both convex and non-convex objective functions. We
study the performance of QFW in two widely recognized settings: 1) stochastic
optimization and 2) finite-sum optimization. Finally, we propose Black-Box
Continuous Greedy, a derivative-free and projection-free algorithm, that
maximizes a monotone continuous DR-submodular function over a bounded convex
body in Euclidean space.
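To make the projection-free idea concrete, the sketch below shows a classic Frank-Wolfe (conditional gradient) loop over an L1-ball constraint. The quadratic objective, oracle, and step-size rule are illustrative textbook choices, not the dissertation's 1-SFW, QFW, or Black-Box Continuous Greedy algorithms: instead of projecting, each iteration calls a linear minimization oracle over the constraint set and moves toward its output.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle over the L1 ball:
    argmin_{||s||_1 <= radius} <grad, s> lies at a signed vertex."""
    i = int(np.argmax(np.abs(grad)))
    s = np.zeros_like(grad)
    s[i] = -radius * np.sign(grad[i])
    return s

def frank_wolfe(grad_fn, x0, n_iters=100, radius=1.0):
    """Plain projection-free Frank-Wolfe (conditional gradient) loop."""
    x = x0.copy()
    for k in range(n_iters):
        g = grad_fn(x)                    # (possibly stochastic) gradient estimate
        s = lmo_l1_ball(g, radius)        # linear minimization oracle call
        gamma = 2.0 / (k + 2)             # standard diminishing step size
        x = (1 - gamma) * x + gamma * s   # convex combination keeps x feasible
    return x

# Illustrative use: least squares over the L1 ball.
rng = np.random.default_rng(0)
A, b = rng.standard_normal((20, 10)), rng.standard_normal(20)
x_hat = frank_wolfe(lambda x: 2 * A.T @ (A @ x - b), np.zeros(10))
```

Because each iterate is a convex combination of feasible points, no projection is ever needed; stochastic and distributed variants such as 1-SFW and QFW replace the exact gradient with one-sample or quantized estimates.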
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Covariance-Adaptive Sequential Black-box Optimization for Diffusion Targeted Generation [60.41803046775034]
We show how to perform user-preferred targeted generation via diffusion models with only black-box target scores of users.
Experiments on both numerical test problems and target-guided 3D-molecule generation tasks show the superior performance of our method in achieving better target scores.
arXiv Detail & Related papers (2024-06-02T17:26:27Z) - Double Duality: Variational Primal-Dual Policy Optimization for
Constrained Reinforcement Learning [132.7040981721302]
We study the Constrained Convex Markov Decision Process (MDP), where the goal is to minimize a convex functional of the visitation measure.
Designing algorithms for a constrained convex MDP faces several challenges, including handling the large state space.
arXiv Detail & Related papers (2024-02-16T16:35:18Z) - Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features [65.64276393443346]
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints.
We present two new variants of the FW algorithm for finite-sum minimization.
arXiv Detail & Related papers (2023-04-23T20:05:09Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit connections between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Minibatch and Momentum Model-based Methods for Stochastic Non-smooth
Non-convex Optimization [3.4809730725241597]
We make two important extensions to model-based methods.
First, we propose a new minibatch method which takes a set of samples to approximate the model function in each iteration.
Second, motivated by the success of momentum techniques, we propose a new momentum-based model.
arXiv Detail & Related papers (2021-06-06T05:31:57Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work studies zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity and function query cost (a minimal zeroth-order gradient estimator is sketched after this list).
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Projection-Free Adaptive Gradients for Large-Scale Optimization [22.0439695290991]
Frank-Wolfe algorithms occupy a unique position as they alleviate both computational burdens, of projections and of exact gradients, by querying only approximate first-order information from the objective.
We show that our method can improve the performance of adaptive algorithms for constrained optimization.
arXiv Detail & Related papers (2020-09-29T15:56:12Z) - Self-Concordant Analysis of Frank-Wolfe Algorithms [3.3598755777055374]
In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function having unbounded curvature.
We use the theory of SC functions to provide a new adaptive step size for FW methods and prove global convergence rate O(1/k) after k iterations.
arXiv Detail & Related papers (2020-02-11T11:30:33Z)
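The zeroth-order and black-box entries above, like the dissertation's Black-Box Continuous Greedy, rely on estimating gradients purely from function evaluations. Below is a minimal sketch of a standard two-point zeroth-order gradient estimator plugged into a plain gradient step; the objective, step size, and sample counts are illustrative assumptions, not any specific paper's method.

```python
import numpy as np

def two_point_zo_gradient(f, x, mu=1e-4, n_dirs=20, rng=None):
    """Two-point zeroth-order gradient estimator: averages finite
    differences of f along random Gaussian directions (no gradients needed)."""
    rng = np.random.default_rng() if rng is None else rng
    d = x.shape[0]
    g = np.zeros(d)
    for _ in range(n_dirs):
        u = rng.standard_normal(d)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / n_dirs

# Illustrative use: gradient descent on a black-box quadratic,
# using only function evaluations (all names here are assumptions).
f = lambda x: np.sum((x - 1.0) ** 2)
x = np.zeros(5)
for _ in range(200):
    x -= 0.05 * two_point_zo_gradient(f, x)
```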