Related papers: An Enhanced Zeroth-Order Stochastic Frank-Wolfe Framework for Constrained Finite-Sum Optimization

An Enhanced Zeroth-Order Stochastic Frank-Wolfe Framework for Constrained Finite-Sum Optimization

URL: http://arxiv.org/abs/2501.07201v2
Date: Thu, 23 Jan 2025 03:47:19 GMT
Title: An Enhanced Zeroth-Order Stochastic Frank-Wolfe Framework for Constrained Finite-Sum Optimization
Authors: Haishan Ye, Yinghui Huang, Hao Di, Xiangyu Chang,
Abstract summary: We propose an enhanced zeroth-order convex computation Frank-Wolfe to address constrained finite-sum optimization problems.<n>Our method introduces a novel double variance reduction framework that effectively reduces the approximation induced by zeroth-order oracles.
Score: 15.652261277429968
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose an enhanced zeroth-order stochastic Frank-Wolfe framework to address constrained finite-sum optimization problems, a structure prevalent in large-scale machine-learning applications. Our method introduces a novel double variance reduction framework that effectively reduces the gradient approximation variance induced by zeroth-order oracles and the stochastic sampling variance from finite-sum objectives. By leveraging this framework, our algorithm achieves significant improvements in query efficiency, making it particularly well-suited for high-dimensional optimization tasks. Specifically, for convex objectives, the algorithm achieves a query complexity of O(d \sqrt{n}/\epsilon ) to find an epsilon-suboptimal solution, where d is the dimensionality and n is the number of functions in the finite-sum objective. For non-convex objectives, it achieves a query complexity of O(d^{3/2}\sqrt{n}/\epsilon^2 ) without requiring the computation ofd partial derivatives at each iteration. These complexities are the best known among zeroth-order stochastic Frank-Wolfe algorithms that avoid explicit gradient calculations. Empirical experiments on convex and non-convex machine learning tasks, including sparse logistic regression, robust classification, and adversarial attacks on deep networks, validate the computational efficiency and scalability of our approach. Our algorithm demonstrates superior performance in both convergence rate and query complexity compared to existing methods.

Related papers

Stochastic Optimization for Non-convex Problem with Inexact Hessian Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics have proven to have some very appealing theoretical properties. We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z)
Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems convex compact sets. We leverage the structure of the objective by handling differentiable and non-differentiable separately, linearizing only the smooth parts.
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms. An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
Versatile Single-Loop Method for Gradient Estimator: First and Second Order Optimality, and its Application to Federated Learning [45.78238792836363]
We present a single-loop algorithm named SLEDGE (Single-Loop-E Gradient Estimator) for periodic convergence. Unlike existing methods, SLEDGE has the advantage of versatility; (ii) second-order optimal, (ii) in the PL region, and (iii) smaller complexity under less of data.
arXiv Detail & Related papers (2022-09-01T11:05:26Z)
An Algebraically Converging Stochastic Gradient Descent Algorithm for Global Optimization [14.336473214524663]
A key component in the algorithm is the randomness based on the value of the objective function. We prove the convergence of the algorithm with an algebra and tuning in the parameter space. We present several numerical examples to demonstrate the efficiency and robustness of the algorithm.
arXiv Detail & Related papers (2022-04-12T16:27:49Z)
Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box Optimization Framework [100.36569795440889]
This work is on the iteration of zero-th-order (ZO) optimization which does not require first-order information. We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity as well as as function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z)
Efficient Methods for Structured Nonconvex-Nonconcave Min-Max Optimization [98.0595480384208]
We propose a generalization extraient spaces which converges to a stationary point. The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z)
Accelerated Algorithms for Convex and Non-Convex Optimization on Manifolds [9.632674803757475]
We propose a scheme for solving convex and non- optimization problems on distance. Our proposed algorithm adapts to the level of complexity in the objective function.
arXiv Detail & Related papers (2020-10-18T02:48:22Z)
Bilevel Optimization: Convergence Analysis and Enhanced Design [63.64636047748605]
Bilevel optimization is a tool for many machine learning problems. We propose a novel stoc-efficientgradient estimator named stoc-BiO.
arXiv Detail & Related papers (2020-10-15T18:09:48Z)
Projection-Free Adaptive Gradients for Large-Scale Optimization [22.0439695290991]
Frank-Wolfe algorithms occupy a unique position as they alleviate both computational burdens by querying only approximate first-order information from the objective. We show that our method can improve the performance of adaptive algorithms for constrained optimization.
arXiv Detail & Related papers (2020-09-29T15:56:12Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.