DFWLayer: Differentiable Frank-Wolfe Optimization Layer
- URL: http://arxiv.org/abs/2308.10806v2
- Date: Fri, 29 Mar 2024 06:15:41 GMT
- Title: DFWLayer: Differentiable Frank-Wolfe Optimization Layer
- Authors: Zixuan Liu, Liu Liu, Xueqian Wang, Peilin Zhao
- Abstract summary: Differentiable optimization has received a significant amount of attention due to its foundational role in the domain of machine learning based on neural networks.
This paper proposes a differentiable layer, named Differentiable Frank-Wolfe Layer (DFWLayer), by rolling out the Frank-Wolfe method.
Experimental results demonstrate that the DFWLayer not only attains competitive accuracy in solutions and gradients but also consistently adheres to constraints.
- Score: 33.20550212188619
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differentiable optimization has received a significant amount of attention due to its foundational role in the domain of machine learning based on neural networks. This paper proposes a differentiable layer, named the Differentiable Frank-Wolfe Layer (DFWLayer), obtained by rolling out the Frank-Wolfe method, a well-known optimization algorithm that solves constrained problems without projections or Hessian computations. This yields an efficient way of handling large-scale convex optimization problems with norm constraints. Experimental results demonstrate that the DFWLayer not only attains competitive accuracy in solutions and gradients but also consistently adheres to constraints.
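To make the rolled-out construction concrete, below is a minimal sketch of unrolling Frank-Wolfe iterations so that the approximate solution stays differentiable with respect to the problem data. The quadratic objective, the L2-norm-ball constraint, and the function name fw_layer are illustrative assumptions; this is not the authors' implementation.

```python
# Minimal sketch (not the authors' released code) of the rolled-out idea:
# every Frank-Wolfe step below is an ordinary differentiable torch op, so the
# approximate solution can be backpropagated through like any network layer.
# The quadratic objective and the L2-norm-ball constraint are assumptions made
# for illustration only.
import torch

def fw_layer(Q, p, radius=1.0, num_iters=100):
    """Approximately solve  min_x 0.5*x^T Q x - p^T x  s.t.  ||x||_2 <= radius."""
    x = torch.zeros_like(p)
    for k in range(num_iters):
        grad = Q @ x - p                              # gradient of the quadratic objective
        s = -radius * grad / (grad.norm() + 1e-12)    # closed-form linear oracle on the L2 ball
        gamma = 2.0 / (k + 2.0)                       # standard open-loop step size
        x = (1 - gamma) * x + gamma * s               # convex combination keeps x feasible
    return x

# Toy usage: gradients of the solution w.r.t. the linear term flow through the unrolled solver,
# and no projection or Hessian is ever computed.
torch.manual_seed(0)
A = torch.randn(4, 4)
Q = A @ A.T + torch.eye(4)                            # positive definite
p = torch.randn(4, requires_grad=True)
x_star = fw_layer(Q, p)
x_star.sum().backward()
print(x_star.detach(), p.grad)
```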
Related papers
- Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
Gradient-variation online learning aims to achieve regret guarantees that scale with variations in the gradients of online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide applications to fast-rate convergence in games and to extended adversarial optimization.
arXiv Detail & Related papers (2024-08-17T02:22:08Z) - Learning the Efficient Frontier [0.01874930567916036]
We introduce NeuralEF: a fast neural approximation framework that robustly forecasts the result of the efficient frontier (EF) convex optimization problem.
We show that NeuralEF is a viable solution to accelerate large-scale simulation while handling discontinuous behavior.
arXiv Detail & Related papers (2023-09-27T16:49:37Z) - Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features [65.64276393443346]
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints.
We present two new variants of the Frank-Wolfe algorithm for finite-sum minimization.
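As a rough illustration of finite-sum Frank-Wolfe (not the paper's specific variants), the sketch below plugs a SARAH-style recursive gradient estimator into a standard Frank-Wolfe step over the probability simplex; the least-squares objective and all names are assumptions made for the example.

```python
# Generic sketch of a stochastic Frank-Wolfe step with a SARAH-style recursive
# gradient estimator for a finite-sum objective over the probability simplex.
# Illustrative only; this is not the paper's exact algorithm.
import numpy as np

rng = np.random.default_rng(0)
n_samples, dim = 200, 10
A = rng.normal(size=(n_samples, dim))
b = rng.normal(size=n_samples)

def grad_i(x, i):
    """Gradient of the i-th least-squares term 0.5*(a_i^T x - b_i)^2."""
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    return A.T @ (A @ x - b) / n_samples

x = np.ones(dim) / dim                          # start at the simplex barycenter
v = full_grad(x)                                # anchor the estimator with one full gradient
x_prev = x.copy()
for k in range(1, 201):
    i = rng.integers(n_samples)
    v = grad_i(x, i) - grad_i(x_prev, i) + v    # SARAH-style recursive gradient estimate
    j = int(np.argmin(v))                       # linear oracle over the simplex: best vertex e_j
    s = np.zeros(dim); s[j] = 1.0
    gamma = 2.0 / (k + 2.0)
    x_prev = x.copy()
    x = (1 - gamma) * x + gamma * s
print("objective:", 0.5 * np.mean((A @ x - b) ** 2))
```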
arXiv Detail & Related papers (2023-04-23T20:05:09Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable parts separately, linearizing only the smooth parts.
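A hedged sketch of one such linearization step, under the illustrative assumptions of a box constraint and an L1 non-smooth term: only the smooth quadratic is linearized, while the subproblem keeps the non-smooth term exact and is solved coordinate-wise.

```python
# Sketch of a generalized conditional-gradient (linearization) step for a
# composite objective f(x) + lam*||x||_1 over the box [-1, 1]^n. The box and
# L1 choices are illustrative assumptions, not the paper's setting.
import numpy as np

rng = np.random.default_rng(0)
n = 20
Q = rng.normal(size=(n, n)); Q = Q @ Q.T / n + np.eye(n)
p = rng.normal(size=n)
lam = 0.5

def smooth_grad(x):                       # gradient of the smooth part f(x) = 0.5*x^T Q x - p^T x
    return Q @ x - p

def subproblem(g, lam):
    """Exactly minimize <g, s> + lam*||s||_1 over the box [-1, 1]^n (separable per coordinate)."""
    s = np.zeros_like(g)
    s[g > lam] = -1.0
    s[g < -lam] = 1.0
    return s

x = np.zeros(n)
for k in range(200):
    s = subproblem(smooth_grad(x), lam)   # only the smooth part is linearized
    gamma = 2.0 / (k + 2.0)
    x = (1 - gamma) * x + gamma * s       # stays inside the box by convexity
print("objective:", 0.5 * x @ Q @ x - p @ x + lam * np.abs(x).sum())
```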
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - An Empirical Evaluation of Zeroth-Order Optimization Methods on AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD).
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
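For orientation, here is a small, self-contained sketch of ZO sign-based gradient descent on a toy smooth objective; the black-box molecular objectives from the Guacamol benchmarks are replaced by a stand-in function, so this only shows the query-and-sign mechanics.

```python
# Rough sketch of zeroth-order sign-based gradient descent (ZO-signGD) on a toy
# objective. Everything below (objective, hyperparameters) is illustrative only.
import numpy as np

rng = np.random.default_rng(0)

def f(x):                                   # stand-in black-box objective
    return np.sum((x - 1.0) ** 2)

def zo_gradient(x, num_dirs=20, mu=1e-3):
    """Two-point random-direction estimate of the gradient using only f-queries."""
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = rng.normal(size=x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_dirs

x = rng.normal(size=5)
lr = 0.05
for _ in range(200):
    x -= lr * np.sign(zo_gradient(x))       # use only the sign of the estimate, not its magnitude
print("f(x) after ZO-signGD:", f(x))
```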
arXiv Detail & Related papers (2022-10-27T01:58:10Z) - Efficient differentiable quadratic programming layers: an ADMM approach [0.0]
We present an alternative network layer architecture based on the alternating direction method of multipliers (ADMM).
Backward differentiation is performed by implicit differentiation of the residual map of a modified fixed-point iteration.
Simulated results demonstrate the computational advantage of the ADMM layer, which for medium-scale problems is approximately an order of magnitude faster than the OptNet quadratic programming layer.
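The backward pass described above can be illustrated on a toy scalar fixed point (this is not the ADMM layer itself): solve z* = T(z*, θ) by plain iteration without tracking gradients, then differentiate implicitly through the update map at the fixed point.

```python
# Toy illustration of differentiating through a fixed point z* = T(z*, theta)
# implicitly. This is not the paper's ADMM layer; it only shows the recipe of
# solving forward without a graph and applying the implicit function theorem.
import torch

def T(z, theta):
    return 0.5 * torch.cos(z) + theta       # a contraction in z, so the iteration converges

theta = torch.tensor(0.3, requires_grad=True)

# Forward pass: plain fixed-point iteration, no autograd graph needed.
with torch.no_grad():
    z = torch.tensor(0.0)
    for _ in range(100):
        z = T(z, theta)

# Backward pass: dz*/dtheta = (dT/dtheta) / (1 - dT/dz) evaluated at the fixed point.
z_req = z.clone().requires_grad_(True)
out = T(z_req, theta)
dT_dz, dT_dtheta = torch.autograd.grad(out, (z_req, theta))
dz_dtheta = dT_dtheta / (1 - dT_dz)
print(float(z), float(dz_dtheta))           # dz*/dtheta equals 1 / (1 + 0.5*sin(z*)) here
```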
arXiv Detail & Related papers (2021-12-14T15:25:07Z) - Convex Optimization with an Interpolation-based Projection and its Application to Deep Learning [36.19092177858517]
We investigate whether an inexact, but cheaper projection can drive a descent algorithm to an optimum.
Specifically, we propose an interpolation-based projection that is computationally cheap and easy to compute given a convex, domain-defining function.
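One plausible reading of this construction, sketched under the assumption that the domain is given as {x : h(x) ≤ 0} with a known strictly feasible anchor point: bisect along the segment between the anchor and the infeasible point until the boundary is approximately reached. Details may differ from the paper.

```python
# Hedged sketch of an interpolation-style inexact "projection": given a convex
# domain {x : h(x) <= 0}, a strictly feasible anchor x0, and an infeasible point
# x, bisect along the segment [x0, x] to find a feasible point near the boundary.
import numpy as np

def interpolation_projection(x, x0, h, tol=1e-8):
    """Return a point of the form x0 + t*(x - x0) with h <= 0, t as large as possible."""
    if h(x) <= 0:                 # already feasible, nothing to do
        return x
    lo, hi = 0.0, 1.0             # h is negative at t=0 and positive at t=1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h(x0 + mid * (x - x0)) <= 0:
            lo = mid
        else:
            hi = mid
    return x0 + lo * (x - x0)

# Example: the unit ball, h(x) = ||x||^2 - 1, anchored at the origin.
h = lambda x: np.dot(x, x) - 1.0
point = np.array([2.0, 1.0])
print(interpolation_projection(point, np.zeros(2), h))   # lands on the unit circle
```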
arXiv Detail & Related papers (2020-11-13T16:52:50Z) - Projection-Free Adaptive Gradients for Large-Scale Optimization [22.0439695290991]
Frank-Wolfe algorithms occupy a unique position as they alleviate both computational burdens: they query only approximate first-order information from the objective and maintain feasibility without projections.
We show that our method can improve the performance of adaptive algorithms for constrained optimization.
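Purely as an illustration of how the two ingredients can be combined (the paper's actual algorithms differ in detail), the sketch below feeds AdaGrad-style coordinate-wise scaled stochastic gradients into a Frank-Wolfe linear oracle over an L1 ball, so no projection is ever computed.

```python
# Illustrative-only combination of an AdaGrad-style preconditioner with a
# projection-free Frank-Wolfe update over the L1 ball; all parameters and the
# least-squares model are assumptions made for the example.
import numpy as np

rng = np.random.default_rng(0)
dim, radius = 50, 5.0
w_true = np.zeros(dim); w_true[:5] = 1.0

def stochastic_grad(x):
    a = rng.normal(size=dim)                 # one random least-squares sample
    return (a @ x - a @ w_true) * a

x = np.zeros(dim)
acc = np.zeros(dim)                          # accumulated squared gradients (AdaGrad)
for k in range(1, 501):
    g = stochastic_grad(x)
    acc += g ** 2
    g_scaled = g / (np.sqrt(acc) + 1e-8)     # adaptive, coordinate-wise scaling
    i = int(np.argmax(np.abs(g_scaled)))     # linear oracle over the L1 ball picks one vertex
    s = np.zeros(dim); s[i] = -radius * np.sign(g_scaled[i])
    gamma = 2.0 / (k + 2.0)
    x = (1 - gamma) * x + gamma * s          # convex combination keeps x feasible
print("distance to w_true:", np.linalg.norm(x - w_true))
```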
arXiv Detail & Related papers (2020-09-29T15:56:12Z) - Efficient Optimization of Dominant Set Clustering with Frank-Wolfe Algorithms [4.873362301533825]
We study Frank-Wolfe algorithms -- standard, pairwise, and away-steps -- for efficient optimization of Dominant Set Clustering.
We present a unified and computationally efficient framework to employ the different variants of Frank-Wolfe methods.
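A minimal sketch of the standard (non-pairwise, non-away-step) variant applied to the dominant-set objective max_x x^T A x over the probability simplex; the Gaussian affinity matrix and the support threshold are illustrative choices, not the paper's experimental setup.

```python
# Sketch of standard Frank-Wolfe on the dominant-set objective max_x x^T A x
# over the probability simplex (A a symmetric affinity matrix). Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
A = np.exp(-d2)                              # Gaussian affinities
np.fill_diagonal(A, 0.0)

n = A.shape[0]
x = np.ones(n) / n                           # start at the barycenter of the simplex
for k in range(300):
    grad = 2 * A @ x                         # gradient of x^T A x
    j = int(np.argmax(grad))                 # best simplex vertex for a maximization step
    s = np.zeros(n); s[j] = 1.0
    gamma = 2.0 / (k + 2.0)
    x = (1 - gamma) * x + gamma * s
support = np.where(x > 1e-3)[0]              # coordinates with non-negligible mass: an approximate dominant set
print("cluster members:", support)
```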
arXiv Detail & Related papers (2020-07-22T20:08:41Z) - A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve for one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems in which one variable is under a sparsity constraint.
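To make the problem shape concrete, the sketch below runs plain gradient and proximal-gradient steps on a bilinear least-squares objective with an L1 penalty on one factor; CoGD's coordination of the two gradients via a projection function is not reproduced, so this is only a baseline-style illustration.

```python
# Generic gradient steps on a bilinear objective with a sparsity penalty on one
# factor: min_{W, x} 0.5*||y - W x||^2 + lam*||x||_1. Not CoGD itself.
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 30
W_true = rng.normal(size=(m, n))
x_true = np.zeros(n); x_true[:3] = 1.0
y = W_true @ x_true

W = rng.normal(size=(m, n))
x = np.zeros(n)
lr, lam = 5e-3, 1e-2
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)   # soft-thresholding for the L1 term

for _ in range(2000):
    r = W @ x - y                            # shared residual couples the two updates
    grad_W = np.outer(r, x)
    grad_x = W.T @ r
    W -= lr * grad_W                         # plain gradient step on W
    x = soft(x - lr * grad_x, lr * lam)      # proximal gradient step on the sparse factor
print("residual norm:", np.linalg.norm(W @ x - y))
```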
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
This list is automatically generated from the titles and abstracts of the papers listed on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.