Self-Concordant Analysis of Frank-Wolfe Algorithms
- URL: http://arxiv.org/abs/2002.04320v3
- Date: Sat, 27 Jun 2020 04:35:35 GMT
- Title: Self-Concordant Analysis of Frank-Wolfe Algorithms
- Authors: Pavel Dvurechensky, Petr Ostroukhov, Kamil Safin, Shimrit Shtern,
Mathias Staudigl
- Abstract summary: In a number of applications, e.g. Poisson inverse problems or quantum state tomography, the loss is given by a self-concordant (SC) function having unbounded curvature.
We use the theory of SC functions to provide a new adaptive step size for FW methods and prove global convergence rate O(1/k) after k iterations.
- Score: 3.3598755777055374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Projection-free optimization via different variants of the Frank-Wolfe (FW),
a.k.a. Conditional Gradient method has become one of the cornerstones in
optimization for machine learning since in many cases the linear minimization
oracle is much cheaper to implement than projections and some sparsity needs to
be preserved. In a number of applications, e.g. Poisson inverse problems or
quantum state tomography, the loss is given by a self-concordant (SC) function
having unbounded curvature, implying absence of theoretical guarantees for the
existing FW methods. We use the theory of SC functions to provide a new
adaptive step size for FW methods and prove global convergence rate O(1/k)
after k iterations. If the problem admits a stronger local linear minimization
oracle, we construct a novel FW method with linear convergence rate for SC
functions.
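
To make the projection-free template concrete, below is a minimal sketch of a generic Frank-Wolfe loop built around a linear minimization oracle (LMO). The step size shown is the classical 2/(k+2) schedule used only as a placeholder; the paper's adaptive step size derived from self-concordance is not reproduced here, and the quadratic-over-simplex example and helper names are illustrative assumptions, not the authors' code.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, num_iters=100):
    """Generic Frank-Wolfe (conditional gradient) loop -- a sketch.

    grad(x) -- gradient of the loss at x
    lmo(g)  -- linear minimization oracle: argmin over the feasible set of <g, s>
    x0      -- feasible starting point
    """
    x = x0
    for k in range(num_iters):
        g = grad(x)
        s = lmo(g)            # cheap linear subproblem instead of a projection
        d = s - x             # Frank-Wolfe direction
        gap = -g @ d          # Frank-Wolfe gap, a standard optimality certificate
        # Placeholder step size; the paper instead derives an adaptive rule
        # from self-concordance to handle unbounded curvature.
        gamma = min(1.0, 2.0 / (k + 2))
        x = x + gamma * d
    return x

# Illustrative usage: minimize a small quadratic over the probability simplex,
# whose LMO simply returns the vertex with the most negative gradient entry.
if __name__ == "__main__":
    A = np.array([[2.0, 0.5], [0.5, 1.0]])
    b = np.array([1.0, 0.2])
    grad = lambda x: A @ x - b

    def lmo(g):
        s = np.zeros_like(g)
        s[np.argmin(g)] = 1.0
        return s

    x_star = frank_wolfe(grad, lmo, x0=np.array([0.5, 0.5]))
    print(x_star)
```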
Related papers
- Riemannian Federated Learning via Averaging Gradient Stream [8.75592575216789]
This paper develops and analyzes an efficient Riemannian Federated Averaging Gradient Stream (RFedAGS) algorithm.
Numerical simulations conducted on synthetic and real-world data demonstrate the performance of the proposed RFedAGS.
arXiv Detail & Related papers (2024-09-11T12:28:42Z) - Fast Unconstrained Optimization via Hessian Averaging and Adaptive Gradient Sampling Methods [0.3222802562733786]
We consider minimizing finite-sum expectation objective functions via Hessian-averaging based subsampled Newton methods.
These methods allow for inexactness and have fixed per-iteration Hessian approximation costs.
We present novel analysis techniques and address challenges in their practical implementation.
arXiv Detail & Related papers (2024-08-14T03:27:48Z) - Constrained Optimization via Exact Augmented Lagrangian and Randomized
Iterative Sketching [55.28394191394675]
We develop an adaptive inexact Newton method for equality-constrained nonlinear, nonconvex optimization problems.
We demonstrate the superior performance of our method on benchmark nonlinear problems, constrained logistic regression with data from LIBSVM, and a PDE-constrained problem.
arXiv Detail & Related papers (2023-05-28T06:33:37Z) - Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features [65.64276393443346]
The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints.
We present two new variants of the Frank-Wolfe algorithm for stochastic finite-sum minimization.
arXiv Detail & Related papers (2023-04-23T20:05:09Z) - Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation
Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and use a large number of data points.
Our proposed method outperforms an existing method with the previously best-known result.
arXiv Detail & Related papers (2022-12-19T14:48:54Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Using Taylor-Approximated Gradients to Improve the Frank-Wolfe Method
for Empirical Risk Minimization [1.4504054468850665]
In the setting of Empirical Risk Minimization, we present a novel adaptive step-size approach for which we have computational guarantees.
We show that our methods exhibit very significant speed-ups over existing methods on real-world binary classification datasets.
arXiv Detail & Related papers (2022-08-30T00:08:37Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that an attenuating step-size is required for exact convergence with the fact that a constant step-size learns faster in time, up to an error neighborhood.
Rather than fixing the minibatch and the step-size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization [30.980431848476925]
We propose an algorithm for constrained smooth finite-sum minimization with a generalized linear prediction/structure.
The proposed method is simple to implement, does not require step-size tuning, and has a constant per-iteration cost that is independent of the dataset size.
We provide an implementation of all considered methods in an open-source package.
arXiv Detail & Related papers (2020-02-27T00:47:21Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of the Optimistic AdaGrad (OAdagrad) algorithm for nonconvex-nonconcave min-max problems.
Our experiments show that the benefit of adaptive gradient methods over non-adaptive ones in GAN training can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.