Delayed Projection Techniques for Linearly Constrained Problems:
Convergence Rates, Acceleration, and Applications
- URL: http://arxiv.org/abs/2101.01505v1
- Date: Tue, 5 Jan 2021 13:42:41 GMT
- Title: Delayed Projection Techniques for Linearly Constrained Problems:
Convergence Rates, Acceleration, and Applications
- Authors: Xiang Li, Zhihua Zhang
- Abstract summary: We study a novel class of projection-based algorithms for linearly constrained problems (LCPs)
We propose the delayed projection technique that calls a projection once for a while, lowering the projection frequency and improving the projection efficiency.
We show that it is feasible to improve projection efficiency in both strongly convex and generally convex cases.
- Score: 24.763531954075656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we study a novel class of projection-based algorithms for
linearly constrained problems (LCPs) which have a lot of applications in
statistics, optimization, and machine learning. Conventional primal
gradient-based methods for LCPs call a projection after each (stochastic)
gradient descent, resulting in that the required number of projections equals
that of gradient descents (or total iterations). Motivated by the recent
progress in distributed optimization, we propose the delayed projection
technique that calls a projection once for a while, lowering the projection
frequency and improving the projection efficiency. Accordingly, we devise a
series of stochastic methods for LCPs using the technique, including a variance
reduced method and an accelerated one. We theoretically show that it is
feasible to improve projection efficiency in both strongly convex and generally
convex cases. Our analysis is simple and unified and can be easily extended to
other methods using delayed projections. When applying our new algorithms to
federated optimization, a newfangled and privacy-preserving subfield in
distributed optimization, we obtain not only a variance reduced federated
algorithm with convergence rates better than previous works, but also the first
accelerated method able to handle data heterogeneity inherent in federated
optimization.
Related papers
- Differentially Private Optimization with Sparse Gradients [60.853074897282625]
We study differentially private (DP) optimization problems under sparsity of individual gradients.
Building on this, we obtain pure- and approximate-DP algorithms with almost optimal rates for convex optimization with sparse gradients.
arXiv Detail & Related papers (2024-04-16T20:01:10Z) - Ordering for Non-Replacement SGD [7.11967773739707]
We seek to find an ordering that can improve the convergence rates for the non-replacement form of the algorithm.
We develop optimal orderings for constant and decreasing step sizes for strongly convex and convex functions.
In addition, we are able to combine the ordering with mini-batch and further apply it to more complex neural networks.
arXiv Detail & Related papers (2023-06-28T00:46:58Z) - Versatile Single-Loop Method for Gradient Estimator: First and Second
Order Optimality, and its Application to Federated Learning [45.78238792836363]
We present a single-loop algorithm named SLEDGE (Single-Loop-E Gradient Estimator) for periodic convergence.
Unlike existing methods, SLEDGE has the advantage of versatility; (ii) second-order optimal, (ii) in the PL region, and (iii) smaller complexity under less of data.
arXiv Detail & Related papers (2022-09-01T11:05:26Z) - Sharp Analysis of Sketch-and-Project Methods via a Connection to
Randomized Singular Value Decomposition [14.453949553412821]
We develop a theoretical framework for obtaining sharp guarantees on the convergence rate of sketch-and-project methods.
We show that the convergence rate improves at least linearly with the sketch size, and even faster when the data matrix exhibits certain spectral decays.
Our experiments support the theory and demonstrate that even extremely sparse sketches exhibit the convergence properties predicted by our framework.
arXiv Detail & Related papers (2022-08-20T03:11:13Z) - On Constraints in First-Order Optimization: A View from Non-Smooth
Dynamical Systems [99.59934203759754]
We introduce a class of first-order methods for smooth constrained optimization.
Two distinctive features of our approach are that projections or optimizations over the entire feasible set are avoided.
The resulting algorithmic procedure is simple to implement even when constraints are nonlinear.
arXiv Detail & Related papers (2021-07-17T11:45:13Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on the iteration of zero-th-order (ZO) optimization which does not require first-order information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of complexity as well as as function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Projection-Free Adaptive Gradients for Large-Scale Optimization [22.0439695290991]
Frank-Wolfe algorithms occupy a unique position as they alleviate both computational burdens by querying only approximate first-order information from the objective.
We show that our method can improve the performance of adaptive algorithms for constrained optimization.
arXiv Detail & Related papers (2020-09-29T15:56:12Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z) - IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT)
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.