Distributed and Stochastic Optimization Methods with Gradient
Compression and Local Steps
- URL: http://arxiv.org/abs/2112.10645v1
- Date: Mon, 20 Dec 2021 16:12:54 GMT
- Title: Distributed and Stochastic Optimization Methods with Gradient
Compression and Local Steps
- Authors: Eduard Gorbunov
- Abstract summary: We propose new theoretical frameworks for the analysis of stochastic and distributed methods with error compensation and local updates.
We develop more than 20 new optimization methods, including the first linearly converging Error-Compensated SGD and the first linearly converging Local-SGD for arbitrarily heterogeneous local functions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this thesis, we propose new theoretical frameworks for the analysis of
stochastic and distributed methods with error compensation and local updates.
Using these frameworks, we develop more than 20 new optimization methods,
including the first linearly converging Error-Compensated SGD and the first
linearly converging Local-SGD for arbitrarily heterogeneous local functions.
Moreover, the thesis contains several new distributed methods with unbiased
compression for distributed non-convex optimization problems. The derived
complexity results for these methods outperform the previous best-known results
for the considered problems. Finally, we propose a new scalable decentralized
fault-tolerant distributed method, and under reasonable assumptions, we derive
the iteration complexity bounds for this method that match the ones of
centralized Local-SGD.
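The error-feedback mechanism behind Error-Compensated SGD with gradient compression can be summarized in a few lines. The sketch below is a generic illustration under simplifying assumptions (a Top-K sparsifier as the compression operator and a user-supplied stochastic_grad oracle, both hypothetical helpers), not the thesis's exact algorithms.

```python
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def ec_sgd_worker_step(x, e, stochastic_grad, lr=0.1, k=10):
    """One error-compensated step on a single worker: compress the
    error-corrected update, communicate the compressed message, and keep
    what was lost in the local error buffer e for the next round."""
    g = stochastic_grad(x)          # stochastic gradient at the current point
    corrected = lr * g + e          # add the error accumulated from past compressions
    message = top_k(corrected, k)   # compressed update that is actually communicated
    e = corrected - message         # error feedback: remember what compression dropped
    return message, e
```

On the server side, the compressed messages from all workers would be averaged and applied to the shared iterate; a Local-SGD-style variant would let each worker take several such steps between communication rounds before synchronizing.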
Related papers
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise [96.80184504268593]
Gradient clipping is one of the key algorithmic ingredients for deriving good high-probability guarantees; however, clipping can spoil the convergence of popular methods for composite and distributed optimization (a minimal sketch of the clipping step appears after this list).
arXiv Detail & Related papers (2023-10-03T07:49:17Z) - Resource-Adaptive Newton's Method for Distributed Learning [16.588456212160928]
This paper introduces a novel and efficient algorithm called RANL, which overcomes the limitations of Newton's method.
Unlike traditional first-order methods, RANL exhibits remarkable independence from the condition number of the problem.
arXiv Detail & Related papers (2023-08-20T04:01:30Z) - Stochastic Gradient Descent-Ascent: Unified Theory and New Efficient
Methods [73.35353358543507]
Stochastic Gradient Descent-Ascent (SGDA) is one of the most prominent algorithms for solving min-max optimization and variational inequality problems (VIPs).
In this paper, we propose a unified convergence analysis that covers a large variety of descent-ascent methods.
We develop several new variants of SGDA, such as a new variance-reduced method (L-SVRGDA), new distributed methods with compression (QSGDA, DIANA-SGDA, VR-DIANA-SGDA), and a new method with coordinate randomization (SEGA-SGDA).
arXiv Detail & Related papers (2022-02-15T09:17:39Z) - Distributionally Robust Federated Averaging [19.875176871167966]
We present communication-efficient distributed algorithms for distributionally robust federated learning with periodic averaging and adaptive sampling.
We give corroborating experimental evidence for our theoretical results in federated learning settings.
arXiv Detail & Related papers (2021-02-25T03:32:09Z) - IDEAL: Inexact DEcentralized Accelerated Augmented Lagrangian Method [64.15649345392822]
We introduce a framework for designing primal methods under the decentralized optimization setting where local functions are smooth and strongly convex.
Our approach consists of approximately solving a sequence of sub-problems induced by the accelerated augmented Lagrangian method.
When coupled with accelerated gradient descent, our framework yields a novel primal algorithm whose convergence rate is optimal and matched by recently derived lower bounds.
arXiv Detail & Related papers (2020-06-11T18:49:06Z) - FedSplit: An algorithmic framework for fast federated optimization [40.42352500741025]
We introduce FedSplit, a class of algorithms for solving distributed convex minimization with additive structure.
Our theory shows that these methods are provably robust to inexact computation of intermediate local quantities.
arXiv Detail & Related papers (2020-05-11T16:30:09Z) - Optimization in Machine Learning: A Distribution Space Approach [16.038814087205143]
We present the viewpoint that optimization problems in machine learning can often be interpreted as minimizing a convex functional over a function space.
We re-pose such problems, via a suitable relaxation, as convex optimization problems in the space of distributions.
We develop a numerical algorithm based on mixture distributions to perform approximate optimization directly in distribution space.
arXiv Detail & Related papers (2020-04-18T13:38:06Z) - A Unified Theory of Decentralized SGD with Changing Topology and Local
Updates [70.9701218475002]
We introduce a unified convergence analysis of decentralized communication methods.
We derive universal convergence rates for several applications.
Our proofs rely on weak assumptions.
arXiv Detail & Related papers (2020-03-23T17:49:15Z) - Majorization Minimization Methods for Distributed Pose Graph
Optimization with Convergence Guarantees [0.76146285961466]
We show that our proposed methods are guaranteed to converge to first-order critical points under mild conditions.
Since our proposed methods rely on a proximal operator of distributed PGO, the convergence rate can be significantly accelerated.
The efficacy of this work is validated through applications on a number of 2D and 3D SLAM datasets.
arXiv Detail & Related papers (2020-03-11T15:18:33Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
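As referenced in the High-Probability Convergence entry above, gradient clipping simply rescales a stochastic gradient whose norm exceeds a threshold. The snippet below is a minimal, hedged sketch of that operator inside a plain SGD step; the threshold lam and the stochastic_grad oracle are illustrative assumptions, not that paper's exact method.

```python
import numpy as np

def clip(g, lam):
    """Rescale g to have norm lam whenever its norm exceeds lam."""
    norm = np.linalg.norm(g)
    return g if norm <= lam else (lam / norm) * g

def clipped_sgd_step(x, stochastic_grad, lr=0.01, lam=1.0):
    """One SGD step that uses the clipped stochastic gradient, limiting the
    influence of heavy-tailed noise on any single update."""
    return x - lr * clip(stochastic_grad(x), lam)
```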