On the Performance of Gradient Tracking with Local Updates
- URL: http://arxiv.org/abs/2210.04757v2
- Date: Thu, 13 Oct 2022 00:18:55 GMT
- Title: On the Performance of Gradient Tracking with Local Updates
- Authors: Edward Duc Hien Nguyen and Sulaiman A. Alghunaim and Kun Yuan and
César A. Uribe
- Abstract summary: We show that LU-GT has the same communication complexity as the Federated Learning setting but allows arbitrary network topologies.
Numerical examples reveal that local updates can lower communication costs in certain regimes.
- Score: 10.14746251086461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the decentralized optimization problem where a network of $n$ agents
seeks to minimize the average of a set of heterogeneous non-convex cost
functions in a distributed manner. State-of-the-art decentralized algorithms
like Exact Diffusion~(ED) and Gradient Tracking~(GT) require communication at
every iteration. However, communication is expensive, resource intensive, and slow.
In this work, we analyze a locally updated GT method (LU-GT), where agents
perform local recursions before interacting with their neighbors. While local
updates have been shown to reduce communication overhead in practice, their
theoretical influence has not been fully characterized. We show LU-GT has the
same communication complexity as the Federated Learning setting but allows
arbitrary network topologies. In addition, we prove that the number of local
updates does not degrade the quality of the solution achieved by LU-GT.
Numerical examples reveal that local updates can lower communication costs in
certain regimes (e.g., well-connected graphs).
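The paper itself includes no code, so the following NumPy sketch is only one plausible way to write a gradient-tracking loop with local updates, shown to make the "local recursions before communication" idea concrete. The function name lu_gt_sketch, its arguments, and the exact placement of the mixing step are illustrative assumptions, not the precise LU-GT recursion analyzed in the paper.

```python
import numpy as np

def lu_gt_sketch(grad, x0, W, alpha=0.05, rounds=100, local_steps=5):
    """Illustrative sketch of gradient tracking with local updates.

    grad(i, x)  : local gradient of agent i at point x (length-d array)
    x0          : (n, d) array of initial iterates, one row per agent
    W           : (n, n) doubly stochastic mixing matrix of the network
    alpha       : step size
    rounds      : number of communication rounds
    local_steps : local recursions performed between communications
    """
    n, _ = x0.shape
    x = x0.copy()
    y = np.stack([grad(i, x[i]) for i in range(n)])   # trackers start at the local gradients
    g_prev = y.copy()

    for _ in range(rounds):
        # Local phase: each agent updates on its own, without communication.
        for _ in range(local_steps):
            x = x - alpha * y                          # descend along the tracked direction
            g_new = np.stack([grad(i, x[i]) for i in range(n)])
            y = y + g_new - g_prev                     # gradient-tracking correction
            g_prev = g_new
        # Communication phase: mix iterates and trackers with neighbors.
        x = W @ x
        y = W @ y

    return x.mean(axis=0)

# Illustrative usage: 3 agents with quadratic costs f_i(x) = 0.5 * ||x - b_i||^2.
b = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
W = np.full((3, 3), 0.25) + 0.25 * np.eye(3)          # doubly stochastic, fully connected
x_hat = lu_gt_sketch(lambda i, x: x - b[i], np.zeros((3, 2)), W)
print(x_hat)                                           # approaches the average of the b_i, i.e., (1, 1)
```

In this sketch, each agent descends along its tracker for local_steps iterations, refreshing the tracker with local gradient differences, and only then averages both iterates and trackers with its neighbors through the mixing matrix $W$; communication cost therefore scales with the number of rounds rather than with the total number of gradient steps.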
Related papers
- Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning? [50.03434441234569]
Federated Learning (FL) has gained significant popularity due to its effectiveness in training machine learning models across diverse sites without requiring direct data sharing.
While various algorithms have shown that FL with local updates is a communication-efficient distributed learning framework, the generalization performance of FL with local updates has received comparatively less attention.
arXiv Detail & Related papers (2024-09-05T19:00:18Z)
- The Effectiveness of Local Updates for Decentralized Learning under Data Heterogeneity [2.845817138242963]
We revisit two fundamental decentralized optimization methods, Decentralized Gradient Tracking (DGT) and Decentralized Gradient Descent (DGD).
We demonstrate that incorporating $K > 1$ local update steps can reduce communication complexity.
arXiv Detail & Related papers (2024-03-23T00:01:34Z)
- FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy [84.45004766136663]
Federated learning is an emerging distributed machine learning framework.
It suffers from non-vanishing biases introduced by locally inconsistent optima and from rugged client drift caused by local over-fitting.
We propose a novel and practical method, FedSpeed, to alleviate the negative impacts posed by these problems.
arXiv Detail & Related papers (2023-02-21T03:55:29Z)
- Decentralized Gradient Tracking with Local Steps [37.41185751866015]
Gradient tracking (GT) is an algorithm designed for solving decentralized optimization problems over a network.
A key feature of GT is a tracking mechanism that allows it to overcome heterogeneity between nodes (the standard tracking recursion is written out just after this list).
arXiv Detail & Related papers (2023-01-03T19:06:41Z)
- A Communication-efficient Algorithm with Linear Convergence for Federated Minimax Learning [1.713291434132985]
We study a large-scale multi-agent minimax optimization problem, which models Generative Adversarial Networks (GANs).
The overall objective is a sum of agents' private local objective functions.
We show that FedGDA-GT converges linearly with a constant stepsize to a global $\epsilon$-GDA solution.
arXiv Detail & Related papers (2022-06-02T16:31:16Z)
- The Role of Local Steps in Local SGD [2.3572498744567123]
We consider the distributed optimization problem where $n$ agents want to minimize a global function given by the sum of their local functions.
We study the effect of local steps on the convergence rate and communication complexity of Local SGD.
arXiv Detail & Related papers (2022-03-14T00:54:43Z)
- Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are one of the promising approaches to reduce communication time.
We show that the communication complexity is better than that of non-local methods when the size of the local datasets is smaller than the smoothness of the local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z)
- An Entropy-guided Reinforced Partial Convolutional Network for Zero-Shot Learning [77.72330187258498]
We propose a novel Entropy-guided Reinforced Partial Convolutional Network (ERPCNet)
ERPCNet extracts and aggregates localities based on semantic relevance and visual correlations without human-annotated regions.
It not only discovers global-cooperative localities dynamically but also converges faster for policy gradient optimization.
arXiv Detail & Related papers (2021-11-03T11:13:13Z)
- Distributed Q-Learning with State Tracking for Multi-agent Networked Control [61.63442612938345]
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network.
We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents.
arXiv Detail & Related papers (2020-12-22T22:03:49Z)
- Communication-efficient Decentralized Local SGD over Undirected Networks [2.3572498744567123]
We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$.
We analyze the trade-off between the number of communication rounds and the computational effort of each agent.
Our results show that by using only $R=\Omega(n)$ communication rounds, one can achieve an error that scales as $O(1/nT)$.
arXiv Detail & Related papers (2020-11-06T09:34:00Z)
- A Unified Theory of Decentralized SGD with Changing Topology and Local Updates [70.9701218475002]
We introduce a unified convergence analysis of decentralized communication methods.
We derive universal convergence rates for several applications.
Our proofs rely on weak assumptions.
arXiv Detail & Related papers (2020-03-23T17:49:15Z)
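For reference, since several of the entries above rely on it, the tracking mechanism of gradient tracking is commonly written as the recursion below. This is the standard (non-ATC, DIGing-style) form with mixing weights $w_{ij}$ and step size $\alpha$; the papers listed here analyze variants of it:

$x_i^{k+1} = \sum_{j} w_{ij} x_j^{k} - \alpha y_i^{k}$
$y_i^{k+1} = \sum_{j} w_{ij} y_j^{k} + \nabla f_i(x_i^{k+1}) - \nabla f_i(x_i^{k})$, with $y_i^{0} = \nabla f_i(x_i^{0})$.

Summing the tracker update over all agents shows that $\frac{1}{n}\sum_i y_i^{k} = \frac{1}{n}\sum_i \nabla f_i(x_i^{k})$ for every $k$ when $W$ is doubly stochastic, which is the property that lets the trackers correct for heterogeneity between nodes.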