On the Performance of Gradient Tracking with Local Updates
- URL: http://arxiv.org/abs/2210.04757v2
- Date: Thu, 13 Oct 2022 00:18:55 GMT
- Title: On the Performance of Gradient Tracking with Local Updates
- Authors: Edward Duc Hien Nguyen and Sulaiman A. Alghunaim and Kun Yuan and
César A. Uribe
- Abstract summary: We show that LU-GT has the same communication complexity as the Federated Learning setting but allows arbitrary network topologies.
Numerical examples reveal that local updates can lower communication costs in certain regimes.
- Score: 10.14746251086461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the decentralized optimization problem where a network of $n$ agents
seeks to minimize the average of a set of heterogeneous non-convex cost
functions in a distributed manner. State-of-the-art decentralized algorithms
like Exact Diffusion~(ED) and Gradient Tracking~(GT) require communication at
every iteration. However, communication is expensive, resource intensive, and slow.
In this work, we analyze a locally updated GT method (LU-GT), where agents
perform local recursions before interacting with their neighbors. While local
updates have been shown to reduce communication overhead in practice, their
theoretical influence has not been fully characterized. We show LU-GT has the
same communication complexity as the Federated Learning setting but allows
arbitrary network topologies. In addition, we prove that the number of local
updates does not degrade the quality of the solution achieved by LU-GT.
Numerical examples reveal that local updates can lower communication costs in
certain regimes (e.g., well-connected graphs).
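The paper itself includes no code, so the following NumPy sketch is only one plausible way to write a gradient-tracking loop with local updates, shown to make the "local recursions before communication" idea concrete. The function name lu_gt_sketch, its arguments, and the exact placement of the mixing step are illustrative assumptions, not the precise LU-GT recursion analyzed in the paper.

```python
import numpy as np

def lu_gt_sketch(grad, x0, W, alpha=0.05, rounds=100, local_steps=5):
    """Illustrative sketch of gradient tracking with local updates.

    grad(i, x)  : local gradient of agent i at point x (length-d array)
    x0          : (n, d) array of initial iterates, one row per agent
    W           : (n, n) doubly stochastic mixing matrix of the network
    alpha       : step size
    rounds      : number of communication rounds
    local_steps : local recursions performed between communications
    """
    n, _ = x0.shape
    x = x0.copy()
    y = np.stack([grad(i, x[i]) for i in range(n)])   # trackers start at the local gradients
    g_prev = y.copy()

    for _ in range(rounds):
        # Local phase: each agent updates on its own, without communication.
        for _ in range(local_steps):
            x = x - alpha * y                          # descend along the tracked direction
            g_new = np.stack([grad(i, x[i]) for i in range(n)])
            y = y + g_new - g_prev                     # gradient-tracking correction
            g_prev = g_new
        # Communication phase: mix iterates and trackers with neighbors.
        x = W @ x
        y = W @ y

    return x.mean(axis=0)

# Illustrative usage: 3 agents with quadratic costs f_i(x) = 0.5 * ||x - b_i||^2.
b = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
W = np.full((3, 3), 0.25) + 0.25 * np.eye(3)          # doubly stochastic, fully connected
x_hat = lu_gt_sketch(lambda i, x: x - b[i], np.zeros((3, 2)), W)
print(x_hat)                                           # approaches the average of the b_i, i.e., (1, 1)
```

In this sketch, each agent descends along its tracker for local_steps iterations, refreshing the tracker with local gradient differences, and only then averages both iterates and trackers with its neighbors through the mixing matrix $W$; communication cost therefore scales with the number of rounds rather than with the total number of gradient steps.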
Related papers
- Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning? [50.03434441234569]
Federated Learning (FL) has gained significant popularity due to its effectiveness in training machine learning models across diverse sites without requiring direct data sharing.
While various algorithms have shown that FL with local updates is a communication-efficient distributed learning framework, the generalization performance of FL with local updates has received comparatively less attention.
arXiv Detail & Related papers (2024-09-05T19:00:18Z)
- The Effectiveness of Local Updates for Decentralized Learning under Data Heterogeneity [2.845817138242963]
We revisit two fundamental decentralized optimization methods, Decentralized Gradient Tracking (DGT) and Decentralized Gradient Descent (DGD).
We demonstrate that incorporating $K > 1$ local update steps can reduce communication complexity.
arXiv Detail & Related papers (2024-03-23T00:01:34Z)
- FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy [84.45004766136663]
Federated learning is an emerging distributed machine learning framework.
It suffers from non-vanishing biases introduced by locally inconsistent optima and from rugged client drift caused by local over-fitting.
We propose a novel and practical method, FedSpeed, to alleviate the negative impacts posed by these problems.
arXiv Detail & Related papers (2023-02-21T03:55:29Z)
- Decentralized Gradient Tracking with Local Steps [37.41185751866015]
Gradient tracking (GT) is an algorithm designed for solving decentralized optimization problems over a network.
A key feature of GT is a tracking mechanism that allows it to overcome heterogeneity between nodes (the standard tracking recursion is written out just after this list).
arXiv Detail & Related papers (2023-01-03T19:06:41Z)
- A Communication-efficient Algorithm with Linear Convergence for Federated Minimax Learning [1.713291434132985]
We study a large-scale multi-agent minimax optimization problem, which models Generative Adversarial Networks (GANs).
The overall objective is a sum of agents' private local objective functions.
We show that FedGDA-GT converges linearly with a constant stepsize to a global $\epsilon$-GDA solution.
arXiv Detail & Related papers (2022-06-02T16:31:16Z)
- The Role of Local Steps in Local SGD [2.3572498744567123]
We consider the distributed optimization problem where $n$ agents want to minimize a global function given by the sum of their local functions.
We study the effect of local steps on the convergence rate and communication complexity of Local SGD.
arXiv Detail & Related papers (2022-03-14T00:54:43Z)
- Escaping Saddle Points with Bias-Variance Reduced Local Perturbed SGD for Communication Efficient Nonconvex Distributed Learning [58.79085525115987]
Local methods are one of the promising approaches to reduce communication time.
We show that the communication complexity is better than that of non-local methods when the size of the local datasets is smaller than the smoothness of the local loss.
arXiv Detail & Related papers (2022-02-12T15:12:17Z)
- An Entropy-guided Reinforced Partial Convolutional Network for Zero-Shot Learning [77.72330187258498]
We propose a novel Entropy-guided Reinforced Partial Convolutional Network (ERPCNet)
ERPCNet extracts and aggregates localities based on semantic relevance and visual correlations without human-annotated regions.
It not only discovers global-cooperative localities dynamically but also converges faster for policy gradient optimization.
arXiv Detail & Related papers (2021-11-03T11:13:13Z)
- Distributed Q-Learning with State Tracking for Multi-agent Networked Control [61.63442612938345]
This paper studies distributed Q-learning for Linear Quadratic Regulator (LQR) in a multi-agent network.
We devise a state tracking (ST) based Q-learning algorithm to design optimal controllers for agents.
arXiv Detail & Related papers (2020-12-22T22:03:49Z)
- Communication-efficient Decentralized Local SGD over Undirected Networks [2.3572498744567123]
We consider the distributed learning problem where a network of $n$ agents seeks to minimize a global function $F$.
We analyze the trade-off between the number of communication rounds and the computational effort of each agent.
Our results show that by using only $R=\Omega(n)$ communication rounds, one can achieve an error that scales as $O(1/nT)$.
arXiv Detail & Related papers (2020-11-06T09:34:00Z)
- A Unified Theory of Decentralized SGD with Changing Topology and Local Updates [70.9701218475002]
We introduce a unified convergence analysis of decentralized communication methods.
We derive universal convergence rates for several applications.
Our proofs rely on weak assumptions.
arXiv Detail & Related papers (2020-03-23T17:49:15Z)
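For reference, since several of the entries above rely on it, the tracking mechanism of gradient tracking is commonly written as the recursion below. This is the standard (non-ATC, DIGing-style) form with mixing weights $w_{ij}$ and step size $\alpha$; the papers listed here analyze variants of it:

$x_i^{k+1} = \sum_{j} w_{ij} x_j^{k} - \alpha y_i^{k}$
$y_i^{k+1} = \sum_{j} w_{ij} y_j^{k} + \nabla f_i(x_i^{k+1}) - \nabla f_i(x_i^{k})$, with $y_i^{0} = \nabla f_i(x_i^{0})$.

Summing the tracker update over all agents shows that $\frac{1}{n}\sum_i y_i^{k} = \frac{1}{n}\sum_i \nabla f_i(x_i^{k})$ for every $k$ when $W$ is doubly stochastic, which is the property that lets the trackers correct for heterogeneity between nodes.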