Related papers: On the Computation-Communication Trade-Off with A Flexible Gradient Tracking Approach

URL: http://arxiv.org/abs/2306.07159v1
Date: Mon, 12 Jun 2023 14:46:21 GMT
Title: On the Computation-Communication Trade-Off with A Flexible Gradient Tracking Approach
Authors: Yan Huang and Jinming Xu
Abstract summary: We propose a flexible gradient tracking approach with adjustable computation and communication steps for solving distributed optimization problems over networks. We derive both the computation and communication complexities for achieving arbitrary accuracy on smooth and strongly convex objective functions.
Score: 6.877328172726638
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a flexible gradient tracking approach with adjustable computation and communication steps for solving distributed stochastic optimization problems over networks. The proposed method allows each node to perform multiple local gradient updates and multiple inter-node communications in each round, aiming to strike a balance between computation and communication costs according to the properties of the objective functions and the network topology in non-i.i.d. settings. Leveraging a properly designed Lyapunov function, we derive both the computation and communication complexities for achieving arbitrary accuracy on smooth and strongly convex objective functions. Our analysis demonstrates a sharp dependence of the convergence performance on the graph topology and the properties of the objective functions, highlighting the trade-off between computation and communication. Numerical experiments are conducted to validate our theoretical findings.
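As a rough illustration of the idea in the abstract (several local gradient updates followed by several communication steps per round), the following is a minimal sketch and not the authors' algorithm. It assumes deterministic gradients rather than stochastic ones, scalar quadratic local objectives f_i(x) = 0.5 * a_i * (x - b_i)^2, and a ring network with uniform mixing weights of 1/3. The function name `flexible_gradient_tracking` and the parameters `local_steps`, `comm_steps`, `alpha`, and `rounds` are hypothetical names chosen for this example.

```python
def flexible_gradient_tracking(a, b, alpha=0.02, local_steps=2,
                               comm_steps=2, rounds=600):
    """Illustrative gradient tracking with adjustable local/communication steps.

    Assumptions (not from the paper): node i holds f_i(x) = 0.5*a[i]*(x-b[i])**2,
    gradients are exact, and nodes form a ring with mixing weights 1/3.
    """
    n = len(a)

    def grad(i, x):
        # Exact gradient of the local quadratic objective at node i.
        return a[i] * (x - b[i])

    def mix(v):
        # One gossip step: average own value with the two ring neighbours.
        return [(v[i - 1] + v[i] + v[(i + 1) % n]) / 3.0 for i in range(n)]

    x = [0.0] * n                              # local model copies
    g_prev = [grad(i, x[i]) for i in range(n)]  # last gradient each node evaluated
    y = list(g_prev)                           # gradient trackers

    for _ in range(rounds):
        # Computation phase: several local updates along the tracked direction.
        for _ in range(local_steps):
            for i in range(n):
                x[i] -= alpha * y[i]
                g_new = grad(i, x[i])
                y[i] += g_new - g_prev[i]      # keeps sum(y) equal to sum(g_prev)
                g_prev[i] = g_new
        # Communication phase: several mixing steps on models and trackers.
        for _ in range(comm_steps):
            x = mix(x)
            y = mix(y)
    return x


# Heterogeneous (non-i.i.d.-like) local objectives on five nodes.
a = [1.0, 1.5, 2.0, 1.2, 1.8]
b = [-2.0, 0.5, 3.0, 1.0, -1.0]
x_star = sum(ai * bi for ai, bi in zip(a, b)) / sum(a)  # minimizer of the average
x_final = flexible_gradient_tracking(a, b)
print(x_final, x_star)
```

Raising `local_steps` spends more computation per round, while raising `comm_steps` spends more communication per round; the step size and both counts here are illustrative choices, not values taken from the paper.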

Related papers

Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates reliance on the client-server architecture. Non-smooth regularization is often incorporated into machine learning tasks. We propose a novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z)
Communication-Efficient Stochastic Distributed Learning [3.2923780772605595]
We address distributed learning problems, both nonconvex and convex, over undirected networks. In particular, we design a novel algorithm based on the distributed Alternating Direction Method of Multipliers (ADMM) to address the challenges of high communication costs.
arXiv Detail & Related papers (2025-01-23T10:05:23Z)
Decentralized Federated Learning with Gradient Tracking over Time-Varying Directed Networks [42.92231921732718]
We propose a consensus-based algorithm called DSGTm-TV. It incorporates gradient tracking and heavy-ball momentum to optimize a global objective function. Under DSGTm-TV, agents will update local model parameters and gradient estimates using information exchange with neighboring agents.
arXiv Detail & Related papers (2024-09-25T06:23:16Z)
Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on federated learning (FL) via over-the-air computation (AirComp). We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings. For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting in AirFedAvg may cause an aggregation error. In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z)
Personalized Decentralized Multi-Task Learning Over Dynamic Communication Graphs [59.96266198512243]
We propose a decentralized and federated learning algorithm for tasks that are positively and negatively correlated. Our algorithm uses gradients to calculate the correlations among tasks automatically, and dynamically adjusts the communication graph to connect mutually beneficial tasks and isolate those that may negatively impact each other. We conduct experiments on a synthetic Gaussian dataset and a large-scale celebrity attributes (CelebA) dataset.
arXiv Detail & Related papers (2022-12-21T18:58:24Z)
ECO-TR: Efficient Correspondences Finding Via Coarse-to-Fine Refinement [80.94378602238432]
We propose an efficient structure named Efficient Correspondence Transformer (ECO-TR) that finds correspondences in a coarse-to-fine manner. To achieve this, multiple transformer blocks are connected stage-wise to gradually refine the predicted coordinates. Experiments on various sparse and dense matching tasks demonstrate the superiority of our method in both efficiency and effectiveness against existing state-of-the-art methods.
arXiv Detail & Related papers (2022-09-25T13:05:33Z)
Green, Quantized Federated Learning over Wireless Networks: An Energy-Efficient Design [68.86220939532373]
The finite precision level is captured through the use of quantized neural networks (QNNs) that quantize weights and activations in fixed-precision format. The proposed FL framework can reduce energy consumption until convergence by up to 70% compared to a baseline FL algorithm.
arXiv Detail & Related papers (2022-07-19T16:37:24Z)
Fundamental Limits of Communication Efficiency for Model Aggregation in Distributed Learning: A Rate-Distortion Approach [54.311495894129585]
We study the limit of communication cost of model aggregation in distributed learning from a rate-distortion perspective. It is found that the communication gain by exploiting the correlation between worker nodes is significant for SignSGD.
arXiv Detail & Related papers (2022-06-28T13:10:40Z)
Push-Pull with Device Sampling [8.344476599818826]
We consider decentralized optimization problems in which a number of agents collaborate to minimize the average of their local functions by exchanging information over an underlying communication graph. We propose an algorithm that combines gradient tracking and variance reduction over the entire network. Our theoretical analysis shows that the algorithm converges linearly when the local objective functions are strongly convex.
arXiv Detail & Related papers (2022-06-08T18:18:18Z)
Data-heterogeneity-aware Mixing for Decentralized Learning [63.83913592085953]
We characterize the dependence of convergence on the relationship between the mixing weights of the graph and the data heterogeneity across nodes. We propose a metric that quantifies the ability of a graph to mix the current gradients. Motivated by our analysis, we propose an approach that periodically and efficiently optimizes the metric.
arXiv Detail & Related papers (2022-04-13T15:54:35Z)
Neural Network Approximations of Compositional Functions With Applications to Dynamical Systems [3.660098145214465]
We develop an approximation theory for compositional functions and their neural network approximations. We identify a set of key features of compositional functions and the relationship between the features and the complexity of neural networks. In addition to function approximations, we prove several formulae of error upper bounds for neural networks.
arXiv Detail & Related papers (2020-12-03T04:40:25Z)
Federated Learning with Compression: Unified Analysis and Sharp Guarantees [39.092596142018195]
Communication cost is often a critical bottleneck to scale up distributed optimization algorithms to collaboratively learn a model from millions of devices. Two notable trends to deal with the communication overhead of federated algorithms are gradient compression and local computation with periodic communication. We analyze the convergence of such algorithms in both homogeneous and heterogeneous data distribution settings.
arXiv Detail & Related papers (2020-07-02T14:44:07Z)
Communication-efficient Variance-reduced Stochastic Gradient Descent [0.0]
We consider the problem of communication efficient distributed optimization. In particular, we focus on the variance-reduced gradient and propose a novel approach to make it communication-efficient. Comprehensive theoretical and numerical analyses on real datasets reveal that our algorithm can significantly reduce the communication complexity, by as much as 95%, with almost no noticeable penalty.
arXiv Detail & Related papers (2020-03-10T13:22:16Z)

This list is automatically generated from the titles and abstracts of the papers on this site.