Intermittent Pulling with Local Compensation for Communication-Efficient
Federated Learning
- URL: http://arxiv.org/abs/2001.08277v1
- Date: Wed, 22 Jan 2020 20:53:14 GMT
- Title: Intermittent Pulling with Local Compensation for Communication-Efficient
Federated Learning
- Authors: Haozhao Wang, Zhihao Qu, Song Guo, Xin Gao, Ruixuan Li, and Baoliu Ye
- Abstract summary: Federated Learning is a powerful machine learning paradigm to train a global model with highly distributed data.
A major bottleneck in the performance of distributed SGD is the communication overhead of pushing local gradients and pulling the global model.
We propose a novel approach named Pulling Reduction with Local Compensation (PRLC) to reduce this communication overhead.
- Score: 20.964434898554344
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning is a powerful machine learning paradigm to cooperatively
train a global model with highly distributed data. A major bottleneck in the
performance of the distributed Stochastic Gradient Descent (SGD) algorithm for
large-scale Federated Learning is the communication overhead of pushing local
gradients and pulling the global model. In this paper, to reduce the communication
complexity of Federated Learning, a novel approach named Pulling Reduction with
Local Compensation (PRLC) is proposed. Specifically, each training node
intermittently pulls the global model from the server during SGD iterations,
so it is sometimes unsynchronized with the server. In such a case, it uses its
local update to compensate for the gap between the local model and the global
model. Our rigorous theoretical analysis of PRLC yields two important findings.
First, we prove that the convergence rate of PRLC
preserves the same order as the classical synchronous SGD for both
strongly-convex and non-convex cases with good scalability due to the linear
speedup with respect to the number of training nodes. Second, we show that PRLC
admits a lower pulling frequency than the existing pulling reduction method
without local compensation. We also conduct extensive experiments on various
machine learning models to validate our theoretical results. Experimental
results show that our approach achieves a significant pulling reduction over
the state-of-the-art methods; e.g., PRLC requires only half of the pulling
operations of LAG.
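To make the pulling/compensation schedule concrete, below is a minimal sketch of a
PRLC-style training loop. It is an illustrative reading of the abstract, not the
authors' implementation: the toy quadratic objective, the fixed pull_interval, the
learning rate, and the gradient-averaging server are assumptions made for the example.

    import numpy as np

    # PRLC-style sketch: every node pushes a stochastic gradient each step,
    # but pulls the global model only every `pull_interval` steps; between
    # pulls it keeps updating its own local copy with its local gradients
    # (the "local compensation" for the missing global model).
    rng = np.random.default_rng(0)
    dim, num_nodes, steps, lr, pull_interval = 10, 4, 100, 0.1, 5

    # Toy per-node objective f_i(x) = 0.5 * ||x - t_i||^2 (assumed for illustration).
    targets = rng.normal(size=(num_nodes, dim))

    global_model = np.zeros(dim)
    local_models = [global_model.copy() for _ in range(num_nodes)]

    for step in range(steps):
        grads = []
        for i in range(num_nodes):
            noise = 0.01 * rng.normal(size=dim)        # stochastic gradient noise
            g = (local_models[i] - targets[i]) + noise
            grads.append(g)
            local_models[i] -= lr * g                  # local compensation update
        global_model -= lr * np.mean(grads, axis=0)    # server applies averaged push
        if (step + 1) % pull_interval == 0:            # intermittent pull
            for i in range(num_nodes):
                local_models[i] = global_model.copy()

    print("gap to average target:",
          np.linalg.norm(global_model - targets.mean(axis=0)))

Under this reading, each node performs only one pull every pull_interval iterations,
which is where the claimed reduction in pulling operations comes from.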
Related papers
- Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
arXiv Detail & Related papers (2024-03-27T09:14:36Z) - Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on Federated Learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For the different types of local updates that edge devices can transmit (i.e., model, gradient, model difference), we reveal that transmission in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
arXiv Detail & Related papers (2023-10-16T05:49:28Z) - FedSpeed: Larger Local Interval, Less Communication Round, and Higher
Generalization Accuracy [84.45004766136663]
Federated learning is an emerging distributed machine learning framework.
It suffers from non-vanishing biases introduced by inconsistent local optima and from rugged client drifts caused by local over-fitting.
We propose a novel and practical method, FedSpeed, to alleviate the negative impacts posed by these problems.
arXiv Detail & Related papers (2023-02-21T03:55:29Z) - Integrating Local Real Data with Global Gradient Prototypes for
Classifier Re-Balancing in Federated Long-Tailed Learning [60.41501515192088]
Federated Learning (FL) has become a popular distributed learning paradigm that involves multiple clients training a global model collaboratively.
The data samples usually follow a long-tailed distribution in the real world, and FL on the decentralized and long-tailed data yields a poorly-behaved global model.
In this work, we integrate the local real data with the global gradient prototypes to form the local balanced datasets.
arXiv Detail & Related papers (2023-01-25T03:18:10Z) - Decentralized Event-Triggered Federated Learning with Heterogeneous
Communication Thresholds [12.513477328344255]
We propose a novel methodology for distributed model aggregations via asynchronous, event-triggered consensus iterations over a network graph topology.
We demonstrate that our methodology achieves the globally optimal learning model under standard assumptions in distributed learning and graph consensus literature.
arXiv Detail & Related papers (2022-04-07T20:35:37Z) - Fine-tuning Global Model via Data-Free Knowledge Distillation for
Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraint.
We propose a data-free knowledge distillation method to fine-tune the global model in the server (FedFTG)
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z) - Parallel Successive Learning for Dynamic Distributed Model Training over
Heterogeneous Wireless Networks [50.68446003616802]
Federated learning (FedL) has emerged as a popular technique for distributing model training over a set of wireless devices.
We develop parallel successive learning (PSL), which expands the FedL architecture along three dimensions.
Our analysis sheds light on the notion of cold vs. warmed up models, and model inertia in distributed machine learning.
arXiv Detail & Related papers (2022-02-07T05:11:01Z) - Learn Locally, Correct Globally: A Distributed Algorithm for Training
Graph Neural Networks [22.728439336309858]
We propose a communication-efficient distributed GNN training technique named Learn Locally, Correct Globally (LLCG).
In LLCG, each machine trains a GNN on its local data, ignoring the dependency between nodes that reside on different machines, and then sends the locally trained model to the server for periodic model averaging.
We rigorously analyze the convergence of distributed methods with periodic model averaging for training GNNs and show that naively applying periodic model averaging but ignoring the dependency between nodes will suffer from an irreducible residual error.
arXiv Detail & Related papers (2021-11-16T03:07:01Z) - Local Stochastic Gradient Descent Ascent: Convergence Analysis and
Communication Efficiency [15.04034188283642]
Local SGD is a promising approach to overcome the communication overhead in distributed learning.
We show that local SGDA can provably optimize distributed minimax problems in both the homogeneous and heterogeneous data settings.
arXiv Detail & Related papers (2021-02-25T20:15:18Z) - Federated Learning with Communication Delay in Edge Networks [5.500965885412937]
Federated learning has received significant attention as a potential solution for distributing machine learning (ML) model training through edge networks.
This work addresses an important consideration of federated learning at the network edge: communication delays between the edge nodes and the aggregator.
A technique called FedDelAvg (federated delayed averaging) is developed, which generalizes the standard federated averaging algorithm to incorporate a weighting between the current local model and the delayed global model received at each device during the synchronization step (a minimal illustrative sketch of this weighting follows this list).
arXiv Detail & Related papers (2020-08-21T06:21:35Z)
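For the FedDelAvg entry above, the synchronization step can be pictured as a convex
combination of the device's current local model and the delayed global model it
received. The function and the mixing weight gamma below are illustrative
assumptions, not the paper's exact formulation.

    import numpy as np

    def feddelavg_sync(local_model: np.ndarray,
                       delayed_global_model: np.ndarray,
                       gamma: float = 0.5) -> np.ndarray:
        """Illustrative FedDelAvg-style synchronization (assumed form):
        blend the current local model with the delayed global model instead
        of overwriting it outright as in standard FedAvg."""
        return gamma * local_model + (1.0 - gamma) * delayed_global_model

    # gamma = 0 recovers the standard FedAvg overwrite; gamma = 1 ignores
    # the stale global model entirely.
    local = np.array([1.0, 2.0])
    stale_global = np.array([0.0, 0.0])
    print(feddelavg_sync(local, stale_global, gamma=0.25))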