Stragglers Are Not Disaster: A Hybrid Federated Learning Algorithm with
Delayed Gradients
- URL: http://arxiv.org/abs/2102.06329v1
- Date: Fri, 12 Feb 2021 02:27:44 GMT
- Title: Stragglers Are Not Disaster: A Hybrid Federated Learning Algorithm with
Delayed Gradients
- Authors: Xingyu Li, Zhe Qu, Bo Tang, Zhuo Lu
- Abstract summary: Federated learning (FL) is a new machine learning framework which trains a joint model across a large amount of decentralized computing devices.
This paper presents a novel FL algorithm, namely Hybrid Federated Learning (HFL), to achieve a learning balance in efficiency and effectiveness.
- Score: 21.63719641718363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a new machine learning framework which trains a
joint model across a large amount of decentralized computing devices. Existing
methods, e.g., Federated Averaging (FedAvg), are able to provide an
optimization guarantee by synchronously training the joint model, but usually
suffer from stragglers, i.e., IoT devices with low computing power or
communication bandwidth, especially on heterogeneous optimization problems. To
mitigate the influence of stragglers, this paper presents a novel FL algorithm,
namely Hybrid Federated Learning (HFL), to achieve a learning balance in
efficiency and effectiveness. It consists of two major components: synchronous
kernel and asynchronous updater. Unlike traditional synchronous FL methods, our
HFL introduces the asynchronous updater which actively pulls unsynchronized and
delayed local weights from stragglers. An adaptive approximation method,
Adaptive Delayed-SGD (AD-SGD), is proposed to merge the delayed local updates
into the joint model. The theoretical analysis of HFL shows that the
convergence rate of the proposed algorithm is $\mathcal{O}(\frac{1}{t+\tau})$
for both convex and non-convex optimization problems.
Related papers
- Robust Model Aggregation for Heterogeneous Federated Learning: Analysis and Optimizations [35.58487905412915]
We propose a time-driven SFL (T-SFL) framework for heterogeneous systems.
To evaluate the learning performance of T-SFL, we provide an upper bound on the global loss function.
We develop a discriminative model selection algorithm that removes local models from clients whose number of iterations falls below a predetermined threshold.
arXiv Detail & Related papers (2024-05-11T11:55:26Z) - Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning.
As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers.
We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
arXiv Detail & Related papers (2024-03-27T09:14:36Z) - AEDFL: Efficient Asynchronous Decentralized Federated Learning with
Heterogeneous Devices [61.66943750584406]
We propose an Asynchronous Efficient Decentralized FL framework, i.e., AEDFL, in heterogeneous environments.
First, we propose an asynchronous FL system model with an efficient model aggregation method for improving the FL convergence.
Second, we propose a dynamic staleness-aware model update approach to achieve superior accuracy.
Third, we propose an adaptive sparse training method to reduce communication and computation costs without significant accuracy degradation.
arXiv Detail & Related papers (2023-12-18T05:18:17Z) - Vertical Federated Learning over Cloud-RAN: Convergence Analysis and
System Optimization [82.12796238714589]
We propose a novel cloud radio access network (Cloud-RAN) based vertical FL system to enable fast and accurate model aggregation.
We characterize the convergence behavior of the vertical FL algorithm considering both uplink and downlink transmissions.
We establish a system optimization framework by joint transceiver and fronthaul quantization design, for which successive convex approximation and alternate convex search based system optimization algorithms are developed.
arXiv Detail & Related papers (2023-05-04T09:26:03Z) - Delay-Aware Hierarchical Federated Learning [7.292078085289465]
The paper introduces delay-aware hierarchical federated learning (DFL) to improve the efficiency of distributed machine learning (ML) model training.
During global synchronization, the cloud server consolidates local models with an outdated global model using a convex control algorithm.
Numerical evaluations show DFL's superior performance in terms of faster global model, reduced convergence resource, and evaluations against communication delays.
arXiv Detail & Related papers (2023-03-22T09:23:29Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Predictive GAN-powered Multi-Objective Optimization for Hybrid Federated
Split Learning [56.125720497163684]
We propose a hybrid federated split learning framework in wireless networks.
We design a parallel computing scheme for model splitting without label sharing, and theoretically analyze the influence of the delayed gradient caused by the scheme on the convergence speed.
arXiv Detail & Related papers (2022-09-02T10:29:56Z) - Time-triggered Federated Learning over Wireless Networks [48.389824560183776]
We present a time-triggered FL algorithm (TT-Fed) over wireless networks.
Our proposed TT-Fed algorithm improves the converged test accuracy by up to 12.5% and 5%, respectively.
arXiv Detail & Related papers (2022-04-26T16:37:29Z) - Resource-Efficient and Delay-Aware Federated Learning Design under Edge
Heterogeneity [10.702853653891902]
Federated learning (FL) has emerged as a popular methodology for distributing machine learning across wireless edge devices.
In this work, we consider optimizing the tradeoff between model performance and resource utilization in FL.
Our proposed StoFedDelAv incorporates a localglobal model combiner into the FL computation step.
arXiv Detail & Related papers (2021-12-27T22:30:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.