Related papers: Stragglers Are Not Disaster: A Hybrid Federated Learning Algorithm with Delayed Gradients

URL: http://arxiv.org/abs/2102.06329v1
Date: Fri, 12 Feb 2021 02:27:44 GMT
Title: Stragglers Are Not Disaster: A Hybrid Federated Learning Algorithm with Delayed Gradients
Authors: Xingyu Li, Zhe Qu, Bo Tang, Zhuo Lu
Abstract summary: Federated learning (FL) is a new machine learning framework which trains a joint model across a large number of decentralized computing devices. This paper presents a novel FL algorithm, namely Hybrid Federated Learning (HFL), to balance learning efficiency and effectiveness.
Score: 21.63719641718363
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Federated learning (FL) is a new machine learning framework which trains a joint model across a large number of decentralized computing devices. Existing methods, e.g., Federated Averaging (FedAvg), are able to provide an optimization guarantee by synchronously training the joint model, but usually suffer from stragglers, i.e., IoT devices with low computing power or communication bandwidth, especially on heterogeneous optimization problems. To mitigate the influence of stragglers, this paper presents a novel FL algorithm, namely Hybrid Federated Learning (HFL), to balance learning efficiency and effectiveness. It consists of two major components: a synchronous kernel and an asynchronous updater. Unlike traditional synchronous FL methods, HFL introduces an asynchronous updater which actively pulls unsynchronized and delayed local weights from stragglers. An adaptive approximation method, Adaptive Delayed-SGD (AD-SGD), is proposed to merge the delayed local updates into the joint model. The theoretical analysis of HFL shows that the convergence rate of the proposed algorithm is $\mathcal{O}(\frac{1}{t+\tau})$ for both convex and non-convex optimization problems.
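
Illustrative sketch (not from the paper): the abstract describes a synchronous kernel that aggregates on-time clients and an asynchronous updater that merges delayed weights from stragglers. The Python below shows one way such a hybrid round could be structured. It does not implement AD-SGD; the merge rule is a generic staleness-discounted mix chosen for illustration, and the names `hybrid_round`, `staleness_discount`, and `base_mix` are hypothetical.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def average(weights: Sequence[Vector]) -> Vector:
    """Element-wise mean of equally weighted client weight vectors."""
    n = len(weights)
    return [sum(column) / n for column in zip(*weights)]


def staleness_discount(delay: int, base_mix: float = 0.5) -> float:
    """Hypothetical mixing coefficient that shrinks as the delay grows.

    This is an assumption for illustration, not the AD-SGD rule.
    """
    return base_mix / (1.0 + delay)


def hybrid_round(
    on_time: Sequence[Vector],
    delayed: Sequence[Tuple[Vector, int]],
    base_mix: float = 0.5,
) -> Vector:
    """One hybrid round: synchronous average, then merge straggler weights.

    on_time: weight vectors from clients that met the round deadline.
    delayed: (weights, delay_in_rounds) pairs pulled from stragglers.
    """
    # Synchronous step: plain averaging over the on-time clients.
    joint = average(on_time)
    # Asynchronous step: fold in each delayed vector, trusting it less
    # the more rounds it lags behind.
    for weights, delay in delayed:
        alpha = staleness_discount(delay, base_mix)
        joint = [(1.0 - alpha) * j + alpha * w for j, w in zip(joint, weights)]
    return joint


if __name__ == "__main__":
    on_time = [[1.0, 2.0], [3.0, 4.0]]
    delayed = [([4.0, 5.0], 1)]  # one straggler, one round late
    print(hybrid_round(on_time, delayed))
```

In this sketch a straggler that is one round late contributes with weight 0.25, so its update still influences the joint model instead of being discarded. The actual weighting in HFL is determined by AD-SGD as defined in the paper.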

Related papers

Adaptive Deadline and Batch Layered Synchronized Federated Learning [66.93447103966439]
Federated learning (FL) enables collaborative model training across distributed edge devices while preserving data privacy, and typically operates in a round-based synchronous manner. We propose ADEL-FL, a novel framework that jointly optimizes per-round deadlines and user-specific batch sizes for layer-wise aggregation.
arXiv Detail & Related papers (2025-05-29T19:59:18Z)
Biased Federated Learning under Wireless Heterogeneity [7.3716675761469945]
Federated learning (FL) is a promising framework for computation, enabling collaborative model training without sharing private data. Existing wireless computation works primarily adopt two communication strategies: (1) over-the-air (OTA) computation, which exploits wireless signal superposition, and (2) digital communication, which allocates resources for convergence. This paper proposes novel OTA and digital FL updates that allow a structured, time-in-place bias, thereby reducing variance in FL updates.
arXiv Detail & Related papers (2025-03-08T05:55:14Z)
Robust Model Aggregation for Heterogeneous Federated Learning: Analysis and Optimizations [35.58487905412915]
We propose a time-driven SFL (T-SFL) framework for heterogeneous systems. To evaluate the learning performance of T-SFL, we provide an upper bound on the global loss function. We develop a discriminative model selection algorithm that removes local models from clients whose number of iterations falls below a predetermined threshold.
arXiv Detail & Related papers (2024-05-11T11:55:26Z)
Stragglers-Aware Low-Latency Synchronous Federated Learning via Layer-Wise Model Updates [71.81037644563217]
Synchronous federated learning (FL) is a popular paradigm for collaborative edge learning. As some of the devices may have limited computational resources and varying availability, FL latency is highly sensitive to stragglers. We propose straggler-aware layer-wise federated learning (SALF) that leverages the optimization procedure of NNs via backpropagation to update the global model in a layer-wise fashion.
arXiv Detail & Related papers (2024-03-27T09:14:36Z)
AEDFL: Efficient Asynchronous Decentralized Federated Learning with Heterogeneous Devices [61.66943750584406]
We propose an Asynchronous Efficient Decentralized FL framework, i.e., AEDFL, in heterogeneous environments. First, we propose an asynchronous FL system model with an efficient model aggregation method for improving the FL convergence. Second, we propose a dynamic staleness-aware model update approach to achieve superior accuracy. Third, we propose an adaptive sparse training method to reduce communication and computation costs without significant accuracy degradation.
arXiv Detail & Related papers (2023-12-18T05:18:17Z)
Vertical Federated Learning over Cloud-RAN: Convergence Analysis and System Optimization [82.12796238714589]
We propose a novel cloud radio access network (Cloud-RAN) based vertical FL system to enable fast and accurate model aggregation. We characterize the convergence behavior of the vertical FL algorithm considering both uplink and downlink transmissions. We establish a system optimization framework by joint transceiver and fronthaul quantization design, for which successive convex approximation and alternate convex search based system optimization algorithms are developed.
arXiv Detail & Related papers (2023-05-04T09:26:03Z)
Delay-Aware Hierarchical Federated Learning [7.292078085289465]
The paper introduces delay-aware hierarchical federated learning (DFL) to improve the efficiency of distributed machine learning (ML) model training. During global synchronization, the cloud server consolidates local models with an outdated global model using a convex control algorithm. Numerical evaluations show DFL's superior performance in terms of faster global model convergence, reduced resource consumption, and robustness against communication delays.
arXiv Detail & Related papers (2023-03-22T09:23:29Z)
Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data. In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
Predictive GAN-powered Multi-Objective Optimization for Hybrid Federated Split Learning [56.125720497163684]
We propose a hybrid federated split learning framework in wireless networks. We design a parallel computing scheme for model splitting without label sharing, and theoretically analyze the influence of the delayed gradient caused by the scheme on the convergence speed.
arXiv Detail & Related papers (2022-09-02T10:29:56Z)
Time-triggered Federated Learning over Wireless Networks [48.389824560183776]
We present a time-triggered FL algorithm (TT-Fed) over wireless networks. Our proposed TT-Fed algorithm improves the converged test accuracy by up to 12.5% and 5%, respectively.
arXiv Detail & Related papers (2022-04-26T16:37:29Z)
Resource-Efficient and Delay-Aware Federated Learning Design under Edge Heterogeneity [10.702853653891902]
Federated learning (FL) has emerged as a popular methodology for distributing machine learning across wireless edge devices. In this work, we consider optimizing the tradeoff between model performance and resource utilization in FL. Our proposed StoFedDelAv incorporates a local-global model combiner into the FL computation step.
arXiv Detail & Related papers (2021-12-27T22:30:15Z)

This list is automatically generated from the titles and abstracts of the papers on this site.