Double Momentum SGD for Federated Learning
- URL: http://arxiv.org/abs/2102.03970v1
- Date: Mon, 8 Feb 2021 02:47:24 GMT
- Title: Double Momentum SGD for Federated Learning
- Authors: An Xu, Heng Huang
- Abstract summary: We propose a new SGD variant named DOMO to improve model performance in federated learning.
One momentum buffer tracks the server update direction, while the other tracks the local update direction.
We introduce a novel server momentum fusion technique to coordinate the server and local momentum SGD.
- Score: 94.58442574293021
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Communication efficiency is crucial in federated learning. Conducting many
local training steps in clients to reduce the communication frequency between
clients and the server is a common method to address this issue. However, the
client drift problem arises as the non-i.i.d. data distributions in different
clients can severely deteriorate the performance of federated learning. In this
work, we propose a new SGD variant named DOMO to improve the model
performance in federated learning, where double momentum buffers are
maintained. One momentum buffer tracks the server update direction, while the
other tracks the local update direction. We introduce a novel server momentum
fusion technique to coordinate the server and local momentum SGD. We also
provide the first theoretical analysis involving both the server and local
momentum SGD. Extensive experimental results show that DOMO achieves better
model performance than FedAvg and existing momentum SGD variants in federated
learning tasks.
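The abstract does not give the update rules, so the following Python sketch only mirrors its description: a server-side momentum buffer, a per-client local momentum buffer, and a fusion coefficient that injects the server direction into local steps. The function name domo_round, the hyperparameters (lr, beta_l, beta_s, fusion), and the toy quadratic clients are illustrative assumptions, not the authors' exact DOMO algorithm.
```python
import numpy as np

def domo_round(x_global, server_momentum, client_grads, lr=0.05,
               local_steps=10, beta_l=0.9, beta_s=0.9, fusion=0.1):
    """One communication round of a DOMO-style double-momentum scheme (sketch).

    client_grads: list of callables g_i(x) returning a stochastic gradient on
    client i's local data. One buffer (server_momentum) tracks the server
    update direction across rounds; each client keeps a second buffer for its
    local update direction, and `fusion` injects the server momentum into the
    local steps. Exact coefficients and update rules are assumptions.
    """
    deltas = []
    for grad in client_grads:
        x = x_global.copy()
        local_momentum = np.zeros_like(x)           # local momentum buffer
        for _ in range(local_steps):
            local_momentum = beta_l * local_momentum + grad(x)
            # Server momentum fusion: bias local descent toward the server direction.
            x -= lr * (local_momentum + fusion * server_momentum)
        deltas.append(x_global - x)                 # accumulated local update

    avg_delta = np.mean(deltas, axis=0)             # FedAvg-style aggregation
    server_momentum = beta_s * server_momentum + avg_delta
    return x_global - server_momentum, server_momentum


# Toy usage: two clients with shifted quadratic losses (non-i.i.d. optima).
targets = [np.array([1.0, -2.0]), np.array([3.0, 0.5])]
clients = [lambda x, t=t: x - t for t in targets]   # gradient of 0.5*||x - t||^2
x, m = np.zeros(2), np.zeros(2)
for _ in range(100):
    x, m = domo_round(x, m, clients)
print(x)  # drifts toward the average of the client optima, roughly [2.0, -0.75]
```
In this toy run the two clients pull toward different optima, and the fused server momentum keeps the local steps loosely aligned with the global update direction, which is the client-drift effect the abstract targets.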
Related papers
- FedMoE-DA: Federated Mixture of Experts via Domain Aware Fine-grained Aggregation [22.281467168796645]
Federated learning (FL) is a collaborative machine learning approach that enables multiple clients to train models without sharing their private data.
We propose FedMoE-DA, a new FL model training framework that incorporates a novel domain-aware, fine-grained aggregation strategy to enhance the robustness, personalizability, and communication efficiency simultaneously.
arXiv Detail & Related papers (2024-11-04T14:29:04Z)
- Achieving Linear Speedup in Asynchronous Federated Learning with Heterogeneous Clients [30.135431295658343]
Federated learning (FL) aims to learn a common global model without exchanging or transferring the data that are stored locally at different clients.
In this paper, we propose an efficient asynchronous federated learning (AFL) framework called DeFedAvg.
DeFedAvg is the first AFL algorithm that achieves the desirable linear speedup property, which indicates its high scalability.
arXiv Detail & Related papers (2024-02-17T05:22:46Z)
- Scheduling and Communication Schemes for Decentralized Federated Learning [0.31410859223862103]
A decentralized federated learning (DFL) model with the stochastic gradient descent (SGD) algorithm has been introduced.
Three scheduling policies for DFL have been proposed for communications between the clients and the parallel servers.
Results show that the proposed scheduling policies affect both the speed of convergence and the final global model.
arXiv Detail & Related papers (2023-11-27T17:35:28Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Joint Client Assignment and UAV Route Planning for Indirect-Communication Federated Learning [20.541942109704987]
A new framework called FedEx (Federated Learning via Model Express Delivery) is proposed.
It employs mobile transporters, such as UAVs, to establish indirect communication channels between the server and clients.
Two algorithms, FedEx-Sync and FedEx-Async, are proposed for synchronous and asynchronous learning at the transporter level.
arXiv Detail & Related papers (2023-04-21T04:47:54Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
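To make the cut-layer exchange mentioned above concrete, here is a minimal NumPy sketch of one split-learning step: the client sends only the smashed data forward and receives only the gradient at the cut layer. The two-layer model, the split point, and all names are hypothetical, and this illustrates the background SL protocol, not the contrastive-distillation approach that paper proposes.
```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-part model: the client owns W1 (up to the cut layer),
# the server owns W2 (after the cut layer). Shapes and names are illustrative.
W1 = rng.normal(scale=0.1, size=(8, 16))   # client-side weights
W2 = rng.normal(scale=0.1, size=(16, 1))   # server-side weights
lr = 0.1

def sl_training_step(x, y):
    """One split-learning step: the client releases cut-layer activations
    (smashed data); the server returns the gradient at the cut layer."""
    global W1, W2
    # --- client side: forward pass up to the cut layer ---
    pre = x @ W1
    h = np.maximum(pre, 0.0)               # smashed data sent to the server
    # --- server side: finish the forward pass, start backprop ---
    y_hat = h @ W2
    loss = 0.5 * np.mean((y_hat - y) ** 2)
    d_yhat = (y_hat - y) / len(x)
    d_h = d_yhat @ W2.T                    # gradient returned to the client
    W2 -= lr * (h.T @ d_yhat)              # server updates its own layer
    # --- client side: finish backprop with the returned gradient ---
    d_pre = d_h * (pre > 0)
    W1 -= lr * (x.T @ d_pre)
    return loss

x_batch = rng.normal(size=(32, 8))
y_batch = rng.normal(size=(32, 1))
print(sl_training_step(x_batch, y_batch))  # raw data never leaves the client
```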
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction [48.85303253333453]
Federated learning (FL) allows multiple clients to collectively train a high-performance global model without sharing their private data.
We propose a novel federated learning algorithm with local drift decoupling and correction (FedDC).
Our FedDC only introduces lightweight modifications in the local training phase, in which each client utilizes an auxiliary local drift variable to track the gap between the local model parameters and the global model parameters.
Experiment results and analysis demonstrate that FedDC yields faster convergence and better performance on various image classification tasks.
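As a rough illustration of the drift-variable bookkeeping described above (not FedDC's actual local objective, which adds penalty and gradient-correction terms, nor its exact aggregation rule), a client-side update might be sketched as:
```python
import numpy as np

def feddc_style_client_update(theta_global, drift, local_grad, lr=0.1, steps=10):
    """Hedged sketch of drift tracking in the spirit of FedDC: the client keeps
    an auxiliary drift variable recording the gap between its local parameters
    and the global parameters, and uses it to correct what it uploads.
    The real FedDC local objective and correction terms are in the paper."""
    theta = theta_global.copy()
    for _ in range(steps):
        # Plain local SGD here; FedDC additionally regularizes the local steps.
        theta -= lr * local_grad(theta)
    drift = drift + (theta - theta_global)   # track the local-vs-global gap
    corrected_upload = theta + drift         # drift-corrected parameters for the server
    return corrected_upload, drift

# Toy usage: a quadratic client whose optimum differs from the global model.
theta_g, drift = np.zeros(3), np.zeros(3)
upload, drift = feddc_style_client_update(theta_g, drift, lambda th: th - np.ones(3))
print(upload)   # local parameters plus accumulated drift
```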
arXiv Detail & Related papers (2022-03-22T14:06:26Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Byzantine-robust Federated Learning through Spatial-temporal Analysis of Local Model Updates [6.758334200305236]
Federated Learning (FL) enables multiple distributed clients (e.g., mobile devices) to collaboratively train a centralized model while keeping the training data locally on the client.
In this paper, we propose to mitigate Byzantine failures and attacks from a spatial-temporal perspective.
Specifically, we use a clustering-based method to detect and exclude incorrect updates by leveraging their geometric properties in the parameter space.
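A minimal NumPy sketch of the clustering-and-exclusion idea follows; it groups client updates by pairwise distance in parameter space and averages the densest group. This is an assumption-level simplification of the paper's spatial-temporal method, with all names hypothetical.
```python
import numpy as np

def filter_and_aggregate(updates):
    """Hedged sketch of clustering-based robust aggregation: find the densest
    neighborhood of updates in parameter space, drop everything outside it,
    and average what remains. The paper's actual method differs."""
    U = np.stack(updates)                              # (n_clients, dim)
    d = np.linalg.norm(U[:, None, :] - U[None, :, :], axis=-1)
    radius = np.median(d)                              # crude clustering radius
    neighbors = (d <= radius)                          # includes self
    seed = int(np.argmax(neighbors.sum(axis=1)))       # densest update
    keep = neighbors[seed]                             # its cluster of nearby updates
    return U[keep].mean(axis=0)

# Toy usage: 8 honest updates near the all-ones vector, 2 Byzantine outliers.
rng = np.random.default_rng(1)
honest = [np.ones(5) + 0.01 * rng.normal(size=5) for _ in range(8)]
byzantine = [10.0 * rng.normal(size=5) for _ in range(2)]
print(filter_and_aggregate(honest + byzantine))        # close to the all-ones vector
```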
arXiv Detail & Related papers (2021-07-03T18:48:11Z)
- Low-Latency Federated Learning over Wireless Channels with Differential Privacy [142.5983499872664]
In federated learning (FL), model training is distributed over clients and local models are aggregated by a central server.
In this paper, we aim to minimize FL training delay over wireless channels, constrained by overall training performance as well as each client's differential privacy (DP) requirement.
arXiv Detail & Related papers (2021-06-20T13:51:18Z)