DRAG: Divergence-based Adaptive Aggregation in Federated learning on
Non-IID Data
- URL: http://arxiv.org/abs/2309.01779v1
- Date: Mon, 4 Sep 2023 19:40:58 GMT
- Title: DRAG: Divergence-based Adaptive Aggregation in Federated learning on
Non-IID Data
- Authors: Feng Zhu, Jingjing Zhang, Shengyun Liu and Xin Wang
- Abstract summary: Local stochastic gradient descent (SGD) is a fundamental approach in achieving communication efficiency in Federated Learning (FL).
We introduce a novel metric called the "degree of divergence," quantifying the angle between the local gradient and the global reference direction.
We propose the divergence-based adaptive aggregation (DRAG) algorithm, which dynamically "drags" the received local updates toward the reference direction in each round without requiring extra communication overhead.
- Score: 11.830891255837788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Local stochastic gradient descent (SGD) is a fundamental approach in
achieving communication efficiency in Federated Learning (FL) by allowing
individual workers to perform local updates. However, the presence of
heterogeneous data distributions across working nodes causes each worker to
update its local model towards a local optimum, leading to the phenomenon known
as "client-drift" and resulting in slowed convergence. To address this issue,
previous works have explored methods that either introduce communication
overhead or suffer from unsteady performance. In this work, we introduce a
novel metric called "degree of divergence," quantifying the angle between the
local gradient and the global reference direction. Leveraging this metric, we
propose the divergence-based adaptive aggregation (DRAG) algorithm, which
dynamically "drags" the received local updates toward the reference direction
in each round without requiring extra communication overhead. Furthermore, we
establish a rigorous convergence analysis for DRAG, proving its ability to
achieve a sublinear convergence rate. Compelling experimental results are
presented to illustrate DRAG's superior performance compared to
state-of-the-art algorithms in effectively managing the client-drift
phenomenon. Additionally, DRAG exhibits remarkable resilience against certain
Byzantine attacks. By securely sharing a small sample of the client's data with
the FL server, DRAG effectively counters these attacks, as demonstrated through
comprehensive experiments.
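To make the described mechanism concrete, below is a minimal sketch of divergence-based dragging, assuming flattened update vectors and using the previous round's aggregated update as the reference direction; the exact divergence-to-weight rule and the `base_weight` knob are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def drag_aggregate(local_updates, reference, base_weight=0.5):
    """Illustrative sketch of divergence-based adaptive aggregation.

    local_updates : list of 1-D arrays, one flattened update per client
    reference     : 1-D array, global reference direction
                    (e.g. the previous round's aggregated update)
    base_weight   : assumed knob controlling how strongly a diverging
                    update is dragged toward the reference direction
    """
    ref_unit = reference / (np.linalg.norm(reference) + 1e-12)
    dragged = []
    for g in local_updates:
        g_norm = np.linalg.norm(g) + 1e-12
        # "Degree of divergence": angle between the local update and the reference.
        cos_angle = np.clip(np.dot(g, ref_unit) / g_norm, -1.0, 1.0)
        divergence = np.arccos(cos_angle)              # in [0, pi]
        # Drag more strongly the more the update diverges (assumed rule).
        drag = base_weight * (divergence / np.pi)      # in [0, base_weight]
        dragged.append((1.0 - drag) * g + drag * g_norm * ref_unit)
    # Plain averaging of the dragged updates; client weighting may differ in the paper.
    return np.mean(dragged, axis=0)
```

A natural choice for the reference is the previous round's aggregated update, which the server already holds, so the dragging step itself would add no extra communication, consistent with the claim above.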
Related papers
- Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an acceleration Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
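As a generic illustration of the Nesterov-style extrapolation component named above (not DFedCata's actual decentralized update, which also involves a Moreau-envelope term locally), aggregation-time extrapolation might look roughly as follows; all names are assumed.

```python
import numpy as np

def nesterov_extrapolated_aggregate(received_models, previous_global, beta=0.9):
    """Generic sketch of Nesterov-style extrapolation at aggregation time:
    average the received models, then extrapolate along the direction of
    recent global progress. beta is an assumed extrapolation coefficient."""
    averaged = np.mean(np.stack(received_models), axis=0)
    extrapolated = averaged + beta * (averaged - previous_global)
    return extrapolated, averaged  # new model and the point to remember as "previous"
```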
arXiv Detail & Related papers (2024-10-09T06:17:16Z)
- Decentralized Federated Learning with Gradient Tracking over Time-Varying Directed Networks [42.92231921732718]
We propose a consensus-based algorithm called DSGTm-TV.
It incorporates gradient tracking and heavy-ball momentum to optimize a global objective function.
Under DSGTm-TV, agents update their local model parameters and gradient estimates by exchanging information with neighboring agents.
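For context, a generic synchronous gradient-tracking step with heavy-ball momentum is sketched below, assuming a doubly stochastic mixing matrix; DSGTm-TV itself handles time-varying directed networks with row- and column-stochastic weights, so this is only an illustrative simplification with assumed names.

```python
import numpy as np

def gradient_tracking_momentum_step(x, x_prev, y, grads, grad_fn, W,
                                    alpha=0.1, beta=0.9):
    """One synchronous step of gradient tracking with heavy-ball momentum.

    x, x_prev : (n_agents, dim) current and previous local parameters
    y         : (n_agents, dim) gradient-tracking estimates
    grads     : (n_agents, dim) gradients evaluated at x
    grad_fn   : callable (agent_index, params) -> gradient of that agent's loss
    W         : (n_agents, n_agents) doubly stochastic mixing matrix
    """
    # Mix with neighbors, descend along the tracked direction, add momentum.
    x_new = W @ x - alpha * y + beta * (x - x_prev)
    new_grads = np.stack([grad_fn(i, x_new[i]) for i in range(x.shape[0])])
    # Gradient tracking: y asymptotically tracks the network-average gradient.
    y_new = W @ y + new_grads - grads
    return x_new, x, y_new, new_grads  # x becomes the next step's x_prev
```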
arXiv Detail & Related papers (2024-09-25T06:23:16Z)
- Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays [0.0]
Federated learning (FL) was recently proposed to securely train models with data held across multiple locations ("clients").
Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-IID local data distributions ("client drift").
We propose and analyze Asynchronous Exact Averaging (AREA), a new (sub)gradient algorithm that utilizes communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies.
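As a rough illustration of the client-memory idea described above (not the AREA algorithm itself), a server can keep each client's latest contribution and always average over all stored entries, so clients that report more frequently are not over-weighted; the class and method names below are assumed.

```python
import numpy as np

class AsyncAveragingServer:
    """Sketch of per-client memory for asynchronous averaging: the server keeps
    each client's latest contribution and averages over all stored entries, so
    frequently-updating clients do not dominate the average."""

    def __init__(self, n_clients, dim):
        self.memory = np.zeros((n_clients, dim))    # last contribution per client
        self.reported = np.zeros(n_clients, dtype=bool)

    def receive(self, client_id, update):
        # Replace (rather than accumulate) this client's stale entry.
        self.memory[client_id] = update
        self.reported[client_id] = True

    def estimate(self):
        # Average over clients that have reported at least once.
        if not self.reported.any():
            return None
        return self.memory[self.reported].mean(axis=0)
```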
arXiv Detail & Related papers (2024-05-16T14:22:49Z)
- FedImpro: Measuring and Improving Client Update in Federated Learning [77.68805026788836]
Federated Learning (FL) models often experience client drift caused by heterogeneous data.
We present an alternative perspective on client drift and aim to mitigate it by generating improved local models.
arXiv Detail & Related papers (2024-02-10T18:14:57Z)
- SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z)
- Momentum Benefits Non-IID Federated Learning Simply and Provably [22.800862422479913]
Federated learning is a powerful paradigm for large-scale machine learning.
FedAvg and SCAFFOLD are two prominent algorithms proposed to address the challenges of federated learning under heterogeneous data.
This paper explores the utilization of momentum to enhance the performance of FedAvg and SCAFFOLD.
arXiv Detail & Related papers (2023-06-28T18:52:27Z)
- FedAgg: Adaptive Federated Learning with Aggregated Gradients [1.5653612447564105]
We propose an adaptive FEDerated learning algorithm called FedAgg to alleviate the divergence between the local and average model parameters and obtain a fast model convergence rate.
We show that our framework is superior to existing state-of-the-art FL strategies for enhancing model performance and accelerating convergence rate under IID and Non-IID datasets.
arXiv Detail & Related papers (2023-03-28T08:07:28Z)
- Magnitude Matters: Fixing SIGNSGD Through Magnitude-Aware Sparsification in the Presence of Data Heterogeneity [60.791736094073]
Communication overhead has become one of the major bottlenecks in the distributed training of deep neural networks.
We propose a magnitude-driven sparsification scheme, which addresses the non-convergence issue of SIGNSGD.
The proposed scheme is validated through experiments on Fashion-MNIST, CIFAR-10, and CIFAR-100 datasets.
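As a rough illustration of magnitude-aware sparsification on top of sign-based compression (not necessarily the paper's exact scheme), one can transmit the signs of only the largest-magnitude coordinates together with a single scale factor; the `keep_ratio` parameter and function names are assumptions.

```python
import numpy as np

def magnitude_aware_sign_compress(grad, keep_ratio=0.01):
    """Sketch: transmit the signs of only the top-k largest-magnitude
    coordinates, plus one scale factor, so that magnitude information is
    not discarded entirely as in plain SIGNSGD."""
    k = max(1, int(keep_ratio * grad.size))
    idx = np.argpartition(np.abs(grad), -k)[-k:]    # indices of the top-k magnitudes
    signs = np.sign(grad[idx]).astype(np.int8)
    scale = np.abs(grad[idx]).mean()                # single scalar restores magnitude
    return idx, signs, scale

def magnitude_aware_sign_decompress(idx, signs, scale, dim):
    out = np.zeros(dim)
    out[idx] = scale * signs
    return out
```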
arXiv Detail & Related papers (2023-02-19T17:42:35Z)
- On the effectiveness of partial variance reduction in federated learning with heterogeneous data [27.527995694042506]
We show that the diversity of the final classification layers across clients impedes the performance of the FedAvg algorithm.
Motivated by this, we propose to correct the model drift by applying variance reduction only to the final layers.
We demonstrate that this significantly outperforms existing benchmarks at a similar or lower communication cost.
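To illustrate the idea of restricting variance reduction to the final layers (the exact correction used in the paper may differ), a SCAFFOLD-style control-variate correction could be applied to the classification head only, while the feature extractor takes a plain local SGD step; all names below are assumed.

```python
import numpy as np

def partial_vr_step(body_params, head_params, body_grad, head_grad,
                    c_local, c_global, lr=0.1):
    """Sketch: only the final classification layer ("head") receives a
    SCAFFOLD-style control-variate correction; the feature extractor ("body")
    takes a plain local SGD step.

    c_local / c_global : client and server control variates, head-shaped.
    """
    corrected_head_grad = head_grad - c_local + c_global
    new_body = body_params - lr * body_grad
    new_head = head_params - lr * corrected_head_grad
    return new_body, new_head
```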
arXiv Detail & Related papers (2022-12-05T11:56:35Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where each group of samples is treated with tailored learning objectives.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
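For reference, a minimal (biased) RBF-kernel MMD estimate between source-like and target-specific feature batches is sketched below; the kernel choice, bandwidth, and variable names are assumptions, and the memory-bank bookkeeping is omitted.

```python
import numpy as np

def rbf_mmd(source_like_feats, target_specific_feats, sigma=1.0):
    """Biased RBF-kernel MMD estimate between two feature batches,
    e.g. source-like samples drawn from a memory bank and a batch of
    target-specific samples (shapes (n, d) and (m, d))."""
    def kernel(a, b):
        sq_dists = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-sq_dists / (2.0 * sigma ** 2))

    k_ss = kernel(source_like_feats, source_like_feats).mean()
    k_tt = kernel(target_specific_feats, target_specific_feats).mean()
    k_st = kernel(source_like_feats, target_specific_feats).mean()
    # Minimizing this quantity pulls the two feature distributions together.
    return k_ss + k_tt - 2.0 * k_st
```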
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- FedDC: Federated Learning with Non-IID Data via Local Drift Decoupling and Correction [48.85303253333453]
Federated learning (FL) allows multiple clients to collectively train a high-performance global model without sharing their private data.
We propose a novel federated learning algorithm with local drift decoupling and correction (FedDC).
Our FedDC only introduces lightweight modifications in the local training phase, in which each client utilizes an auxiliary local drift variable to track the gap between the local and global model parameters.
Experimental results and analysis demonstrate that FedDC yields faster convergence and better performance on various image classification tasks.
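To make the drift-decoupling idea concrete, here is a minimal sketch of a drift-corrected local round in the spirit of the description above; the penalty form, the drift update, and what is uploaded are illustrative assumptions rather than the paper's exact rules.

```python
import numpy as np

def drift_corrected_local_round(global_params, local_drift, local_grad_fn,
                                lr=0.01, local_steps=10, penalty=0.1):
    """Sketch of a drift-decoupled local round: an auxiliary drift variable
    tracks the accumulated gap between the local and global parameters and
    penalizes further drift during local training."""
    theta = global_params.copy()
    for _ in range(local_steps):
        grad = local_grad_fn(theta)
        # Penalize the gap tracked by the drift variable (assumed penalty form).
        correction = penalty * (theta + local_drift - global_params)
        theta -= lr * (grad + correction)
    # Update the drift variable with this round's local-vs-global gap.
    new_drift = local_drift + (theta - global_params)
    # Upload drift-corrected parameters so the server-side average is de-biased.
    return theta + new_drift, new_drift
```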
arXiv Detail & Related papers (2022-03-22T14:06:26Z)