Adaptive Gradient Sparsification for Efficient Federated Learning: An
Online Learning Approach
- URL: http://arxiv.org/abs/2001.04756v3
- Date: Fri, 20 Mar 2020 16:34:48 GMT
- Title: Adaptive Gradient Sparsification for Efficient Federated Learning: An
Online Learning Approach
- Authors: Pengchao Han, Shiqiang Wang, Kin K. Leung
- Abstract summary: Federated learning (FL) is an emerging technique for training machine learning models using geographically dispersed data.
Gradient sparsification (GS) can be applied, where instead of the full gradient, only a small subset of important elements of the gradient is communicated.
We propose a novel online learning formulation and algorithm for automatically determining the near-optimal communication and computation trade-off.
- Score: 11.986523531539165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is an emerging technique for training machine
learning models using geographically dispersed data collected by local
entities. It includes local computation and synchronization steps. To reduce
the communication overhead and improve the overall efficiency of FL, gradient
sparsification (GS) can be applied, where instead of the full gradient, only a
small subset of important elements of the gradient is communicated. Existing
work on GS uses a fixed degree of gradient sparsity for i.i.d.-distributed data
within a datacenter. In this paper, we consider adaptive degree of sparsity and
non-i.i.d. local datasets. We first present a fairness-aware GS method which
ensures that different clients provide a similar amount of updates. Then, with
the goal of minimizing the overall training time, we propose a novel online
learning formulation and algorithm for automatically determining the
near-optimal communication and computation trade-off that is controlled by the
degree of gradient sparsity. The online learning algorithm uses an estimated
sign of the derivative of the objective function, which gives a regret bound
that is asymptotically equal to the case where exact derivative is available.
Experiments with real datasets confirm the benefits of our proposed approaches,
showing up to $40\%$ improvement in model accuracy for a finite training time.
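The two mechanisms summarized above, communicating only the most important gradient elements and adapting the degree of sparsity online from an estimated sign of the objective's derivative, can be illustrated with a short sketch. The code below is not the authors' implementation: the top-k magnitude selection, the finite-difference sign estimate, the step size, and the per-round cost model are all illustrative assumptions.

```python
# A minimal, illustrative sketch (NOT the paper's implementation) of:
# (1) top-k gradient sparsification, where a client communicates only the k
#     largest-magnitude gradient elements, and
# (2) a sign-based online adjustment of the sparsity degree k that uses only an
#     estimated sign of the derivative of a per-round cost with respect to k.
import numpy as np


def sparsify_top_k(gradient: np.ndarray, k: int):
    """Return indices and values of the k largest-magnitude gradient elements.

    Only these (index, value) pairs would be communicated; the server treats
    all other elements as zero.
    """
    k = max(1, min(k, gradient.size))
    idx = np.argpartition(np.abs(gradient), -k)[-k:]
    return idx, gradient[idx]


def update_sparsity_degree(k: int, last_delta: int, cost_prev: float,
                           cost_curr: float, step: int = 1,
                           k_min: int = 1, k_max: int = 100_000):
    """One sign-based online step for the sparsity degree k (hedged sketch).

    Estimates sign(d cost / d k) from the change in measured per-round cost
    along the last change of k, then moves k one step in the descent direction.
    Returns the new k and the change that was applied.
    """
    if last_delta == 0:
        delta = step  # no history yet: probe by increasing k
    else:
        est_sign = np.sign((cost_curr - cost_prev) / last_delta)
        delta = -step * int(est_sign)
    k_new = int(np.clip(k + delta, k_min, k_max))
    return k_new, k_new - k


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.normal(size=1000)            # one client's local gradient
    idx, vals = sparsify_top_k(g, k=50)  # communicate 50 of 1000 elements
    print(f"communicated {idx.size}/{g.size} gradient elements")

    # One adaptation step: the measured round cost rose after k was last
    # increased, so the estimated derivative sign is positive and k decreases.
    k_new, delta = update_sparsity_degree(k=50, last_delta=+1,
                                          cost_prev=1.20, cost_curr=1.35)
    print(f"sparsity degree adjusted: 50 -> {k_new} (delta {delta})")
```

In the paper's formulation the control variable is the degree of gradient sparsity and the objective being minimized is the overall training time; the measured per-round cost above stands in for that objective purely for illustration.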
Related papers
- On the Convergence of Continual Federated Learning Using Incrementally Aggregated Gradients [2.2530496464901106]
The holy grail of machine learning is to enable Continual Federated Learning (CFL) to enhance the efficiency, privacy, and scalability of AI systems while learning from streaming data.
We propose a novel replay-memory based federated strategy consisting of edge-based gradient updates on memory and aggregated gradients on the current data.
We empirically show that C-FLAG outperforms several state-of-the-art baselines on both task and class-incremental settings with respect to metrics such as accuracy and forgetting.
arXiv Detail & Related papers (2024-11-12T17:36:20Z)
- Gradient-Congruity Guided Federated Sparse Training [31.793271982853188]
Federated learning (FL) is a distributed machine learning technique that facilitates collaborative model training while preserving data privacy.
FL also faces challenges such as high computational and communication costs on resource-constrained devices.
We propose Gradient-Congruity Guided Federated Sparse Training (FedSGC), a novel method that integrates dynamic sparse training and gradient congruity inspection into the federated learning framework.
arXiv Detail & Related papers (2024-05-02T11:29:48Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
- Online Distributed Learning with Quantized Finite-Time Coordination [0.4910937238451484]
In our setting a set of agents need to cooperatively train a learning model from streaming data.
We propose a distributed algorithm that relies on a quantized, finite-time coordination protocol.
We analyze the performance of the proposed algorithm in terms of the mean distance from the online solution.
arXiv Detail & Related papers (2023-07-13T08:36:15Z)
- FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging [104.41634756395545]
Federated learning (FL) is an emerging learning paradigm to tackle massively distributed data.
We propose FedDA, a novel framework for local adaptive gradient methods.
We show that FedDA-MVR is the first adaptive FL algorithm that achieves this rate.
arXiv Detail & Related papers (2023-02-13T05:10:30Z)
- Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance-reduction technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
- Delving into Effective Gradient Matching for Dataset Condensation [13.75957901381024]
The gradient matching method directly targets the training dynamics by matching the gradient when training on the original and synthetic datasets.
We propose to match the multi-level gradients to involve both intra-class and inter-class gradient information.
An overfitting-aware adaptive learning step strategy is also proposed to trim unnecessary optimization steps for algorithmic efficiency improvement.
arXiv Detail & Related papers (2022-06-21T17:45:35Z)
- sqSGD: Locally Private and Communication Efficient Federated Learning [14.60645909629309]
Federated learning (FL) is a technique that trains machine learning models from decentralized data sources.
We develop a gradient-based learning algorithm called sqSGD that addresses communication efficiency and high-dimensional compatibility.
Experiment results show sqSGD successfully learns large models like LeNet and ResNet with local privacy constraints.
arXiv Detail & Related papers (2022-06-21T17:45:35Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.