FedKD: Communication Efficient Federated Learning via Knowledge
Distillation
- URL: http://arxiv.org/abs/2108.13323v1
- Date: Mon, 30 Aug 2021 15:39:54 GMT
- Title: FedKD: Communication Efficient Federated Learning via Knowledge
Distillation
- Authors: Chuhan Wu, Fangzhao Wu, Ruixuan Liu, Lingjuan Lyu, Yongfeng Huang,
Xing Xie
- Abstract summary: Federated learning is widely used to learn intelligent models from decentralized data.
In federated learning, clients need to communicate their local model updates in each iteration of model learning.
We propose a communication efficient federated learning method based on knowledge distillation.
- Score: 56.886414139084216
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Federated learning is widely used to learn intelligent models from
decentralized data. In federated learning, clients need to communicate their
local model updates in each iteration of model learning. However, model updates
are large if the model contains numerous parameters, and many rounds of
communication are usually needed before the model converges. Thus, the
communication cost in federated learning can be quite heavy. In this paper, we
propose a communication efficient federated learning method based on knowledge
distillation. Instead of directly communicating the large models between
clients and server, we propose an adaptive mutual distillation framework to
reciprocally learn a student and a teacher model on each client, where only the
student model is shared by different clients and updated collaboratively to
reduce the communication cost. Both the teacher and the student on each client
are learned from its local data and from the knowledge distilled from each
other, with their distillation intensities controlled by their prediction
quality. To
further reduce the communication cost, we propose a dynamic gradient
approximation method based on singular value decomposition to approximate the
exchanged gradients with dynamic precision. Extensive experiments on benchmark
datasets in different tasks show that our approach can effectively reduce the
communication cost and achieve competitive results.
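The two components above can be illustrated with short sketches. First, a minimal PyTorch rendering of adaptive mutual distillation, assuming a per-batch quality proxy (the mean probability each model assigns to the true class); the exact weighting in FedKD may differ:

```python
import torch
import torch.nn.functional as F

def adaptive_mutual_distillation_loss(student_logits, teacher_logits, labels, t=2.0):
    """Each model learns from local labels plus the other's softened outputs,
    with the distillation term scaled by the source model's prediction quality.
    Illustrative sketch, not FedKD's released implementation."""
    ce_student = F.cross_entropy(student_logits, labels)
    ce_teacher = F.cross_entropy(teacher_logits, labels)

    # Prediction quality: mean probability assigned to the true class.
    q_student = F.softmax(student_logits, dim=-1).gather(1, labels[:, None]).mean()
    q_teacher = F.softmax(teacher_logits, dim=-1).gather(1, labels[:, None]).mean()

    # Soft targets are detached so each model distills *from* the other.
    kl_s = F.kl_div(F.log_softmax(student_logits / t, dim=-1),
                    F.softmax(teacher_logits.detach() / t, dim=-1),
                    reduction="batchmean") * t * t
    kl_t = F.kl_div(F.log_softmax(teacher_logits / t, dim=-1),
                    F.softmax(student_logits.detach() / t, dim=-1),
                    reduction="batchmean") * t * t

    # Distill harder from the model that currently predicts better.
    student_loss = ce_student + q_teacher.detach() * kl_s
    teacher_loss = ce_teacher + q_student.detach() * kl_t
    return student_loss, teacher_loss
```

Second, a sketch of the SVD-based gradient approximation: only the leading singular factors of a gradient matrix are exchanged, and the receiver reconstructs a low-rank approximation. The fixed energy threshold below stands in for FedKD's dynamic precision, which the paper adjusts over training:

```python
import torch

def svd_approximate_gradient(grad_matrix, energy_threshold=0.95):
    """Keep just enough singular values to retain `energy_threshold` of the
    squared-singular-value energy; communicating the three factors costs
    (m*k + k + k*n) numbers instead of m*n."""
    U, S, Vh = torch.linalg.svd(grad_matrix, full_matrices=False)
    energy = torch.cumsum(S ** 2, dim=0) / (S ** 2).sum()
    k = int((energy < energy_threshold).sum().item()) + 1
    return U[:, :k], S[:k], Vh[:k, :]

# Receiver side: grad_approx = U_k @ torch.diag(S_k) @ Vh_k
```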
Related papers
- Update Selective Parameters: Federated Machine Unlearning Based on Model Explanation [46.86767774669831]
We propose a more effective and efficient federated unlearning scheme based on the concept of model explanation.
We select the most influential channels within an already-trained model for the data that needs to be unlearned (one way to pick such channels is sketched below).
arXiv Detail & Related papers (2024-06-18T11:43:20Z)
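As a rough illustration of the channel-selection idea above, one plausible influence proxy is the accumulated gradient magnitude on the forget set; the function and scoring rule here are assumptions, not the paper's explanation-based method:

```python
import torch

def most_influential_channels(model, layer, forget_loader, loss_fn, k=8):
    """Rank the output channels of `layer` by accumulated gradient norm on
    the data to be unlearned; only these channels would then be updated."""
    layer.weight.grad = None
    for x, y in forget_loader:
        loss_fn(model(x), y).backward()  # gradients accumulate across batches
    saliency = layer.weight.grad.flatten(1).norm(dim=1)  # one score per channel
    return torch.topk(saliency, k).indices
```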
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specific auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
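For context on the FedLALR summary above, a generic client-side AMSGrad step (without bias correction) is sketched below; FedLALR's actual contribution, the client-specific learning-rate schedule itself, is not reproduced here:

```python
import torch

def amsgrad_step(param, grad, state, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad step kept locally per client.
    Initialize: state = {k: torch.zeros_like(param) for k in ("m", "v", "v_hat")}"""
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    state["v_hat"] = torch.maximum(state["v_hat"], state["v"])  # monotone denominator
    return param - lr * state["m"] / (state["v_hat"].sqrt() + eps), state
```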
- SalientGrads: Sparse Models for Communication Efficient and Data Aware Distributed Federated Training [1.0413504599164103]
Federated learning (FL) enables training a model on decentralized data in client sites while preserving privacy by not collecting the data.
One of the significant challenges of FL is the limited computation and low communication bandwidth of resource-limited edge client nodes.
We propose Salient Grads, which simplifies sparse training by choosing a data-aware subnetwork before training.
arXiv Detail & Related papers (2023-04-15T06:46:37Z)
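One way to realize the data-aware subnetwork selection described above is a SNIP-style saliency score computed on local data before training; the scoring rule below is an assumption, not necessarily the paper's exact criterion:

```python
import torch

def data_aware_masks(model, data_loader, loss_fn, sparsity=0.9):
    """Keep the top (1 - sparsity) fraction of weights by |w * dL/dw|,
    scored on local data; the masks fix a sparse subnetwork to train
    and communicate."""
    for x, y in data_loader:
        loss_fn(model(x), y).backward()  # accumulate gradients on local data
    params = [p for p in model.parameters() if p.grad is not None]
    scores = torch.cat([(p * p.grad).abs().flatten() for p in params])
    k = max(1, int((1 - sparsity) * scores.numel()))
    threshold = torch.topk(scores, k).values.min()
    return [((p * p.grad).abs() >= threshold).float() for p in params]
```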
- Personalizing Federated Learning with Over-the-Air Computations [84.8089761800994]
Federated edge learning is a promising technology to deploy intelligence at the edge of wireless networks in a privacy-preserving manner.
Under such a setting, multiple clients collaboratively train a global generic model under the coordination of an edge server.
This paper presents a distributed training paradigm that employs analog over-the-air computation to address the communication bottleneck.
arXiv Detail & Related papers (2023-02-24T08:41:19Z)
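The over-the-air idea above can be simulated in a few lines: simultaneously transmitted analog updates superpose on the wireless channel, so the server receives their (noisy) sum in one shot rather than one upload per client. This is a toy model with additive Gaussian receiver noise:

```python
import torch

def over_the_air_aggregate(client_updates, noise_std=0.01):
    """Analog superposition: the channel itself computes the sum of the
    transmitted updates; the server sees that sum plus receiver noise."""
    superposed = torch.stack(client_updates).sum(dim=0)
    received = superposed + noise_std * torch.randn_like(superposed)
    return received / len(client_updates)  # noisy average of client updates
```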
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and backpropagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
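One plausible reading of "online knowledge distillation using a contrastive loss" is an InfoNCE-style objective over shared representations: a client's embedding of each sample is pulled toward the peer's embedding of the same sample and pushed away from the rest of the batch. The formulation below is an assumption, not the paper's exact loss:

```python
import torch
import torch.nn.functional as F

def contrastive_distillation_loss(local_repr, peer_repr, temperature=0.1):
    """InfoNCE over a batch: positives are matching rows of the two
    representation matrices, negatives are all other rows."""
    z1 = F.normalize(local_repr, dim=-1)
    z2 = F.normalize(peer_repr, dim=-1)
    logits = z1 @ z2.t() / temperature                    # (B, B) similarities
    targets = torch.arange(z1.size(0), device=z1.device)  # diagonal positives
    return F.cross_entropy(logits, targets)
```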
- Meta Knowledge Condensation for Federated Learning [65.20774786251683]
Existing federated learning paradigms usually exchange distributed models extensively at a central solver to achieve a more powerful model.
This incurs a severe communication burden between a server and multiple clients, especially when data distributions are heterogeneous.
Unlike existing paradigms, we introduce an alternative perspective that significantly decreases the communication cost in federated learning.
arXiv Detail & Related papers (2022-09-29T15:07:37Z)
- Federated Learning of Neural ODE Models with Different Iteration Counts [0.9444784653236158]
Federated learning is a distributed machine learning approach in which clients train models locally with their own data and upload them to a server, so that trained results are shared without uploading raw data.
In this paper, we utilize Neural ODE based models for federated learning.
We show that our approach can reduce communication size by up to 92.4% compared with a baseline ResNet model using CIFAR-10 dataset.
arXiv Detail & Related papers (2022-08-19T17:57:32Z)
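The communication saving reported above comes from weight tying: a Neural-ODE-style block reuses one set of weights across all integration steps, so upload size is independent of the iteration count. A minimal Euler-discretized sketch:

```python
import torch
import torch.nn as nn

class ODEStyleBlock(nn.Module):
    """x <- x + (1/N) * f(x), repeated N times with one shared f.
    Clients can run different step counts N while uploading the same
    (small) set of weights for f."""
    def __init__(self, dim, steps):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.steps = steps

    def forward(self, x):
        for _ in range(self.steps):
            x = x + self.f(x) / self.steps  # Euler step of dx/dt = f(x)
        return x
```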
- FedLite: A Scalable Approach for Federated Learning on Resource-constrained Clients [41.623518032533035]
In split learning, only a small part of the model is stored and trained on clients, while the remaining large part stays at the server; training then requires exchanging cut-layer activations and gradients.
This paper addresses that cost by compressing the additional communication with a novel clustering scheme accompanied by a gradient correction method.
arXiv Detail & Related papers (2022-01-28T00:09:53Z)
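A simplified stand-in for the clustering scheme mentioned above: compress the batch of cut-layer activations with k-means, transmitting only a small codebook plus one index per row; the paper's gradient correction step is omitted here:

```python
import torch

def cluster_compress(activations, num_clusters=16, iters=10):
    """k-means over activation rows (assumes batch size >= num_clusters).
    Send `centroids` and `assign`; decompress as centroids[assign]."""
    idx = torch.randperm(activations.size(0))[:num_clusters]
    centroids = activations[idx].clone()
    for _ in range(iters):
        assign = torch.cdist(activations, centroids).argmin(dim=1)
        for c in range(num_clusters):
            members = activations[assign == c]
            if members.numel() > 0:
                centroids[c] = members.mean(dim=0)
    return centroids, assign
```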
- CosSGD: Nonlinear Quantization for Communication-efficient Federated Learning [62.65937719264881]
Federated learning facilitates learning across clients without transferring local data on these clients to a central server.
We propose a nonlinear quantization for compressed gradient descent, which can be easily utilized in federated learning.
Our system significantly reduces the communication cost by up to three orders of magnitude, while maintaining convergence and accuracy of the training process.
arXiv Detail & Related papers (2020-12-15T12:20:28Z)
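To make "nonlinear quantization" concrete, here is a generic logarithmic quantizer: codes are spaced on a log grid so resolution concentrates where small gradient magnitudes cluster. This is an illustrative scheme, not CosSGD's exact quantizer:

```python
import torch

def log_quantize(grad, bits=4):
    """Quantize |grad| on a logarithmic grid; transmit the integer codes
    plus (lo, hi) and the signs, then reconstruct on the receiver."""
    sign = grad.sign()
    log_mag = grad.abs().clamp(min=1e-12).log2()
    lo, hi = log_mag.min(), log_mag.max()
    scale = (hi - lo).clamp(min=1e-12)
    levels = 2 ** bits - 1
    codes = torch.round((log_mag - lo) / scale * levels)        # ints in [0, levels]
    dequantized = sign * torch.exp2(lo + codes / levels * scale)
    return codes.to(torch.uint8), dequantized
```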
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.