Knowledge Distillation for Federated Learning: a Practical Guide
- URL: http://arxiv.org/abs/2211.04742v1
- Date: Wed, 9 Nov 2022 08:31:23 GMT
- Title: Knowledge Distillation for Federated Learning: a Practical Guide
- Authors: Alessio Mora, Irene Tenison, Paolo Bellavista, Irina Rish
- Abstract summary: Federated Learning (FL) enables the training of Deep Learning models without centrally collecting possibly sensitive raw data.
The most used algorithms for FL are parameter-averaging based schemes (e.g., Federated Averaging) that, however, have well-known limits.
We provide a review of KD-based algorithms tailored for specific FL issues.
- Score: 8.2791533759453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) enables the training of Deep Learning models without
centrally collecting possibly sensitive raw data. This paves the way for
stronger privacy guarantees when building predictive models. The most used
algorithms for FL are parameter-averaging based schemes (e.g., Federated
Averaging) that, however, have well-known limits: (i) Clients must implement
the same model architecture; (ii) Transmitting model weights and model updates
implies high communication cost, which scales up with the number of model
parameters; (iii) In the presence of non-IID data distributions,
parameter-averaging aggregation schemes perform poorly due to client model
drifts. Federated adaptations of regular Knowledge Distillation (KD) can solve
and/or mitigate the weaknesses of parameter-averaging FL algorithms while
possibly introducing other trade-offs. In this article, we provide a review of
KD-based algorithms tailored for specific FL issues.
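As a rough illustration of the two families discussed in the abstract, the sketch below contrasts FedAvg-style parameter averaging with a KD-based alternative in which clients only exchange predictions on a shared public set. The linear "model", shapes, and client sizes are illustrative assumptions, not code from any of the surveyed algorithms.

```python
# Minimal sketch (illustrative, not from the paper): parameter averaging vs.
# logit averaging on a shared public set, as in many KD-based FL schemes.
import numpy as np

rng = np.random.default_rng(0)
n_clients, n_params = 3, 10
public_x = rng.normal(size=(5, n_params))            # shared unlabeled proxy data

client_weights = [rng.normal(size=n_params) for _ in range(n_clients)]
client_sizes = np.array([100, 300, 600])              # local dataset sizes

# (a) FedAvg-style aggregation: weighted average of the raw parameters.
# Requires identical architectures and sends all parameters every round.
fedavg_weights = np.average(client_weights, axis=0, weights=client_sizes)

# (b) Federated distillation: each client shares only its predictions (logits)
# on the public set; the server averages them into soft targets that any
# student architecture can be trained on.
client_logits = [public_x @ w for w in client_weights]
soft_targets = np.mean(client_logits, axis=0)

print(fedavg_weights.shape, soft_targets.shape)
```

Sharing predictions instead of weights is what lets KD-based schemes tolerate heterogeneous client architectures and keeps communication cost independent of the number of model parameters, addressing limits (i) and (ii) above.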
Related papers
- SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction.
SMILE allows for the upscaling of source models into an MoE model without extra data or further training.
We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
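The summary does not spell out the construction, so the following is only a hedged illustration of what a "low-rank expert" derived from pre-trained weights can look like: take the difference between a fine-tuned and a base weight matrix and keep its leading singular directions (a generic recipe, not SMILE's actual procedure).

```python
# Illustrative only: extracting a low-rank "expert" from the difference between
# a fine-tuned weight matrix and its pre-trained base (assumed setup).
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in, rank = 32, 64, 4
w_base = rng.normal(size=(d_out, d_in))
w_finetuned = w_base + 0.1 * rng.normal(size=(d_out, d_in))

delta = w_finetuned - w_base
u, s, vt = np.linalg.svd(delta, full_matrices=False)
expert_a = u[:, :rank] * s[:rank]     # (d_out, rank)
expert_b = vt[:rank, :]               # (rank, d_in)

# The low-rank pair approximates the task-specific update.
rel_err = np.linalg.norm(delta - expert_a @ expert_b) / np.linalg.norm(delta)
print(f"relative approximation error: {rel_err:.3f}")
```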
arXiv Detail & Related papers (2024-08-19T17:32:15Z) - Promoting Data and Model Privacy in Federated Learning through Quantized LoRA [41.81020951061438]
We introduce a method that just needs to distribute a quantized version of the model's parameters during training.
We combine this quantization strategy with LoRA, a popular and parameter-efficient fine-tuning method, to significantly reduce communication costs in federated learning.
The proposed framework, named FedLPP, successfully ensures both data and model privacy in the federated learning context.
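A minimal sketch of the pattern the summary describes, with placeholder shapes and a toy uniform quantizer (assumptions of this example, not FedLPP's actual scheme): the server distributes quantized base weights, and clients train and communicate only small LoRA factors.

```python
# Sketch with assumed shapes and a toy quantizer (not FedLPP's implementation):
# clients receive quantized base weights and only train/send LoRA factors.
import numpy as np

rng = np.random.default_rng(2)
d_out, d_in, rank = 16, 32, 2
w_base = rng.normal(size=(d_out, d_in))

def quantize(w, n_bits=4):
    """Uniform quantization to 2**n_bits levels (placeholder quantizer)."""
    levels = 2 ** n_bits - 1
    lo, hi = w.min(), w.max()
    return np.round((w - lo) / (hi - lo) * levels) / levels * (hi - lo) + lo

w_quant = quantize(w_base)                      # what the server distributes

# Each client trains only its low-rank adapter (A @ B) on local data.
lora_a = 0.01 * rng.normal(size=(d_out, rank))
lora_b = 0.01 * rng.normal(size=(rank, d_in))

def client_forward(x):
    return x @ (w_quant + lora_a @ lora_b).T    # frozen quantized base + adapter

x_local = rng.normal(size=(3, d_in))
print(client_forward(x_local).shape)
print("full matrix:", d_out * d_in, "values; adapter:", rank * (d_out + d_in), "values")
```

The last line shows where the communication savings come from: rank * (d_out + d_in) values per round for the adapter versus d_out * d_in for the full weight matrix.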
arXiv Detail & Related papers (2024-06-16T15:23:07Z)
- FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data.
Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena.
We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
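The summary gives few details of the bi-level formulation, so the following is only a generic MAP-style personalization sketch (a common formulation, not necessarily FedMAP's): each client minimizes its local loss plus a Gaussian-prior penalty that pulls its parameters toward the server's global model.

```python
# Generic MAP-style personalization sketch (illustrative; not FedMAP itself):
# local objective = local loss + prior penalty toward the global parameters.
import numpy as np

rng = np.random.default_rng(3)
n, d = 50, 5
x = rng.normal(size=(n, d))
y = x @ rng.normal(size=d) + 0.1 * rng.normal(size=n)   # one client's local data

w_global = np.zeros(d)          # broadcast by the server
w_local = w_global.copy()
lam, lr = 0.5, 0.05             # prior strength and step size (assumed values)

for _ in range(200):
    grad_loss = x.T @ (x @ w_local - y) / n      # gradient of mean squared loss
    grad_prior = lam * (w_local - w_global)      # Gaussian prior around the global model
    w_local -= lr * (grad_loss + grad_prior)

print(w_local)
```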
arXiv Detail & Related papers (2024-05-29T11:28:06Z)
- Bridging Data Barriers among Participants: Assessing the Potential of Geoenergy through Federated Learning [2.8498944632323755]
This study introduces a novel federated learning (FL) framework based on XGBoost models.
FL models demonstrate superior accuracy and generalization capabilities compared to separate models.
This study opens new avenues for assessing unconventional reservoirs through collaborative and privacy-preserving FL techniques.
arXiv Detail & Related papers (2024-04-29T09:12:31Z)
- Federated Bayesian Deep Learning: The Application of Statistical Aggregation Methods to Bayesian Models [0.9940108090221528]
Aggregation strategies have been developed to pool or fuse the weights and biases of distributed deterministic models.
We show that simple application of the aggregation methods associated with FL schemes for deterministic models is either impossible or results in sub-optimal performance.
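A toy, hedged example of why point-estimate aggregation does not carry over directly to Bayesian models: fusing per-client Gaussian posteriors needs both means and variances, while FedAvg-style averaging only sees a single value per weight. The numbers below are made up for illustration and are not the paper's method.

```python
# Toy illustration (not the paper's method): fusing per-client Gaussian
# posteriors over one weight vs. naive point-estimate averaging.
import numpy as np

# Each client reports a posterior mean and variance for the same weight.
means = np.array([0.8, 1.2, 1.0])
variances = np.array([0.10, 0.40, 0.20])
sizes = np.array([100, 50, 200])

# Deterministic FedAvg only sees point estimates:
fedavg = np.average(means, weights=sizes)

# Precision-weighted (product-of-Gaussians) fusion also uses the variances:
precisions = 1.0 / variances
fused_var = 1.0 / precisions.sum()
fused_mean = fused_var * (precisions * means).sum()

print(fedavg, fused_mean, fused_var)
```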
arXiv Detail & Related papers (2024-03-22T15:02:24Z)
- Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a pruned global part, shared with all devices to learn data representations, and a personalized part that is fine-tuned for each specific device.
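A small sketch of that split with assumed shapes (not the paper's exact algorithm): a pruned global representation block is averaged across devices, while each device keeps and fine-tunes its own personalized head.

```python
# Sketch with assumed shapes (not the paper's exact algorithm): a pruned global
# representation block is averaged; a personalized head stays on each device.
import numpy as np

rng = np.random.default_rng(4)
n_devices, d_in, d_hid, d_out = 3, 20, 8, 2

global_parts = [rng.normal(size=(d_hid, d_in)) for _ in range(n_devices)]
personal_heads = [rng.normal(size=(d_out, d_hid)) for _ in range(n_devices)]

# Shared pruning mask: keep the largest entries of the averaged global part.
avg_global = np.mean(global_parts, axis=0)
threshold = np.quantile(np.abs(avg_global), 0.5)      # prune 50% of the weights
mask = (np.abs(avg_global) >= threshold).astype(float)
pruned_global = avg_global * mask                     # broadcast back to devices

# Personalized heads are never aggregated; each device fine-tunes its own.
def device_forward(i, x):
    return personal_heads[i] @ np.tanh(pruned_global @ x)

print(device_forward(0, rng.normal(size=d_in)).shape)
```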
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
- Deep Equilibrium Models Meet Federated Learning [71.57324258813675]
This study explores the problem of Federated Learning (FL) by utilizing Deep Equilibrium (DEQ) models instead of conventional deep learning networks.
We claim that incorporating DEQ models into the federated learning framework naturally addresses several open problems in FL.
To the best of our knowledge, this study is the first to establish a connection between DEQ models and federated learning.
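For context, a DEQ layer defines its output implicitly as a fixed point z* of z = f(z, x) and finds it by iteration instead of stacking explicit layers. The sketch below shows only that generic behaviour; it is not the paper's federated formulation.

```python
# Minimal DEQ-style layer (generic illustration, not the paper's model): the
# output is a fixed point z* of z = tanh(W z + U x), found by iteration.
import numpy as np

rng = np.random.default_rng(5)
d_z, d_x = 6, 4
w = 0.1 * rng.normal(size=(d_z, d_z))   # scaled down so the map is a contraction
u = rng.normal(size=(d_z, d_x))
x = rng.normal(size=d_x)

z = np.zeros(d_z)
for _ in range(100):
    z_next = np.tanh(w @ z + u @ x)
    if np.linalg.norm(z_next - z) < 1e-8:
        break
    z = z_next

print(z)   # equilibrium representation of x
```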
arXiv Detail & Related papers (2023-05-29T22:51:40Z)
- Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose a novel approach to Personalized Federated Learning (PFL), which utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
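A hedged sketch of the general idea using standard scikit-learn calls (not the authors' code): fit a Gaussian mixture to client inputs, use the component responsibilities for per-client adaptation, and use the per-sample log-likelihood to flag novel samples.

```python
# Sketch of the general idea (standard scikit-learn calls, not the authors'
# code): GMM responsibilities for adaptation, log-likelihood for novelty.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)
client_a = rng.normal(loc=0.0, size=(200, 2))
client_b = rng.normal(loc=4.0, size=(200, 2))
x_train = np.vstack([client_a, client_b])

gmm = GaussianMixture(n_components=2, random_state=0).fit(x_train)

x_new = np.array([[0.1, -0.2],     # looks like client A's distribution
                  [12.0, 12.0]])   # far from both -> candidate novel sample
responsibilities = gmm.predict_proba(x_new)   # soft assignment to components
log_likelihood = gmm.score_samples(x_new)     # low value -> novel/outlier

print(responsibilities.round(2), log_likelihood.round(1))
```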
arXiv Detail & Related papers (2023-05-01T20:04:46Z)
- Online Hyperparameter Optimization for Class-Incremental Learning [99.70569355681174]
Class-incremental learning (CIL) aims to train a classification model while the number of classes increases phase-by-phase.
An inherent challenge of CIL is the stability-plasticity tradeoff, i.e., CIL models should remain stable to retain old knowledge and remain plastic to absorb new knowledge.
We propose an online learning method that can adaptively optimize the tradeoff without knowing the setting a priori.
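The summary does not describe the optimizer itself; one hedged way to read "adaptively optimize the tradeoff online" is a bandit-style choice among candidate stability/plasticity settings, updated from observed per-phase rewards. The scheme and reward function below are illustrative assumptions, not the paper's method.

```python
# Illustrative bandit-style online choice among candidate stability settings
# (a generic EXP3-like scheme, not the paper's actual optimizer).
import numpy as np

rng = np.random.default_rng(7)
candidates = [0.1, 0.5, 0.9]          # hypothetical stability hyperparameters
weights = np.ones(len(candidates))
eta = 0.1

def observed_reward(h):
    # Stand-in for accuracy after one CIL phase with hyperparameter h.
    return 1.0 - (h - 0.6) ** 2 + 0.05 * rng.normal()

for phase in range(50):
    probs = weights / weights.sum()
    i = rng.choice(len(candidates), p=probs)
    r = observed_reward(candidates[i])
    weights[i] *= np.exp(eta * r / probs[i])   # multiplicative-weights update

print(candidates[int(np.argmax(weights))])    # setting favoured online
```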
arXiv Detail & Related papers (2023-01-11T17:58:51Z)
- Do Gradient Inversion Attacks Make Federated Learning Unsafe? [70.0231254112197]
Federated learning (FL) allows the collaborative training of AI models without needing to share raw data.
Recent works on the inversion of deep neural networks from model gradients raised concerns about the security of FL in preventing the leakage of training data.
In this work, we show that the attacks presented in the literature are impractical in real FL use cases, and we provide a new baseline attack.
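As a concrete reminder of why gradients can leak data at all: for a single example passing through a linear layer with a bias, the weight gradient is the outer product of the output error and the input, so the input can be read off exactly by dividing by the bias gradient. This is a textbook observation, not the specific attacks or the baseline studied in the paper.

```python
# Textbook illustration of gradient leakage through a linear layer with bias
# for a single example (not the attacks or the baseline from the paper).
import numpy as np

rng = np.random.default_rng(8)
d_in, d_out = 6, 3
w = rng.normal(size=(d_out, d_in))
b = rng.normal(size=d_out)
x = rng.normal(size=d_in)             # the private training example
y = np.zeros(d_out)
y[1] = 1.0                            # its label

# One forward/backward pass of squared loss on this single example.
err = (w @ x + b) - y                 # dL/d(pred)
grad_w = np.outer(err, x)             # dL/dW = outer(err, x)
grad_b = err                          # dL/db = err

# Anyone observing the gradients recovers x exactly:
x_recovered = grad_w[0] / grad_b[0]
print(np.allclose(x, x_recovered))    # True
```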
arXiv Detail & Related papers (2022-02-14T18:33:12Z)
- Efficient Federated Learning for AIoT Applications Using Knowledge Distillation [2.5892786553124085]
Federated Learning (FL) trains a central model with decentralized data without compromising user privacy.
Traditional FL suffers from model inaccuracy since it trains local models using hard labels of data.
This paper presents a novel Distillation-based Federated Learning architecture that enables efficient and accurate FL for AIoT applications.
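A small sketch of the distinction the summary draws between hard labels and distilled soft targets, for a single sample; the logits, temperature, and losses below are illustrative assumptions, not the paper's architecture or loss weighting.

```python
# Sketch of hard-label vs. soft-label (distillation) targets for one sample
# (illustrative values; not the paper's architecture or loss weights).
import numpy as np

def softmax(z, t=1.0):
    z = z / t
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

teacher_logits = np.array([2.0, 1.0, 0.1])
student_logits = np.array([1.5, 0.5, 0.3])
hard_label = np.array([1.0, 0.0, 0.0])
temperature = 3.0

student_p = softmax(student_logits)
teacher_p = softmax(teacher_logits, t=temperature)        # soft targets
student_pt = softmax(student_logits, t=temperature)

hard_loss = -np.sum(hard_label * np.log(student_p))                     # cross-entropy
kd_loss = np.sum(teacher_p * (np.log(teacher_p) - np.log(student_pt)))  # KL to teacher

print(round(hard_loss, 3), round(kd_loss, 4))
```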
arXiv Detail & Related papers (2021-11-29T06:40:42Z)