Federated Learning on Non-iid Data via Local and Global Distillation
- URL: http://arxiv.org/abs/2306.14443v1
- Date: Mon, 26 Jun 2023 06:14:01 GMT
- Title: Federated Learning on Non-iid Data via Local and Global Distillation
- Authors: Xiaolin Zheng, Senci Ying, Fei Zheng, Jianwei Yin, Longfei Zheng,
Chaochao Chen, Fengqin Dong
- Abstract summary: We propose FedND: federated learning with noise distillation.
On the client side, we propose a self-distillation method to train the local model.
On the server side, we generate noisy samples for each client and use them to distill the other clients.
Experimental results show that FedND achieves the best performance among the compared methods and is more communication-efficient than state-of-the-art approaches.
- Score: 25.397058380098816
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most existing federated learning algorithms are based on the vanilla
FedAvg scheme. However, as data complexity and the number of model parameters
grow, the communication traffic and the number of training rounds required by
such algorithms increase significantly, especially in non-independently and
identically distributed (non-IID) scenarios, where they do not achieve
satisfactory performance. In this work, we propose FedND: federated learning
with noise distillation. The main idea is to use knowledge distillation to
optimize the model training process. On the client side, we propose a
self-distillation method to train the local model. On the server side, we
generate noisy samples for each client and use them to distill the other
clients. Finally, the global model is obtained by aggregating the local models.
Experimental results show that FedND achieves the best performance among the
compared methods and is more communication-efficient than state-of-the-art
methods.
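Below is a minimal PyTorch sketch of the two distillation steps described in the abstract, under stated assumptions: the client-side teacher is taken to be a frozen snapshot of the client's own earlier model, the server draws Gaussian noise as the noisy samples, and the function names (local_self_distill_step, server_noise_distill) are illustrative rather than the authors' implementation. Aggregation of the local models into the global model is standard parameter averaging and is omitted here.

    # Illustrative FedND-style distillation steps (hypothetical names and
    # design choices; see the paper for the actual procedure).
    import torch
    import torch.nn.functional as F

    def local_self_distill_step(model, prev_model, x, y, optimizer,
                                alpha=0.5, temperature=2.0):
        """Client side: cross-entropy plus self-distillation against a frozen
        snapshot of the client's own earlier model (one plausible reading of
        'self-distillation')."""
        model.train()
        optimizer.zero_grad()
        logits = model(x)
        ce = F.cross_entropy(logits, y)
        with torch.no_grad():
            teacher_logits = prev_model(x)
        kd = F.kl_div(F.log_softmax(logits / temperature, dim=1),
                      F.softmax(teacher_logits / temperature, dim=1),
                      reduction="batchmean") * temperature ** 2
        loss = (1 - alpha) * ce + alpha * kd
        loss.backward()
        optimizer.step()
        return loss.item()

    def server_noise_distill(client_models, noise_shape, steps=10,
                             lr=1e-3, temperature=2.0):
        """Server side: for each client, draw noisy samples, treat that
        client's model as the teacher on its own noise, and use the soft
        targets to distill every other client."""
        for teacher_id, teacher in enumerate(client_models):
            noise = torch.randn(noise_shape)  # noisy samples for this client
            with torch.no_grad():
                targets = F.softmax(teacher(noise) / temperature, dim=1)
            for student_id, student in enumerate(client_models):
                if student_id == teacher_id:
                    continue
                opt = torch.optim.SGD(student.parameters(), lr=lr)
                for _ in range(steps):
                    opt.zero_grad()
                    log_probs = F.log_softmax(student(noise) / temperature, dim=1)
                    F.kl_div(log_probs, targets,
                             reduction="batchmean").backward()
                    opt.step()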
Related papers
- An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
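For contrast with the aggregation-free FedAF above, the conventional "aggregate-then-adapt" server step is essentially FedAvg: a data-size-weighted average of the client parameters, which each client then adapts locally in the next round. A generic sketch of that baseline step (not FedAF's method) follows.

    # Generic FedAvg-style server aggregation (the "aggregate-then-adapt"
    # baseline that aggregation-free methods try to avoid).
    import torch

    def fedavg_aggregate(client_states, client_num_samples):
        """Weighted average of client state_dicts; weights are proportional
        to each client's local dataset size."""
        total = float(sum(client_num_samples))
        weights = [n / total for n in client_num_samples]
        global_state = {}
        for key in client_states[0]:
            global_state[key] = sum(w * s[key].float()
                                    for w, s in zip(weights, client_states))
        return global_state  # broadcast to clients, which adapt it locally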
- Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
- FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method.
We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate.
We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
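FedLALR builds on AMSGrad with client-specific adaptive learning rates. For reference, the standard AMSGrad rule that each client could maintain locally is sketched below (bias correction omitted); FedLALR's actual learning-rate scheduling and synchronization scheme are not reproduced here, and the class name is illustrative.

    # Textbook AMSGrad update kept per client, so each client's effective step
    # size adapts to its own local gradient statistics.
    import torch

    class LocalAMSGrad:
        def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8):
            self.params = list(params)
            self.lr, (self.b1, self.b2), self.eps = lr, betas, eps
            self.m = [torch.zeros_like(p) for p in self.params]      # 1st moment
            self.v = [torch.zeros_like(p) for p in self.params]      # 2nd moment
            self.v_hat = [torch.zeros_like(p) for p in self.params]  # running max of v

        @torch.no_grad()
        def step(self):
            for p, m, v, v_hat in zip(self.params, self.m, self.v, self.v_hat):
                if p.grad is None:
                    continue
                g = p.grad
                m.mul_(self.b1).add_(g, alpha=1 - self.b1)
                v.mul_(self.b2).addcmul_(g, g, value=1 - self.b2)
                v_hat.copy_(torch.maximum(v_hat, v))  # keeps v_hat non-decreasing
                p.add_(m / (v_hat.sqrt() + self.eps), alpha=-self.lr)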
- FedSampling: A Better Sampling Strategy for Federated Learning [81.85411484302952]
Federated learning (FL) is an important technique for learning models from decentralized data in a privacy-preserving way.
Existing FL methods usually uniformly sample clients for local model learning in each round.
We propose a novel data-uniform sampling strategy for federated learning (FedSampling).
arXiv Detail & Related papers (2023-06-25T13:38:51Z)
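The goal stated in the FedSampling entry is to sample data, rather than clients, uniformly. One generic way to approximate that goal is to draw clients with probability proportional to their local data sizes, as sketched below; FedSampling's privacy-preserving estimation of those sizes is not shown, and the function name is illustrative.

    # Generic data-proportional client sampling: clients holding more data are
    # drawn more often, so individual samples participate roughly uniformly.
    # Illustrates the data-uniform goal, not FedSampling's actual procedure.
    import random

    def sample_clients_data_uniform(client_sizes, clients_per_round, seed=None):
        rng = random.Random(seed)
        ids = list(client_sizes)
        weights = [client_sizes[i] for i in ids]
        # sampling with replacement keeps the sketch simple
        return rng.choices(ids, weights=weights, k=clients_per_round)

    # Example: a client with 5000 samples is 10x as likely to be drawn per
    # pick as one with 500.
    sizes = {"client_a": 5000, "client_b": 500, "client_c": 2500}
    print(sample_clients_data_uniform(sizes, clients_per_round=2, seed=0))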
- Adaptive Self-Distillation for Minimizing Client Drift in Heterogeneous Federated Learning [9.975023463908496]
Federated Learning (FL) is a machine learning paradigm that enables clients to jointly train a global model by aggregating the locally trained models without sharing any local training data.
We propose a novel regularization technique based on adaptive self-distillation (ASD) for training models on the client side.
Our regularization scheme adaptively adjusts to the client's training data based on the global model entropy and the client's label distribution.
arXiv Detail & Related papers (2023-05-31T07:00:42Z)
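One plausible shape for a client-side regularizer of the kind the ASD entry describes is a KL term that pulls the local model toward the frozen global model, with a weight adapted from the global model's prediction entropy and the client's label distribution. The weighting formula below is only a placeholder to show where those signals would enter; it is not the paper's rule, and both function names are hypothetical.

    # Sketch of an adaptively weighted self-distillation loss. The weighting is
    # a toy placeholder; the ASD paper derives its own rule from global-model
    # entropy and the client's label distribution.
    import torch
    import torch.nn.functional as F

    def adaptive_kd_weight(global_logits, client_label_hist, base=1.0):
        # client_label_hist: counts over the same classes as the model output
        probs = F.softmax(global_logits, dim=1)
        num_classes = probs.shape[1]
        max_ent = torch.log(torch.tensor(float(num_classes)))
        pred_ent = -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).mean()
        label_p = client_label_hist.float() / client_label_hist.sum()
        label_ent = -(label_p * label_p.clamp_min(1e-12).log()).sum()
        skew = 1.0 - label_ent / max_ent  # 0 = balanced labels, 1 = single class
        return base * (pred_ent / max_ent) * (1.0 + skew)

    def asd_client_loss(local_model, global_model, x, y, client_label_hist):
        logits = local_model(x)
        with torch.no_grad():
            g_logits = global_model(x)
        lam = adaptive_kd_weight(g_logits, client_label_hist)
        kd = F.kl_div(F.log_softmax(logits, dim=1),
                      F.softmax(g_logits, dim=1), reduction="batchmean")
        return F.cross_entropy(logits, y) + lam * kd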
- SalientGrads: Sparse Models for Communication Efficient and Data Aware Distributed Federated Training [1.0413504599164103]
Federated learning (FL) enables the training of a model leveraging decentralized data in client sites while preserving privacy by not collecting data.
One of the significant challenges of FL is limited computation and low communication bandwidth in resource limited edge client nodes.
We propose Salient Grads, which simplifies sparse training by choosing a data-aware subnetwork before training.
arXiv Detail & Related papers (2023-04-15T06:46:37Z)
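Choosing a sparse subnetwork before training, as Salient Grads does, is commonly done by scoring parameters with a gradient-based saliency criterion and keeping the top fraction. The SNIP-style score below (|weight x gradient|) is a generic illustration of that idea; the paper's data-aware score, aggregated across client data, is not reproduced, and the function name is illustrative.

    # Generic gradient-saliency masking: score each parameter by |w * grad| on
    # a small batch and keep the top fraction. Illustrative only; SalientGrads
    # computes a data-aware score across clients.
    import torch
    import torch.nn.functional as F

    def saliency_masks(model, x, y, keep_ratio=0.1):
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        params = dict(model.named_parameters())
        scores = {n: (p * p.grad).abs().flatten()
                  for n, p in params.items() if p.grad is not None}
        all_scores = torch.cat(list(scores.values()))
        k = max(1, int(keep_ratio * all_scores.numel()))
        threshold = torch.topk(all_scores, k).values.min()
        return {n: (s >= threshold).view(params[n].shape)
                for n, s in scores.items()}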
- One-shot Federated Learning without Server-side Training [42.59845771101823]
One-shot federated learning is gaining popularity as a way to reduce communication cost between clients and the server.
Most existing one-shot FL methods are based on knowledge distillation; however, a distillation-based approach requires an extra training phase and depends on publicly available datasets or generated pseudo samples.
In this work, we consider a novel and challenging cross-silo setting: performing a single round of parameter aggregation on the local models without server-side training.
arXiv Detail & Related papers (2022-04-26T01:45:37Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- FedKD: Communication Efficient Federated Learning via Knowledge Distillation [56.886414139084216]
Federated learning is widely used to learn intelligent models from decentralized data.
In federated learning, clients need to communicate their local model updates in each iteration of model learning.
We propose a communication efficient federated learning method based on knowledge distillation.
arXiv Detail & Related papers (2021-08-30T15:39:54Z)
- Adaptive Distillation for Decentralized Learning from Heterogeneous Clients [9.261720698142097]
We propose a new decentralized learning method called Decentralized Learning via Adaptive Distillation (DLAD).
The proposed DLAD aggregates the outputs of the client models while adaptively emphasizing those with higher confidence in given distillation samples.
Our extensive experimental evaluation on multiple public datasets demonstrates the effectiveness of the proposed method.
arXiv Detail & Related papers (2020-08-18T14:25:22Z)
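The DLAD entry above describes aggregating the outputs of client models on distillation samples while emphasizing the more confident ones. A minimal sketch of confidence-weighted soft-target aggregation and a distillation step is given below; the confidence measure (max softmax probability), the training details, and the function names are generic choices, not necessarily those of the paper.

    # Confidence-weighted aggregation of client predictions on distillation
    # samples, followed by a distillation step for the aggregated model.
    # Generic illustration of the DLAD idea; details may differ from the paper.
    import torch
    import torch.nn.functional as F

    def aggregate_soft_targets(client_models, x, temperature=1.0):
        probs, confs = [], []
        with torch.no_grad():
            for m in client_models:
                p = F.softmax(m(x) / temperature, dim=1)  # (batch, classes)
                probs.append(p)
                confs.append(p.max(dim=1).values)         # per-sample confidence
        probs = torch.stack(probs)                        # (clients, batch, classes)
        weights = torch.stack(confs)                      # (clients, batch)
        weights = weights / weights.sum(dim=0, keepdim=True)
        return (weights.unsqueeze(-1) * probs).sum(dim=0)  # (batch, classes)

    def distill_step(student, client_models, x, optimizer):
        targets = aggregate_soft_targets(client_models, x)
        optimizer.zero_grad()
        loss = F.kl_div(F.log_softmax(student(x), dim=1), targets,
                        reduction="batchmean")
        loss.backward()
        optimizer.step()
        return loss.item()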
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.