Communication-Efficient Federated Distillation with Active Data Sampling
- URL: http://arxiv.org/abs/2203.06900v1
- Date: Mon, 14 Mar 2022 07:50:55 GMT
- Title: Communication-Efficient Federated Distillation with Active Data Sampling
- Authors: Lumin Liu, Jun Zhang, S. H. Song, Khaled B. Letaief
- Abstract summary: Federated learning (FL) is a promising paradigm to enable privacy-preserving deep learning from distributed data.
Federated Distillation (FD) is a recently proposed alternative to enable communication-efficient and robust FL.
This paper presents a generic meta-algorithm for FD and investigates the influence of key parameters through empirical experiments.
We propose a communication-efficient FD algorithm with active data sampling to improve the model performance and reduce the communication overhead.
- Score: 6.516631577963641
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) is a promising paradigm to enable privacy-preserving
deep learning from distributed data. Most previous works are based on federated
averaging (FedAvg), which, however, faces several critical issues, including a
high communication overhead and the difficulty in dealing with heterogeneous
model architectures. Federated Distillation (FD) is a recently proposed
alternative that enables communication-efficient and robust FL: it achieves an
orders-of-magnitude reduction in communication overhead compared with FedAvg
and can flexibly handle heterogeneous models at the clients. However,
so far there is no unified algorithmic framework or theoretical analysis for
FD-based methods. In this paper, we first present a generic meta-algorithm for
FD and investigate the influence of key parameters through empirical
experiments. Then, we verify the empirical observations theoretically. Based on
the empirical results and theory, we propose a communication-efficient FD
algorithm with active data sampling to improve the model performance and reduce
the communication overhead. Empirical simulations on benchmark datasets
demonstrate that our proposed algorithm significantly reduces the
communication overhead while maintaining satisfactory performance.
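A minimal sketch of the soft-label exchange described above, with an entropy-based active sampling rule on a shared unlabeled public dataset. The sampling criterion, temperature, and budget are illustrative assumptions, not details taken from the abstract:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert logits to soft labels at a given distillation temperature."""
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def entropy(probs, eps=1e-12):
    """Per-sample predictive entropy, used here as an illustrative sampling score."""
    return -(probs * np.log(probs + eps)).sum(axis=1)

def fd_round(client_models, public_x, budget, temperature=2.0):
    """One communication round: clients upload soft labels only for an
    actively selected subset of the public data; the server averages them."""
    # Illustrative server-side scoring: keep the `budget` most uncertain
    # public samples according to the ensemble's predictive entropy.
    probe = np.mean([softmax(f(public_x), temperature) for f in client_models], axis=0)
    selected = np.argsort(entropy(probe))[-budget:]

    # Clients compute and upload soft labels only on the selected subset,
    # which is where the communication saving comes from.
    uploads = [softmax(f(public_x[selected]), temperature) for f in client_models]
    aggregated = np.mean(uploads, axis=0)   # ensemble soft labels
    return selected, aggregated             # broadcast back for local distillation

# Toy usage with random linear "models" standing in for client networks.
rng = np.random.default_rng(0)
public_x = rng.normal(size=(1000, 16))
clients = [lambda x, W=rng.normal(size=(16, 10)): x @ W for _ in range(5)]
idx, soft_labels = fd_round(clients, public_x, budget=100)
print(idx.shape, soft_labels.shape)  # (100,) (100, 10)
```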
Related papers
- Boosting the Performance of Decentralized Federated Learning via Catalyst Acceleration [66.43954501171292]
We introduce Catalyst Acceleration and propose an accelerated Decentralized Federated Learning algorithm called DFedCata.
DFedCata consists of two main components: the Moreau envelope function, which addresses parameter inconsistencies, and Nesterov's extrapolation step, which accelerates the aggregation phase.
Empirically, we demonstrate the advantages of the proposed algorithm in both convergence speed and generalization performance on CIFAR10/100 with various non-iid data distributions.
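The Nesterov extrapolation step mentioned for DFedCata can be pictured as the following server-side update; this is a generic momentum-extrapolation sketch, not the paper's exact decentralized formulation:

```python
import numpy as np

def nesterov_extrapolate(x_prev, x_curr, beta=0.9):
    """Extrapolate the aggregated model along its most recent movement,
    which is the acceleration idea behind Nesterov-style momentum."""
    return x_curr + beta * (x_curr - x_prev)

# Toy usage: two consecutive aggregated parameter vectors.
x_prev, x_curr = np.zeros(4), np.ones(4) * 0.1
print(nesterov_extrapolate(x_prev, x_curr))  # [0.19 0.19 0.19 0.19]
```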
arXiv Detail & Related papers (2024-10-09T06:17:16Z)
- FedFT: Improving Communication Performance for Federated Learning with Frequency Space Transformation [0.361593752383807]
We introduce FedFT (federated frequency-space transformation), a simple yet effective methodology for communicating model parameters in a Federated Learning setting.
FedFT uses Discrete Cosine Transform (DCT) to represent model parameters in frequency space, enabling efficient compression and reducing communication overhead.
We demonstrate the generalisability of the FedFT methodology on four datasets using comparative studies with three state-of-the-art FL baselines.
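FedFT's frequency-space idea can be illustrated with a DCT of the flattened parameters followed by keeping only the low-frequency coefficients; the truncation ratio and transform layout here are assumptions, not details from the abstract:

```python
import numpy as np
from scipy.fft import dct, idct

def compress(params, keep_ratio=0.25):
    """Type-II DCT of the flattened parameter vector; keep only the first
    `keep_ratio` fraction of coefficients (the low frequencies)."""
    coeffs = dct(params, norm='ortho')
    k = max(1, int(len(coeffs) * keep_ratio))
    return coeffs[:k], len(params)

def decompress(coeffs, n):
    """Zero-pad the truncated spectrum and invert the transform."""
    full = np.zeros(n)
    full[:len(coeffs)] = coeffs
    return idct(full, norm='ortho')

# Toy usage: a smooth "parameter" vector compresses well.
params = np.sin(np.linspace(0, 3, 1024))
coeffs, n = compress(params, keep_ratio=0.1)
recon = decompress(coeffs, n)
print(len(coeffs), float(np.abs(params - recon).max()))
```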
arXiv Detail & Related papers (2024-09-08T23:05:35Z)
- DynamicFL: Federated Learning with Dynamic Communication Resource Allocation [34.97472382870816]
Federated Learning (FL) is a collaborative machine learning framework that allows multiple users to train models utilizing their local data in a distributed manner.
We introduce DynamicFL, a new FL framework that investigates the trade-offs between global model performance and communication costs.
We show that DynamicFL surpasses current state-of-the-art methods with up to a 10% increase in model accuracy.
arXiv Detail & Related papers (2024-09-08T05:53:32Z)
- Over-the-Air Federated Learning and Optimization [52.5188988624998]
We focus on Federated Learning (FL) via over-the-air computation (AirComp).
We describe the convergence of AirComp-based FedAvg (AirFedAvg) algorithms under both convex and non-convex settings.
For different types of local updates that can be transmitted by edge devices (i.e., model, gradient, model difference), we reveal that transmitting in AirFedAvg may cause an aggregation error.
In addition, we consider more practical signal processing schemes to improve the communication efficiency and extend the convergence analysis to different forms of model aggregation error caused by these signal processing schemes.
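Over-the-air computation exploits the superposition property of the wireless channel: simultaneously transmitted analog signals add up at the receiver, so the server observes a noisy sum of the client updates. A minimal noise-only sketch (no fading or power control, which the paper does treat):

```python
import numpy as np

def aircomp_fedavg_step(client_updates, noise_std=0.01, rng=None):
    """Model the over-the-air aggregation of analog client transmissions:
    the channel superposes the signals, and receiver noise perturbs the sum."""
    rng = rng or np.random.default_rng()
    superposed = np.sum(client_updates, axis=0)                  # channel adds signals
    received = superposed + rng.normal(0, noise_std, superposed.shape)
    return received / len(client_updates)                        # noisy average

# Toy usage: the aggregation error shrinks as noise_std -> 0.
rng = np.random.default_rng(1)
updates = [rng.normal(size=8) for _ in range(10)]
exact = np.mean(updates, axis=0)
noisy = aircomp_fedavg_step(updates, noise_std=0.05, rng=rng)
print(float(np.abs(noisy - exact).max()))
```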
arXiv Detail & Related papers (2023-10-16T05:49:28Z)
- FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning [37.96957782129352]
We propose a finetuning framework tailored to heterogeneous multi-modal foundation models, called Federated Dual-Adapter Teacher (FedDAT).
FedDAT addresses data heterogeneity by regularizing the client local updates and applying Mutual Knowledge Distillation (MKD) for efficient knowledge transfer.
To demonstrate its effectiveness, we conduct extensive experiments on four multi-modality FL benchmarks with different types of data heterogeneity.
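Mutual Knowledge Distillation pairs two predictors and lets each distill from the other. A minimal symmetric-KL sketch in NumPy; the dual-adapter architecture itself is not reproduced here, and the temperature is an assumed value:

```python
import numpy as np

def softmax(logits, t=1.0):
    z = (logits / t) - (logits / t).max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def kl(p, q, eps=1e-12):
    """Mean KL(p || q) over a batch of probability rows."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)))

def mutual_distillation_loss(logits_a, logits_b, t=2.0):
    """Symmetric distillation: each model is pulled toward the other's softened
    predictions (gradients would normally be stopped on the teacher side)."""
    pa, pb = softmax(logits_a, t), softmax(logits_b, t)
    return kl(pb, pa) + kl(pa, pb)

# Toy usage with random logits for a batch of 4 samples and 3 classes.
rng = np.random.default_rng(2)
print(mutual_distillation_loss(rng.normal(size=(4, 3)), rng.normal(size=(4, 3))))
```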
arXiv Detail & Related papers (2023-08-21T21:57:01Z)
- Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on a momentum-based variance reduction technique in the cross-silo FL setting.
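Momentum-based variance reduction is commonly written as a STORM-style recursion, where the gradient estimator reuses the previous estimate evaluated with the new minibatch; this is a generic single-worker sketch, not the paper's cross-silo algorithm:

```python
import numpy as np

def storm_estimator(grad_curr, grad_prev_at_curr_batch, d_prev, a=0.1):
    """Momentum-based variance-reduced gradient estimate:
    d_t = grad(x_t; B_t) + (1 - a) * (d_{t-1} - grad(x_{t-1}; B_t))."""
    return grad_curr + (1.0 - a) * (d_prev - grad_prev_at_curr_batch)

# Toy usage on f(x) = 0.5 * ||x||^2, whose stochastic gradient is x + noise.
rng = np.random.default_rng(3)
x_prev, x, d = np.ones(4), np.ones(4) * 0.9, np.ones(4)
for _ in range(5):
    noise = rng.normal(0, 0.01, 4)          # same minibatch noise at both points
    d = storm_estimator(x + noise, x_prev + noise, d, a=0.1)
    x_prev, x = x, x - 0.1 * d              # SGD-style step with the estimator
print(x)
```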
arXiv Detail & Related papers (2022-12-02T05:07:50Z)
- Local Learning Matters: Rethinking Data Heterogeneity in Federated Learning [61.488646649045215]
Federated learning (FL) is a promising strategy for performing privacy-preserving, distributed learning with a network of clients (i.e., edge devices).
arXiv Detail & Related papers (2021-11-28T19:03:39Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Communication-Efficient Federated Learning with Compensated Overlap-FedAvg [22.636184975591004]
Federated learning enables model training over multiple clients' combined data without sharing the datasets within the cluster.
We propose Overlap-FedAvg, a framework that overlaps the model training phase with the model uploading and downloading phases.
Overlap-FedAvg is further developed with a hierarchical computing strategy, a data compensation mechanism, and a Nesterov accelerated gradient (NAG) algorithm.
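The core idea of Overlap-FedAvg is to hide communication latency behind computation: while the current round trains, the previous round's model is still being uploaded in the background. A minimal threading sketch under assumed, made-up per-phase delays:

```python
import threading, time

def train_locally(round_idx):
    time.sleep(0.2)                    # stand-in for local SGD epochs
    return f"model_{round_idx}"

def upload(model):
    time.sleep(0.2)                    # stand-in for sending parameters

def overlap_fedavg(rounds=3):
    """Overlap the upload of round t-1 with the local training of round t."""
    pending = None
    for r in range(rounds):
        if pending is not None:
            pending.start()            # previous model uploads in the background
        model = train_locally(r)       # computation proceeds concurrently
        if pending is not None:
            pending.join()             # ensure the previous upload finished
        pending = threading.Thread(target=upload, args=(model,))
    pending.start(); pending.join()    # flush the last upload

start = time.time()
overlap_fedavg()
print(f"elapsed: {time.time() - start:.2f}s")  # ~0.8s vs ~1.2s if sequential
```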
arXiv Detail & Related papers (2020-12-12T02:50:09Z)
- Communication-Efficient Federated Distillation [14.10627556244287]
Communication constraints are one of the major challenges preventing the wide-spread adoption of Federated Learning systems.
Recently, Federated Distillation (FD), a new algorithmic paradigm for Federated Learning, emerged.
FD methods leverage ensemble distillation techniques and exchange model outputs, presented as soft labels on an unlabeled public data set.
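Because FD exchanges soft labels rather than parameters, the per-round payload scales with the public-set size and the number of classes instead of the model size, and it can be shrunk further by quantizing the uploaded probabilities. A hypothetical uniform-quantization sketch; the specific coding scheme in the paper may differ:

```python
import numpy as np

def quantize_soft_labels(probs, bits=4):
    """Uniformly quantize probabilities in [0, 1] to `bits` bits per entry."""
    levels = 2 ** bits - 1
    return np.round(probs * levels).astype(np.uint8), levels

def dequantize(codes, levels):
    p = codes.astype(np.float64) / levels
    return p / p.sum(axis=1, keepdims=True)   # renormalize rows to sum to 1

# Toy usage: 4 bits per value instead of 32 is an 8x reduction (bit packing not shown).
rng = np.random.default_rng(4)
probs = rng.dirichlet(np.ones(10), size=100)
codes, levels = quantize_soft_labels(probs, bits=4)
print(float(np.abs(dequantize(codes, levels) - probs).max()))
```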
arXiv Detail & Related papers (2020-12-01T16:57:25Z)
- FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as the Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z)
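The "computation then aggregation" (CTA) pattern referenced in the FedPD entry, and underlying FedAvg, looks like the following; the local objective here is a toy quadratic chosen purely for illustration:

```python
import numpy as np

def local_sgd(w, data, lr=0.1, steps=5):
    """Computation phase: several local steps on f_i(w) = 0.5 * ||w - data||^2."""
    for _ in range(steps):
        w = w - lr * (w - data)
    return w

def fedavg_round(w_global, client_data):
    """Aggregation phase: average the locally updated models."""
    locals_ = [local_sgd(w_global.copy(), d) for d in client_data]
    return np.mean(locals_, axis=0)

# Toy usage: with identical local objectives' minima averaged, FedAvg converges
# to the mean of the client optima for this quadratic example.
rng = np.random.default_rng(5)
clients = [rng.normal(size=4) for _ in range(8)]
w = np.zeros(4)
for _ in range(20):
    w = fedavg_round(w, clients)
print(float(np.abs(w - np.mean(clients, axis=0)).max()))
```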