FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion
- URL: http://arxiv.org/abs/2410.20380v1
- Date: Sun, 27 Oct 2024 09:07:10 GMT
- Title: FuseFL: One-Shot Federated Learning through the Lens of Causality with Progressive Model Fusion
- Authors: Zhenheng Tang, Yonggang Zhang, Peijie Dong, Yiu-ming Cheung, Amelie Chi Zhou, Bo Han, Xiaowen Chu
- Abstract summary: One-shot Federated Learning (OFL) significantly reduces communication costs in FL by aggregating trained models only once.
However, the performance of advanced OFL methods is far behind the normal FL.
We propose a novel learning approach to endow OFL with superb performance and low communication and storage costs, termed as FuseFL.
- Score: 48.90879664138855
- Abstract: One-shot Federated Learning (OFL) significantly reduces communication costs in FL by aggregating trained models only once. However, the performance of advanced OFL methods is far behind the normal FL. In this work, we provide a causal view to find that this performance drop of OFL methods comes from the isolation problem, which means that local isolatedly trained models in OFL may easily fit to spurious correlations due to the data heterogeneity. From the causal perspective, we observe that the spurious fitting can be alleviated by augmenting intermediate features from other clients. Built upon our observation, we propose a novel learning approach to endow OFL with superb performance and low communication and storage costs, termed as FuseFL. Specifically, FuseFL decomposes neural networks into several blocks, and progressively trains and fuses each block following a bottom-up manner for feature augmentation, introducing no additional communication costs. Comprehensive experiments demonstrate that FuseFL outperforms existing OFL and ensemble FL by a significant margin. We conduct comprehensive experiments to show that FuseFL supports high scalability of clients, heterogeneous model training, and low memory costs. Our work is the first attempt using causality to analyze and alleviate data heterogeneity of OFL.
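The block-wise, bottom-up fusion described in the abstract can be illustrated with a minimal sketch. It assumes that fusing a block means concatenating each client's weights along the output axis, so the fused block emits every client's features side by side; that is one plausible reading of "feature augmentation", not a confirmed detail of FuseFL. The names `local_train`, `fuse_block`, and `progressive_fusion` are hypothetical, and `local_train` is a placeholder that returns random weights rather than training on client data.

```python
import numpy as np


def local_train(in_dim, out_dim, rng):
    """Placeholder for one client's local training of a single block.

    Assumption: a real client would train on its own data with the
    already-fused lower blocks frozen. Here we only return random
    weights of the right shape, (in_dim, out_dim).
    """
    return rng.normal(size=(in_dim, out_dim))


def fuse_block(client_weights):
    """Fuse one block by concatenating client weights along the output axis."""
    return np.concatenate(client_weights, axis=1)


def block_forward(x, w):
    """One block: a linear layer followed by ReLU."""
    return np.maximum(x @ w, 0.0)


def progressive_fusion(num_clients, in_dim, widths, seed=0):
    """Build fused blocks bottom-up, one block per entry in `widths`."""
    rng = np.random.default_rng(seed)
    fused = []
    cur_in = in_dim
    for width in widths:
        # Every client produces its own version of the current block,
        # taking the fused output of the block below as input.
        client_ws = [local_train(cur_in, width, rng) for _ in range(num_clients)]
        fused.append(fuse_block(client_ws))
        # The next block sees the concatenated features of all clients.
        cur_in = width * num_clients
    return fused


def forward(x, fused):
    """Run an input batch through all fused blocks in order."""
    for w in fused:
        x = block_forward(x, w)
    return x


if __name__ == "__main__":
    blocks = progressive_fusion(num_clients=3, in_dim=8, widths=[4, 2])
    batch = np.ones((5, 8))
    print([w.shape for w in blocks], forward(batch, blocks).shape)
```

In this sketch the fused block's width grows with the number of clients, so each client's per-block width would need to shrink to keep the fused model at a fixed size; how the paper handles this is not stated in the abstract.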
Related papers
- Revisiting Ensembling in One-Shot Federated Learning [9.02411690527967]
One-shot federated learning (OFL) trades the iterative exchange of models between clients and the server for a single round of communication.
We introduce FENS, a novel federated ensembling scheme that approaches the accuracy of FL with the communication efficiency of OFL.
FENS achieves up to 26.9% higher accuracy than state-of-the-art (SOTA) OFL, while being only 3.1% lower than FL.
arXiv Detail & Related papers (2024-11-11T17:58:28Z)
- OledFL: Unleashing the Potential of Decentralized Federated Learning via Opposite Lookahead Enhancement [21.440625995788974]
Decentralized Federated Learning (DFL) surpasses Centralized Federated Learning (CFL) in terms of faster training, privacy preservation, and light communication.
However, DFL still exhibits significant disparities with CFL in terms of generalization ability.
arXiv Detail & Related papers (2024-10-09T02:16:14Z)
- R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML).
A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming.
This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding.
A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z)
- SpaFL: Communication-Efficient Federated Learning with Sparse Models and Low computational Overhead [75.87007729801304]
SpaFL, a communication-efficient FL framework, is proposed to optimize sparse model structures with low computational overhead.
Experiments show that SpaFL improves accuracy while requiring much less communication and computing resources compared to sparse baselines.
arXiv Detail & Related papers (2024-06-01T13:10:35Z)
- Sharp Bounds for Sequential Federated Learning on Heterogeneous Data [5.872735527071425]
There are two paradigms in Federated Learning (FL): parallel FL (PFL) and sequential FL (SFL).
In contrast to that of PFL, the convergence theory of SFL on heterogeneous data is still lacking.
We derive the upper bounds for strongly convex, general convex and non-convex objective functions.
We compare the upper bounds of SFL with those of PFL on heterogeneous data.
arXiv Detail & Related papers (2024-05-02T09:58:49Z)
- Have Your Cake and Eat It Too: Toward Efficient and Accurate Split Federated Learning [25.47111107054497]
Split Federated Learning (SFL) is promising in AIoT systems.
SFL suffers from the challenges of low inference accuracy and low efficiency.
This paper presents a novel SFL approach, named Sliding Split Federated Learning (S$^2$FL).
arXiv Detail & Related papers (2023-11-22T05:09:50Z)
- Convergence Analysis of Sequential Federated Learning on Heterogeneous Data [5.872735527071425]
There are two categories of methods in Federated Learning (FL) for joint training across multiple clients: i) parallel FL (PFL), where clients train models in a parallel manner; and ii) sequential FL (SFL), where clients train in a sequential manner.
The convergence theory of SFL on heterogeneous data is still lacking; in this paper, we establish convergence guarantees for SFL on heterogeneous data.
Experimental results validate the counterintuitive analysis result that SFL outperforms PFL on extremely heterogeneous data in cross-device settings.
arXiv Detail & Related papers (2023-11-06T14:48:51Z)
- Bayesian Federated Learning: A Survey [54.40136267717288]
Federated learning (FL) demonstrates its advantages in integrating distributed infrastructure, communication, computing and learning in a privacy-preserving manner.
The robustness and capabilities of existing FL methods are challenged by limited and dynamic data and conditions.
Bayesian federated learning (BFL) has emerged as a promising approach to address these issues.
arXiv Detail & Related papers (2023-04-26T03:41:17Z)
- Improving the Model Consistency of Decentralized Federated Learning [68.2795379609854]
Decentralized Federated Learning (DFL) discards the central server, and each client only communicates with its neighbors in a decentralized communication network.
Existing DFL suffers from inconsistency among local clients, which results in inferior performance compared to centralized FL (CFL).
We propose DFedSAM and DFedSAM-MGS; in the analysis, $1-\lambda$ is the spectral gap of the gossip matrix and $Q$ is the number of gossip steps in MGS.
arXiv Detail & Related papers (2023-02-08T14:37:34Z)
- Achieving Personalized Federated Learning with Sparse Local Models [75.76854544460981]
Federated learning (FL) is vulnerable to heterogeneously distributed data.
To counter this issue, personalized FL (PFL) was proposed to produce dedicated local models for each individual user.
Existing PFL solutions either demonstrate unsatisfactory generalization towards different model architectures or cost enormous extra computation and memory.
We propose FedSpa, a novel PFL scheme that employs personalized sparse masks to customize sparse local models on the edge.
arXiv Detail & Related papers (2022-01-27T08:43:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.