Data Heterogeneity and Forgotten Labels in Split Federated Learning
- URL: http://arxiv.org/abs/2511.09736v1
- Date: Fri, 14 Nov 2025 01:07:08 GMT
- Title: Data Heterogeneity and Forgotten Labels in Split Federated Learning
- Authors: Joana Tirana, Dimitra Tsigkari, David Solans Noguero, Nicolas Kourtellis,
- Abstract summary: We study the phenomenon of catastrophic forgetting (CF) in Split Federated Learning (SFL). We propose Hydra, a novel mitigation method inspired by multi-head neural networks and adapted to the SFL setting.
- Score: 4.776823105565284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Split Federated Learning (SFL), the clients collaboratively train a model with the help of a server by splitting the model into two parts. Part-1 is trained locally at each client and aggregated at the end of each round. Part-2 is trained at a server that sequentially processes the intermediate activations received from each client. We study the phenomenon of catastrophic forgetting (CF) in SFL in the presence of data heterogeneity. In detail, due to the nature of SFL, local updates of part-1 may drift away from global optima, while part-2 is sensitive to the processing sequence, similar to forgetting in continual learning (CL). Specifically, we observe that the trained model performs better on classes (labels) seen at the end of the sequence. We investigate this phenomenon with emphasis on key aspects of SFL, such as the processing order at the server and the cut layer. Based on our findings, we propose Hydra, a novel mitigation method inspired by multi-head neural networks and adapted to the SFL setting. Extensive numerical evaluations show that Hydra outperforms baselines and methods from the literature.
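For intuition, the sketch below shows one SFL round as the abstract describes it: each client runs part-1 up to the cut layer, the server trains part-2 on each client's activations in sequence (the order that drives the forgetting effect), and part-1 is averaged across clients at the end of the round. This is a minimal illustration under assumed shapes, optimizers, and a FedAvg-style aggregation rule, not the authors' implementation.

```python
# Minimal single-round SFL sketch (PyTorch); all shapes, learning rates,
# and the FedAvg aggregation rule are illustrative assumptions.
import copy
import torch
import torch.nn as nn

def sfl_round(clients, part2, lr=0.1):
    """clients: list of (part1_model, (x, y)) pairs, processed in order."""
    loss_fn = nn.CrossEntropyLoss()
    opt2 = torch.optim.SGD(part2.parameters(), lr=lr)
    for part1, (x, y) in clients:                    # server processes clients sequentially
        opt1 = torch.optim.SGD(part1.parameters(), lr=lr)
        smashed = part1(x)                           # client forward up to the cut layer
        srv_in = smashed.detach().requires_grad_()   # activations sent to the server
        loss = loss_fn(part2(srv_in), y)             # server forward pass and loss
        opt1.zero_grad(); opt2.zero_grad()
        loss.backward()                              # server backward through part-2
        smashed.backward(srv_in.grad)                # cut-layer gradient returned to client
        opt1.step(); opt2.step()
    # End of round: aggregate part-1 across clients (FedAvg-style).
    avg = copy.deepcopy(clients[0][0].state_dict())
    for k in avg:
        avg[k] = torch.stack([c.state_dict()[k] for c, _ in clients]).mean(0)
    for c, _ in clients:
        c.load_state_dict(avg)
```

Hydra's multi-head idea would modify part-2 (e.g., with separate output heads) to reduce this order sensitivity; its exact construction is given in the paper and is not reproduced here.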
Related papers
- Catastrophic Forgetting Resilient One-Shot Incremental Federated Learning [3.4263731151809593]
This paper presents One-Shot Incremental Federated Learning (OSI-FL), the first FL framework that addresses the dual challenges of communication overhead and catastrophic forgetting. OSI-FL communicates category-specific embeddings, produced by a frozen vision-language model (VLM) at each client, in a single communication round. We augment training with Selective Sample Retention (SSR), which identifies and retains the top-p most informative samples per category and task pair.
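The SSR step, as summarized above, reduces to a per-group top-p selection. A minimal sketch follows; the abstract does not define the informativeness score, so the `score` field here is a hypothetical stand-in (e.g., per-sample loss).

```python
# Hypothetical SSR-style retention: keep the top-p highest-scoring samples
# for every (category, task) pair. The scoring function is an assumption.
import heapq
from collections import defaultdict

def selective_sample_retention(samples, p):
    """samples: iterable of (category, task, sample_id, score) tuples."""
    buckets = defaultdict(list)
    for category, task, sample_id, score in samples:
        buckets[(category, task)].append((score, sample_id))
    return {pair: [sid for _, sid in heapq.nlargest(p, bucket)]
            for pair, bucket in buckets.items()}
```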
arXiv Detail & Related papers (2026-02-19T18:44:23Z)
- CycleSL: Server-Client Cyclical Update Driven Scalable Split Learning [60.59553507555341]
We introduce CycleSL, a novel aggregation-free split learning framework. Inspired by alternating block coordinate descent, CycleSL treats server-side training as an independent higher-level machine learning task. Our empirical findings highlight the effectiveness of CycleSL in enhancing model performance.
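The abstract's "alternating block coordinate descent" framing suggests updating the client-side and server-side blocks in alternation, each with the other frozen. The sketch below shows only that generic pattern; CycleSL's actual scheduling and feature handling are beyond the abstract and not reproduced.

```python
# Generic alternating-block sketch (not CycleSL's exact algorithm):
# train the server block on frozen client features, then the client
# block with the server frozen.
import torch
import torch.nn as nn

def alternating_round(part1, part2, loader, lr=0.1):
    loss_fn = nn.CrossEntropyLoss()
    # Block 1: train server-side part-2 on detached client features.
    opt2 = torch.optim.SGD(part2.parameters(), lr=lr)
    for x, y in loader:
        opt2.zero_grad()
        loss_fn(part2(part1(x).detach()), y).backward()
        opt2.step()
    # Block 2: train client-side part-1 with part-2 frozen.
    for p in part2.parameters():
        p.requires_grad_(False)
    opt1 = torch.optim.SGD(part1.parameters(), lr=lr)
    for x, y in loader:
        opt1.zero_grad()
        loss_fn(part2(part1(x)), y).backward()
        opt1.step()
    for p in part2.parameters():
        p.requires_grad_(True)
```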
arXiv Detail & Related papers (2025-11-23T21:00:21Z)
- Collaborative Split Federated Learning with Parallel Training and Aggregation [5.361319869898578]
Collaborative Split Federated Learning (C-SFL) is a novel scheme that splits the model into three parts. C-SFL enables parallel training and aggregation of the model's parts at the clients and at the server.
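As a structural illustration only (the abstract does not say where each part is placed or how the parallelism is scheduled), a three-way split looks like the following.

```python
# Illustrative three-part split; the placement labels and parallel
# scheduling are assumptions, not C-SFL's specification.
import torch
import torch.nn as nn

part_a = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU())  # e.g., client-side
part_b = nn.Sequential(nn.Linear(256, 128), nn.ReLU())                # e.g., middle segment
part_c = nn.Sequential(nn.Linear(128, 10))                            # e.g., server-side

x = torch.randn(8, 1, 28, 28)
logits = part_c(part_b(part_a(x)))  # one forward pass spans all three parts
```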
arXiv Detail & Related papers (2025-04-22T09:18:57Z)
- The Impact of Cut Layer Selection in Split Federated Learning [6.481423646861632]
Split Federated Learning (SFL) is a distributed machine learning paradigm that combines federated learning and split learning. In SFL, a neural network is partitioned at a cut layer, with the initial layers deployed on clients and the remaining layers on a training server.
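The cut layer is simply an index into the network; a minimal sketch of the partitioning (shapes assumed) is below, and the choice of that index is exactly what this paper studies.

```python
# Minimal cut-layer partitioning of an nn.Sequential model.
import torch.nn as nn

def split_at(model: nn.Sequential, cut: int):
    """Return (client part, server part): layers [0, cut) and [cut, end)."""
    return model[:cut], model[cut:]

net = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                    nn.Linear(256, 128), nn.ReLU(), nn.Linear(128, 10))
client_part, server_part = split_at(net, cut=3)  # a deeper cut means more client compute
```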
arXiv Detail & Related papers (2024-12-20T03:52:54Z)
- PFL-GAN: When Client Heterogeneity Meets Generative Models in Personalized Federated Learning [55.930403371398114]
We propose a novel generative adversarial network (GAN) sharing and aggregation strategy for personalized federated learning (PFL).
PFL-GAN addresses client heterogeneity in different scenarios. More specifically, we first learn the similarity among clients and then develop a weighted collaborative data aggregation.
Rigorous experiments on several well-known datasets demonstrate the effectiveness of PFL-GAN.
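The similarity-weighted aggregation can be sketched generically; the abstract does not specify PFL-GAN's similarity measure, so cosine similarity over hypothetical per-client embeddings stands in for it here.

```python
# Hedged sketch: aggregate updates for client i, weighting peers by
# softmax-normalized cosine similarity (the measure is an assumption).
import numpy as np

def weighted_aggregate(updates, embeddings, i):
    """updates: list of arrays; embeddings: (num_clients, d) array."""
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = e @ e[i]                        # similarity of every client to client i
    w = np.exp(sims) / np.exp(sims).sum()  # softmax weights over clients
    return sum(wj * uj for wj, uj in zip(w, updates))
```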
arXiv Detail & Related papers (2023-08-23T22:38:35Z)
- Accelerating Hybrid Federated Learning Convergence under Partial Participation [14.427308569399957]
Federated Learning (FL) involves a group of clients with decentralized data who collaborate to learn a common model.
In realistic scenarios, the server may be able to collect a small amount of data that approximately mimics the population distribution.
We propose FedCLG, a new algorithm that leverages the two-fold role of the server in hybrid FL.
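The general hybrid-FL pattern the abstract points at is a server that both aggregates client updates and learns from its own small, population-like dataset. The sketch below shows only that pattern; FedCLG's precise correction rule is in the paper, not here.

```python
# Hedged hybrid-FL round: FedAvg-style step from client gradients, then a
# server-side correction step on the server's small dataset (assumed form).
import numpy as np

def hybrid_round(w, client_grads, server_grad_fn, lr=0.1):
    w = w - lr * np.mean(client_grads, axis=0)  # aggregate client gradients
    w = w - lr * server_grad_fn(w)              # correct with server-side data
    return w
```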
arXiv Detail & Related papers (2023-04-10T19:13:14Z)
- Subspace based Federated Unlearning [75.90552823500633]
Federated unlearning aims to remove a specified target client's contribution in federated learning (FL) to satisfy the user's right to be forgotten.
Most existing federated unlearning algorithms require the server to store the history of the parameter updates.
We propose a simple yet effective subspace-based federated unlearning method, dubbed SFU, that lets the global model perform gradient ascent.
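To make "subspace-based gradient ascent" concrete, the sketch below ascends on the target client's loss while projecting the step to be orthogonal to a subspace spanned by representative gradients of the remaining clients; SFU's exact construction of that subspace is in the paper, not the abstract.

```python
# Hedged sketch of subspace-constrained gradient-ascent unlearning.
import numpy as np

def unlearn_step(w, target_grad, other_grads, lr=0.1):
    G = np.stack(other_grads)                       # rows span the kept subspace
    Q, _ = np.linalg.qr(G.T)                        # orthonormal basis of that span
    ascent = target_grad - Q @ (Q.T @ target_grad)  # drop components in the span
    return w + lr * ascent                          # ascend to undo the contribution
```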
arXiv Detail & Related papers (2023-02-24T04:29:44Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and backpropagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
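A contrastive representation-sharing loss can be sketched generically; the paper's exact loss and exchange protocol are not in the abstract, so an InfoNCE-style objective over hypothetical local and peer representations is used as a stand-in.

```python
# Hedged InfoNCE-style stand-in: pull each sample's local representation
# toward the shared/peer representation of the same sample.
import torch
import torch.nn.functional as F

def contrastive_distill_loss(local_reps, peer_reps, temperature=0.1):
    a = F.normalize(local_reps, dim=1)
    b = F.normalize(peer_reps, dim=1)
    logits = a @ b.t() / temperature   # (N, N) pairwise similarities
    targets = torch.arange(a.size(0))  # positives sit on the diagonal
    return F.cross_entropy(logits, targets)
```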
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- Optimizing Server-side Aggregation For Robust Federated Learning via Subspace Training [80.03567604524268]
Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning systems.
We propose SmartFL, a generic approach that optimizes the server-side aggregation process.
We provide theoretical analyses of the convergence and generalization capacity for SmartFL.
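Reading "subspace training" together with server-side aggregation suggests learning the mixing weights of client updates on a small proxy objective rather than fixing them uniformly. The sketch below shows that reading only; SmartFL's actual parameterization is an assumption here.

```python
# Hedged sketch: learn aggregation weights alpha in the subspace spanned
# by client updates, using a proxy gradient oracle (assumed to exist).
import numpy as np

def optimized_aggregate(w0, client_updates, proxy_grad_fn, steps=50, lr=0.05):
    U = np.stack(client_updates)           # (num_clients, dim) update directions
    alpha = np.full(len(U), 1.0 / len(U))  # start from uniform FedAvg weights
    for _ in range(steps):
        g = proxy_grad_fn(w0 + alpha @ U)  # proxy-loss gradient at candidate model
        alpha -= lr * (U @ g)              # chain rule: dL/dalpha_i = U_i . g
        alpha = np.clip(alpha, 0.0, None)
        alpha /= alpha.sum()               # stay a convex combination
    return w0 + alpha @ U
```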
arXiv Detail & Related papers (2022-11-10T13:20:56Z)
- Multi-Edge Server-Assisted Dynamic Federated Learning with an Optimized Floating Aggregation Point [51.47520726446029]
Cooperative edge learning (CE-FL) is a distributed machine learning architecture.
We model the processes involved in CE-FL and analyze its training.
We show the effectiveness of our framework with the data collected from a real-world testbed.
arXiv Detail & Related papers (2022-03-26T00:41:57Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
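A common way to alleviate forgetting in local training is to penalize drift from the global model during the local steps; the sketch below uses a FedProx-style proximal term purely as a stand-in, since FedReg's actual regularizer (described in the paper) is not given in the abstract.

```python
# Hedged stand-in: local training with a proximal penalty toward the
# global model to discourage forgetting (not FedReg's exact method).
import torch
import torch.nn as nn

def local_train(model, global_model, loader, mu=0.1, lr=0.01):
    loss_fn = nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    anchors = [p.detach().clone() for p in global_model.parameters()]
    for x, y in loader:
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        # Proximal term discourages drifting from the global solution.
        for p, a in zip(model.parameters(), anchors):
            loss = loss + (mu / 2) * (p - a).pow(2).sum()
        loss.backward()
        opt.step()
    return model
```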
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Splitfed learning without client-side synchronization: Analyzing client-side split network portion size to overall performance [4.689140226545214]
Federated Learning (FL), Split Learning (SL), and SplitFed Learning (SFL) are three recent developments in distributed machine learning.
This paper studies SFL without client-side model synchronization.
SFL with client-side synchronization provides only 1%-2% better accuracy than multi-head split learning (SFL without synchronization) on the MNIST test set.
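For clarity, "multi-head" here means each client keeps its own, never-synchronized client-side network while sharing the server-side part; a structural sketch under assumed shapes:

```python
# Hedged sketch of multi-head split learning: per-client client-side nets,
# one shared server-side net, and no client-side aggregation step.
import torch
import torch.nn as nn

client_parts = [nn.Sequential(nn.Flatten(), nn.Linear(784, 128), nn.ReLU())
                for _ in range(3)]               # one unsynced part per client
server_part = nn.Sequential(nn.Linear(128, 10))  # shared across all clients

def forward(client_id, x):
    return server_part(client_parts[client_id](x))
```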
arXiv Detail & Related papers (2021-09-19T22:57:23Z)