Memory-adaptive Depth-wise Heterogenous Federated Learning
- URL: http://arxiv.org/abs/2303.04887v2
- Date: Wed, 10 Jan 2024 18:03:01 GMT
- Title: Memory-adaptive Depth-wise Heterogenous Federated Learning
- Authors: Kai Zhang, Yutong Dai, Hongyi Wang, Eric Xing, Xun Chen, Lichao Sun
- Abstract summary: We introduce a memory-adaptive depth-wise learning solution in FL called FeDepth, which adaptively decomposes the full model into blocks according to the memory budgets of each client.
Our method outperforms state-of-the-art approaches, achieving 5% and more than 10% improvements in top-1 accuracy on CIFAR-10 and CIFAR-100, respectively.
- Score: 24.13198329419849
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Federated learning is a promising paradigm that allows multiple clients to
collaboratively train a model without sharing the local data. However, the
presence of heterogeneous devices in federated learning, such as mobile phones
and IoT devices with varying memory capabilities, limits the scale of the model
that can be trained and hence its performance. The mainstream approaches
to address memory limitations focus on width-slimming techniques, where
different clients train subnetworks with reduced widths locally and then the
server aggregates the subnetworks. The global model produced by these methods
suffers performance degradation because of the steps taken to reconcile the
varying subnetwork widths during aggregation. In this
paper, we introduce a memory-adaptive depth-wise learning solution in FL called
FeDepth, which adaptively decomposes the full model into blocks according to
the memory budgets of each client and trains blocks sequentially to obtain a
full inference model. Our method outperforms state-of-the-art approaches,
achieving 5% and more than 10% improvements in top-1 accuracy on CIFAR-10 and
CIFAR-100, respectively. We also demonstrate the effectiveness of depth-wise
fine-tuning on ViT. Our findings highlight the importance of memory-aware
techniques for federated learning with heterogeneous devices and the success of
the depth-wise training strategy in improving the global model's performance.
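The depth-wise idea described in the abstract can be illustrated with a short, hypothetical sketch: the model is split into consecutive blocks whose parameter counts fit a client's memory budget, and the blocks are trained one at a time with the already-trained prefix frozen. The greedy splitting rule and the per-block auxiliary classifier head below are simplifying assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch of depth-wise, memory-budgeted block training.
# The greedy split rule and per-block auxiliary head are illustrative
# assumptions, not the FeDepth implementation.
import torch
import torch.nn as nn

def split_into_blocks(layers, budget_params):
    """Group consecutive layers into blocks whose parameter counts fit the
    client's memory budget (expressed here as a parameter count)."""
    blocks, current, count = [], [], 0
    for layer in layers:
        n = sum(p.numel() for p in layer.parameters())
        if current and count + n > budget_params:
            blocks.append(nn.Sequential(*current))
            current, count = [], 0
        current.append(layer)
        count += n
    blocks.append(nn.Sequential(*current))
    return blocks

def train_blocks_sequentially(blocks, loader, num_classes, epochs=1, lr=1e-2):
    """Train one block at a time; the frozen prefix only produces features,
    so peak training memory stays close to a single block's footprint."""
    for i, block in enumerate(blocks):
        # Infer this block's output width with a dummy forward pass.
        with torch.no_grad():
            x = next(iter(loader))[0]
            for prev in blocks[:i]:
                x = prev(x)
            feat_dim = block(x).flatten(1).shape[1]
        head = nn.Linear(feat_dim, num_classes)  # auxiliary head (assumption)
        opt = torch.optim.SGD(list(block.parameters()) + list(head.parameters()), lr=lr)
        for _ in range(epochs):
            for x, y in loader:
                with torch.no_grad():            # frozen, already-trained prefix
                    for prev in blocks[:i]:
                        x = prev(x)
                logits = head(block(x).flatten(1))
                loss = nn.functional.cross_entropy(logits, y)
                opt.zero_grad()
                loss.backward()
                opt.step()
    return nn.Sequential(*blocks)                # full model for inference
```

For example, `blocks = split_into_blocks(list(model.children()), budget)` followed by `train_blocks_sequentially(blocks, loader, num_classes=10)` returns a full inference model assembled from sequentially trained blocks.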
Related papers
- Embracing Federated Learning: Enabling Weak Client Participation via Partial Model Training [21.89214794178211]
In Federated Learning (FL), clients may have weak devices that cannot train the full model or even hold it in their memory space.
We propose EmbracingFL, a general FL framework that allows all available clients to join the distributed training.
Our empirical study shows that EmbracingFL consistently achieves high accuracy, as if all clients were strong, outperforming state-of-the-art width-reduction methods.
arXiv Detail & Related papers (2024-06-21T13:19:29Z)
- Federated Learning with Flexible Architectures [12.800116749927266]
This paper introduces Federated Learning with Flexible Architectures (FedFA), an FL training algorithm that allows clients to train models of different widths and depths.
FedFA incorporates the layer grafting technique to align clients' local architectures with the largest network architecture in the FL system during model aggregation.
arXiv Detail & Related papers (2024-06-14T09:44:46Z)
- Heterogeneous Federated Learning with Splited Language Model [22.65325348176366]
Federated Split Learning (FSL) is a promising distributed learning paradigm in practice.
In this paper, we harness Pre-trained Image Transformers (PITs) as the initial model, coined FedV, to accelerate the training process and improve model robustness.
We are the first to provide a systematic evaluation of FSL methods with PITs in real-world datasets, different partial device participations, and heterogeneous data splits.
arXiv Detail & Related papers (2024-03-24T07:33:08Z)
- Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters [65.15700861265432]
We present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models.
Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters.
To preserve the zero-shot recognition capability of vision-language models, we introduce a Distribution Discriminative Auto-Selector.
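For reference, a toy Mixture-of-Experts adapter in the spirit of this entry might look as follows; the bottleneck size, soft routing, and residual placement are illustrative assumptions and do not reproduce the paper's design or its Distribution Discriminative Auto-Selector.

```python
# Hypothetical toy Mixture-of-Experts adapter (illustration only).
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck adapter placed alongside a frozen backbone layer."""
    def __init__(self, dim, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return self.up(torch.relu(self.down(x)))

class MoEAdapter(nn.Module):
    """Routes each token's features through a soft mixture of adapter experts,
    so new experts can be added for new tasks while old ones stay frozen."""
    def __init__(self, dim, num_experts=4):
        super().__init__()
        self.experts = nn.ModuleList(Adapter(dim) for _ in range(num_experts))
        self.router = nn.Linear(dim, num_experts)

    def forward(self, x):                                        # x: (B, T, D)
        gates = torch.softmax(self.router(x), dim=-1)            # (B, T, E)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (B, T, D, E)
        mixed = (expert_out * gates.unsqueeze(-2)).sum(-1)       # gated mixture
        return x + mixed                                         # residual connection
```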
arXiv Detail & Related papers (2024-03-18T08:00:23Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- Submodel Partitioning in Hierarchical Federated Learning: Algorithm Design and Convergence Analysis [15.311309249848739]
Hierarchical federated learning (FL) has demonstrated promising scalability advantages over the traditional "star-topology" architecture of federated learning.
In this paper, we propose hierarchical independent submodel training (HIST) over resource-constrained Internet of Things (IoT) devices.
The key idea behind HIST is that, in each round, the global model is partitioned into disjoint submodels that are distributed across different cells (a short sketch of this partitioning follows).
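As a rough illustration of this round-wise partitioning (a hypothetical sketch; the actual HIST partitioning rule and cell assignment may differ), each parameter tensor of the global model can be assigned to exactly one cell's submodel and the updated pieces merged back afterwards:

```python
# Hypothetical sketch of disjoint submodel partitioning per round.
# `global_state` is assumed to be a PyTorch-style state_dict of tensors.
import random

def partition_submodels(global_state, num_cells, round_seed):
    """Assign each parameter tensor of the global model to exactly one cell,
    producing disjoint submodels for this round."""
    rng = random.Random(round_seed)
    names = list(global_state.keys())
    rng.shuffle(names)
    submodels = [{} for _ in range(num_cells)]
    for i, name in enumerate(names):
        submodels[i % num_cells][name] = global_state[name].clone()
    return submodels

def merge_submodels(submodels):
    """Reassemble the global model from the locally updated, disjoint pieces."""
    merged = {}
    for sub in submodels:
        merged.update(sub)
    return merged
```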
arXiv Detail & Related papers (2023-10-27T04:42:59Z)
- FedYolo: Augmenting Federated Learning with Pretrained Transformers [61.56476056444933]
In this work, we investigate pretrained transformers (PTF) to achieve on-device learning goals.
We show that larger scale shrinks the accuracy gaps between alternative approaches and improves robustness.
Finally, it enables clients to solve multiple unrelated tasks simultaneously using a single PTF.
arXiv Detail & Related papers (2023-07-10T21:08:52Z)
- Adaptive Parameterization of Deep Learning Models for Federated Learning [85.82002651944254]
Federated Learning offers a way to train deep neural networks in a distributed fashion.
It incurs a communication overhead as the model parameters or gradients need to be exchanged regularly during training.
In this paper, we propose to utilise parallel Adapters for Federated Learning.
arXiv Detail & Related papers (2023-02-06T17:30:33Z)
- No One Left Behind: Inclusive Federated Learning over Heterogeneous Devices [79.16481453598266]
We propose InclusiveFL, a client-inclusive federated learning method to handle this problem.
The core idea of InclusiveFL is to assign models of different sizes to clients with different computing capabilities.
We also propose an effective method to share the knowledge among multiple local models with different sizes.
arXiv Detail & Related papers (2022-02-16T13:03:27Z)
- Quasi-Global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data [77.88594632644347]
Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks.
In realistic learning scenarios, the presence of heterogeneity across different clients' local datasets poses an optimization challenge.
We propose a novel momentum-based method to mitigate this decentralized training difficulty.
arXiv Detail & Related papers (2021-02-09T11:27:14Z)
- Accelerating Federated Learning over Reliability-Agnostic Clients in Mobile Edge Computing Systems [15.923599062148135]
Federated learning has emerged as a promising privacy-preserving approach to facilitating AI applications.
It remains a big challenge to optimize the efficiency and effectiveness of FL when it is integrated with the MEC architecture.
In this paper, a multi-layer federated learning protocol called HybridFL is designed for the MEC architecture.
arXiv Detail & Related papers (2020-07-28T17:35:39Z)