Towards Efficient Model-Heterogeneity Federated Learning for Large Models
- URL: http://arxiv.org/abs/2411.16796v1
- Date: Mon, 25 Nov 2024 09:58:51 GMT
- Title: Towards Efficient Model-Heterogeneity Federated Learning for Large Models
- Authors: Ruofan Jia, Weiying Xie, Jie Lei, Haonan Qin, Jitao Ma, Leyuan Fang,
- Abstract summary: We introduce HeteroTune, an innovative fine-tuning framework tailored for model-heterogeneity federated learning (MHFL)
In particular, we propose a novel parameter-efficient fine-tuning structure, called FedAdapter, which employs a multi-branch cross-model aggregator.
Benefiting from the lightweight FedAdapter, our approach significantly reduces both the computational and communication overhead.
- Score: 18.008063521900702
- License:
- Abstract: As demand grows for complex tasks and high-performance applications in edge computing, the deployment of large models in federated learning has become increasingly urgent, given their superior representational power and generalization capabilities. However, the resource constraints and heterogeneity among clients present significant challenges to this deployment. To tackle these challenges, we introduce HeteroTune, an innovative fine-tuning framework tailored for model-heterogeneity federated learning (MHFL). In particular, we propose a novel parameter-efficient fine-tuning (PEFT) structure, called FedAdapter, which employs a multi-branch cross-model aggregator to enable efficient knowledge aggregation across diverse models. Benefiting from the lightweight FedAdapter, our approach significantly reduces both the computational and communication overhead. Finally, our approach is simple yet effective, making it applicable to a wide range of large model fine-tuning tasks. Extensive experiments on computer vision (CV) and natural language processing (NLP) tasks demonstrate that our method achieves state-of-the-art results, seamlessly integrating efficiency and performance.
Related papers
- FedMHO: Heterogeneous One-Shot Federated Learning Towards Resource-Constrained Edge Devices [12.08958206272527]
Federated Learning (FL) is increasingly adopted in edge computing scenarios, where a large number of heterogeneous clients operate under constrained or sufficient resources.
One-shot FL has emerged as a promising approach to mitigate communication overhead, and model-heterogeneous FL solves the problem of diverse computing resources across clients.
We propose a novel FL framework named FedMHO, which leverages deep classification models on resource-sufficient clients and lightweight generative models on resource-constrained devices.
arXiv Detail & Related papers (2025-02-12T15:54:56Z) - Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models.
Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z) - Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization [15.842155380912002]
This work proposes a novel Instance-Conditioned Adaptation Model (ICAM) for better large-scale generalization of neural optimization.
In particular, we design a powerful yet lightweight instance-conditioned Routing adaptation module for the NCO model.
We develop an efficient three-stage reinforcement learning-based training scheme that enables the model to learn cross-scale features without any labeled optimal solution.
arXiv Detail & Related papers (2024-05-03T08:00:19Z) - Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z) - When Parameter-efficient Tuning Meets General-purpose Vision-language
Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z) - Retrieval-based Knowledge Transfer: An Effective Approach for Extreme
Large Language Model Compression [64.07696663255155]
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks.
However, the massive size of these models poses huge challenges for their deployment in real-world applications.
We introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT) which effectively transfers the knowledge of LLMs to extremely small-scale models.
arXiv Detail & Related papers (2023-10-24T07:58:20Z) - FedBone: Towards Large-Scale Federated Multi-Task Learning [13.835972363413884]
In real-world applications, visual and natural language tasks typically require large-scale models to extract high-level abstract features.
Existing HFML methods disregard the impact of gradient conflicts on multi-task optimization.
We propose an innovative framework called FedBone, which enables the construction of large-scale models with better generalization.
arXiv Detail & Related papers (2023-06-30T08:19:38Z) - HyperTransformer: Model Generation for Supervised and Semi-Supervised
Few-Shot Learning [14.412066456583917]
We propose a transformer-based model for few-shot learning that generates weights of a convolutional neural network (CNN) directly from support samples.
Our method is particularly effective for small target CNN architectures where learning a fixed universal task-independent embedding is not optimal.
We extend our approach to a semi-supervised regime utilizing unlabeled samples in the support set and further improving few-shot performance.
arXiv Detail & Related papers (2022-01-11T20:15:35Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - Learning High-Dimensional Distributions with Latent Neural Fokker-Planck
Kernels [67.81799703916563]
We introduce new techniques to formulate the problem as solving Fokker-Planck equation in a lower-dimensional latent space.
Our proposed model consists of latent-distribution morphing, a generator and a parameterized Fokker-Planck kernel function.
arXiv Detail & Related papers (2021-05-10T17:42:01Z) - FG-Net: Fast Large-Scale LiDAR Point CloudsUnderstanding Network
Leveraging CorrelatedFeature Mining and Geometric-Aware Modelling [15.059508985699575]
FG-Net is a general deep learning framework for large-scale point clouds understanding without voxelizations.
We propose a deep convolutional neural network leveraging correlated feature mining and deformable convolution based geometric-aware modelling.
Our approaches outperform state-of-the-art approaches in terms of accuracy and efficiency.
arXiv Detail & Related papers (2020-12-17T08:20:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.