HeteroTune: Efficient Federated Learning for Large Heterogeneous Models
- URL: http://arxiv.org/abs/2411.16796v2
- Date: Mon, 25 Aug 2025 16:33:35 GMT
- Title: HeteroTune: Efficient Federated Learning for Large Heterogeneous Models
- Authors: Ruofan Jia, Weiying Xie, Jie Lei, Jitao Ma, Haonan Qin, Leyuan Fang
- Abstract summary: We propose HeteroTune, a novel federated fine-tuning paradigm for large, heterogeneous models operating under limited communication and computation budgets. The core of our method lies in a novel architecture, DeMA, which enables flexible and efficient aggregation of heterogeneous models. We provide both theoretical analysis and empirical evidence showing that HeteroTune achieves state-of-the-art performance and efficiency across diverse tasks and model architectures.
- Score: 35.53420882449293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While large pre-trained models have achieved impressive performance across AI tasks, their deployment in privacy-sensitive and distributed environments remains challenging. Federated learning (FL) offers a viable solution by enabling decentralized fine-tuning without data sharing, but real-world applications face significant obstacles due to heterogeneous client resources in compute and memory. To address this, we propose HeteroTune, a novel federated fine-tuning paradigm for large, heterogeneous models operating under limited communication and computation budgets. The core of our method lies in a novel architecture, DeMA (Dense Mixture of Adapters), which enables flexible and efficient aggregation of heterogeneous models by preserving their full representational capacity while facilitating seamless cross-model knowledge fusion. We further introduce CMGA (Cross-Model Gradient Alignment), a lightweight yet effective mechanism that enhances training stability by harmonizing gradient directions across heterogeneous client models during aggregation, mitigating update conflicts and promoting more consistent convergence in federated settings. We provide both theoretical analysis and empirical evidence showing that HeteroTune achieves state-of-the-art performance and efficiency across diverse tasks and model architectures. For example, on LLaMA models, it reduces communication overhead by 99.5%, cuts peak memory usage by ~50%, and improves performance by 4.61%.
Related papers
- SMoFi: Step-wise Momentum Fusion for Split Federated Learning on Heterogeneous Data [11.41105795202393]
Split Federated Learning uses rich computing resources at a central server to train model partitions. Data heterogeneity across silos presents a major challenge undermining the convergence speed and accuracy of the global model. This paper introduces Step-wise Momentum Fusion (SMoFi), an effective and lightweight framework that counteracts gradient divergence.
arXiv Detail & Related papers (2025-11-13T00:21:05Z)
- Nonparametric Data Attribution for Diffusion Models [57.820618036556084]
Data attribution for generative models seeks to quantify the influence of individual training examples on model outputs. We propose a nonparametric attribution method that operates entirely on data, measuring influence via patch-level similarity between generated and training images.
arXiv Detail & Related papers (2025-10-16T03:37:16Z)
- Efficient Federated Learning with Heterogeneous Data and Adaptive Dropout [62.73150122809138]
Federated Learning (FL) is a promising distributed machine learning approach that enables collaborative training of a global model using multiple edge devices. We propose the FedDHAD FL framework, which comes with two novel methods: Dynamic Heterogeneous model aggregation (FedDH) and Adaptive Dropout (FedAD). The combination of these two methods makes FedDHAD significantly outperform state-of-the-art solutions in terms of accuracy (up to 6.7% higher), efficiency (up to 2.02 times faster), and cost (up to 15.0% smaller).
arXiv Detail & Related papers (2025-07-14T16:19:00Z)
- High-Fidelity Scientific Simulation Surrogates via Adaptive Implicit Neural Representations [35.71656738800783]
Implicit neural representations (INRs) offer a compact and continuous framework for modeling spatially structured data. Recent approaches address this by introducing additional features along rigid geometric structures. We propose a simple yet effective alternative: Feature-Adaptive INR (FA-INR).
arXiv Detail & Related papers (2025-06-07T16:45:17Z)
- Not All Clients Are Equal: Collaborative Model Personalization on Heterogeneous Multi-Modal Clients [59.52341877720199]
We propose FedMosaic, a method that enables knowledge sharing across heterogeneous architectures without huge computational cost. To mimic the real-world task diversity, we propose a multi-modal PFL benchmark spanning 40 distinct tasks with distribution shifts over time. The empirical study shows that FedMosaic outperforms the state-of-the-art PFL methods.
arXiv Detail & Related papers (2025-05-20T09:17:07Z)
- FedADP: Unified Model Aggregation for Federated Learning with Heterogeneous Model Architectures [5.348839333572149]
Traditional Federated Learning (FL) faces significant challenges in terms of efficiency and accuracy, particularly in heterogeneous environments. We propose FedADP, a federated learning framework designed to adapt to client heterogeneity by dynamically adjusting model architectures during aggregation. Our experimental results demonstrate that FedADP significantly outperforms existing methods, such as FlexiFed, achieving an accuracy improvement of up to 23.30%.
arXiv Detail & Related papers (2025-05-10T02:57:07Z)
- FedMHO: Heterogeneous One-Shot Federated Learning Towards Resource-Constrained Edge Devices [12.08958206272527]
Federated Learning (FL) is increasingly adopted in edge computing scenarios, where a large number of heterogeneous clients operate under constrained or sufficient resources.
One-shot FL has emerged as a promising approach to mitigate communication overhead, and model-heterogeneous FL solves the problem of diverse computing resources across clients.
We propose a novel FL framework named FedMHO, which leverages deep classification models on resource-sufficient clients and lightweight generative models on resource-constrained devices.
arXiv Detail & Related papers (2025-02-12T15:54:56Z)
- Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models.
Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z)
- FedDUAL: A Dual-Strategy with Adaptive Loss and Dynamic Aggregation for Mitigating Data Heterogeneity in Federated Learning [12.307490659840845]
Federated Learning (FL) combines locally optimized models from various clients into a unified global model. FL encounters significant challenges such as performance degradation, slower convergence, and reduced robustness of the global model. We introduce an innovative dual-strategy approach designed to effectively resolve these issues.
arXiv Detail & Related papers (2024-12-05T18:42:29Z)
- FedPAE: Peer-Adaptive Ensemble Learning for Asynchronous and Model-Heterogeneous Federated Learning [9.084674176224109]
Federated learning (FL) enables multiple clients with distributed data sources to collaboratively train a shared model without compromising data privacy.
We introduce Federated Peer-Adaptive Ensemble Learning (FedPAE), a fully decentralized pFL algorithm that supports model heterogeneity and asynchronous learning.
Our approach utilizes a peer-to-peer model sharing mechanism and ensemble selection to achieve a more refined balance between local and global information.
arXiv Detail & Related papers (2024-10-17T22:47:19Z)
- Instance-Conditioned Adaptation for Large-scale Generalization of Neural Combinatorial Optimization [15.842155380912002]
This work proposes a novel Instance-Conditioned Adaptation Model (ICAM) for better large-scale generalization of neural optimization.
In particular, we design a powerful yet lightweight instance-conditioned Routing adaptation module for the NCO model.
We develop an efficient three-stage reinforcement learning-based training scheme that enables the model to learn cross-scale features without any labeled optimal solution.
arXiv Detail & Related papers (2024-05-03T08:00:19Z)
- Diffusion-Based Neural Network Weights Generation [80.89706112736353]
D2NWG is a diffusion-based neural network weights generation technique that efficiently produces high-performing weights for transfer learning.
Our method extends generative hyper-representation learning to recast the latent diffusion paradigm for neural network weights generation.
Our approach is scalable to large architectures such as large language models (LLMs), overcoming the limitations of current parameter generation techniques.
arXiv Detail & Related papers (2024-02-28T08:34:23Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression [64.07696663255155]
Large-scale pre-trained language models (LLMs) have demonstrated exceptional performance in various natural language processing (NLP) tasks.
However, the massive size of these models poses huge challenges for their deployment in real-world applications.
We introduce a novel compression paradigm called Retrieval-based Knowledge Transfer (RetriKT) which effectively transfers the knowledge of LLMs to extremely small-scale models.
arXiv Detail & Related papers (2023-10-24T07:58:20Z)
- Every Parameter Matters: Ensuring the Convergence of Federated Learning with Dynamic Heterogeneous Models Reduction [22.567754688492414]
Cross-device Federated Learning (FL) faces significant challenges where low-end clients that could potentially make unique contributions are excluded from training large models due to their resource bottlenecks.
Recent research efforts have focused on model-heterogeneous FL, by extracting reduced-size models from the global model and applying them to local clients accordingly.
This paper presents a unifying framework for heterogeneous FL algorithms with online model extraction and provides a general convergence analysis for the first time.
arXiv Detail & Related papers (2023-10-12T19:07:58Z)
- Towards Instance-adaptive Inference for Federated Learning [80.38701896056828]
Federated learning (FL) is a distributed learning paradigm that enables multiple clients to learn a powerful global model by aggregating local training.
In this paper, we present a novel FL algorithm, i.e., FedIns, to handle intra-client data heterogeneity by enabling instance-adaptive inference in the FL framework.
Our experiments show that our FedIns outperforms state-of-the-art FL algorithms, e.g., a 6.64% improvement against the top-performing method with less than 15% communication cost on Tiny-ImageNet.
arXiv Detail & Related papers (2023-08-11T09:58:47Z)
- FedBone: Towards Large-Scale Federated Multi-Task Learning [13.835972363413884]
In real-world applications, visual and natural language tasks typically require large-scale models to extract high-level abstract features.
Existing HFML methods disregard the impact of gradient conflicts on multi-task optimization.
We propose an innovative framework called FedBone, which enables the construction of large-scale models with better generalization.
arXiv Detail & Related papers (2023-06-30T08:19:38Z)
- Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning [86.59588262014456]
Federated Learning (FL) is an emerging distributed learning paradigm under privacy constraint.
We propose FedFTG, a data-free knowledge distillation method to fine-tune the global model in the server.
Our FedFTG significantly outperforms the state-of-the-art (SOTA) FL algorithms and can serve as a strong plugin for enhancing FedAvg, FedProx, FedDyn, and SCAFFOLD.
arXiv Detail & Related papers (2022-03-17T11:18:17Z)
- HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning [14.412066456583917]
We propose a transformer-based model for few-shot learning that generates weights of a convolutional neural network (CNN) directly from support samples.
Our method is particularly effective for small target CNN architectures where learning a fixed universal task-independent embedding is not optimal.
We extend our approach to a semi-supervised regime utilizing unlabeled samples in the support set and further improving few-shot performance.
arXiv Detail & Related papers (2022-01-11T20:15:35Z)
- FedHM: Efficient Federated Learning for Heterogeneous Models via Low-rank Factorization [16.704006420306353]
A scalable federated learning framework should address heterogeneous clients equipped with different computation and communication capabilities.
This paper proposes FedHM, a novel federated model compression framework that distributes the heterogeneous low-rank models to clients and then aggregates them into a global full-rank model.
Our solution enables the training of heterogeneous local models with varying computational complexities and aggregates a single global model.
arXiv Detail & Related papers (2021-11-29T16:11:09Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Learning High-Dimensional Distributions with Latent Neural Fokker-Planck Kernels [67.81799703916563]
We introduce new techniques to formulate the problem as solving a Fokker-Planck equation in a lower-dimensional latent space.
Our proposed model consists of latent-distribution morphing, a generator and a parameterized Fokker-Planck kernel function.
arXiv Detail & Related papers (2021-05-10T17:42:01Z)
- FG-Net: Fast Large-Scale LiDAR Point Clouds Understanding Network Leveraging Correlated Feature Mining and Geometric-Aware Modelling [15.059508985699575]
FG-Net is a general deep learning framework for large-scale point clouds understanding without voxelizations.
We propose a deep convolutional neural network leveraging correlated feature mining and deformable convolution based geometric-aware modelling.
Our approaches outperform state-of-the-art approaches in terms of accuracy and efficiency.
arXiv Detail & Related papers (2020-12-17T08:20:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.