Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures
- URL: http://arxiv.org/abs/2411.19128v3
- Date: Sat, 17 May 2025 02:54:49 GMT
- Title: Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures
- Authors: Yicheng Zhang, Zhen Qin, Zhaomin Wu, Jian Hou, Shuiguang Deng,
- Abstract summary: Federated Learning (FL) enables collaborative fine-tuning of Large Language Models without accessing raw data.<n>We propose FedAMoLE, a lightweight personalized FL framework that enables data-driven heterogeneous model architectures.<n> Experiments show that FedAMoLE improves client-side performance by an average of 5.14% compared to existing approaches.
- Score: 15.645254436094055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale instruction data is essential for aligning pretrained Large Language Models (LLMs) with human instructions, but may contain sensitive information that hinders its public sharing. Federated Learning (FL) enables collaborative fine-tuning of LLMs without accessing raw data. However, existing approaches to federated LLM fine-tuning usually adopt a uniform model architecture, making it hard to fit highly heterogeneous client-side data in varying domains and formats. To address this, we propose FedAMoLE, a lightweight personalized FL framework that enables data-driven heterogeneous model architectures. This framework features a heterogeneous mixture of LoRA experts module for aggregating architecturally heterogeneous models and a reverse selection-based expert assignment strategy that optimizes model architectures based on data distributions. Experiments across five scenarios show that FedAMoLE improves client-side performance by an average of 5.14% compared to existing approaches while maintaining scalability.
Related papers
- Hypernetworks for Model-Heterogeneous Personalized Federated Learning [13.408669475480824]
We propose a server-side hypernetwork that takes client-specific embedding vectors as input and outputs personalized parameters tailored to each client's heterogeneous model.<n>To promote knowledge sharing and reduce computation, we introduce a multi-head structure within the hypernetwork, allowing clients with similar model sizes to share heads.<n>Our framework does not rely on external datasets and does not require disclosure of client model architectures.
arXiv Detail & Related papers (2025-07-30T02:24:26Z) - Unlocking Personalized Knowledge in Federated Large Language Model: The Power of Mixture of Experts [14.713865726974761]
We propose FLEx, a novel federated learning framework specifically tailored for large language models (LLMs)<n>FLEx efficiently personalizes by pruning the global MoE model to keep only one expert per client, and employs an adaptive gating mechanism to reintegrate personalized experts into the pre-trained MoE layers.<n>These personalized experts are trained with local data and stored locally on each client, while the shared modules are aggregated globally.
arXiv Detail & Related papers (2025-06-01T11:24:43Z) - Not All Clients Are Equal: Personalized Federated Learning on Heterogeneous Multi-Modal Clients [52.14230635007546]
Foundation models have shown remarkable capabilities across diverse multi-modal tasks, but their centralized training raises privacy concerns and induces high transmission costs.<n>For the growing demand for personalizing AI models for different user purposes, personalized federated learning (PFL) has emerged.<n>PFL allows each client to leverage the knowledge of other clients for further adaptation to individual user preferences, again without the need to share data.
arXiv Detail & Related papers (2025-05-20T09:17:07Z) - FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors [50.131271229165165]
Federated Learning (FL) has emerged as a promising framework for distributed machine learning.
Data heterogeneity resulting from differences across user behaviors, preferences, and device characteristics poses a significant challenge for federated learning.
We propose Adaptive Weight Aggregation (FedAWA), a novel method that adaptively adjusts aggregation weights based on client vectors during the learning process.
arXiv Detail & Related papers (2025-03-20T04:49:40Z) - Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications.
Ensuring their alignment with the diverse preferences of individual users has become a critical challenge.
We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z) - Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild [84.57103623507082]
This paper introduces Model-GLUE, a holistic Large Language Models scaling guideline.
We benchmark existing scaling techniques, especially selective merging, and variants of mixture.
We then formulate an optimal strategy for the selection and aggregation of a heterogeneous model zoo.
Our methodology involves the clustering of mergeable models and optimal merging strategy selection, and the integration of clusters.
arXiv Detail & Related papers (2024-10-07T15:55:55Z) - FedMoE: Personalized Federated Learning via Heterogeneous Mixture of Experts [4.412721048192925]
We present FedMoE, the efficient personalized Federated Learning framework to address data heterogeneity.
FedMoE is composed of two fine-tuning stages. In the first stage, FedMoE simplifies the problem by conducting a search based on observed activation patterns.
In the second stage, these submodels are distributed to clients for further training and returned for server aggregating.
arXiv Detail & Related papers (2024-08-21T03:16:12Z) - FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data.
Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena.
We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z) - An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets.
Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round.
We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z) - MAP: Model Aggregation and Personalization in Federated Learning with Incomplete Classes [49.22075916259368]
In some real-world applications, data samples are usually distributed on local devices.
In this paper, we focus on a special kind of Non-I.I.D. scene where clients own incomplete classes.
Our proposed algorithm named MAP could simultaneously achieve the aggregation and personalization goals in FL.
arXiv Detail & Related papers (2024-04-14T12:22:42Z) - CoDream: Exchanging dreams instead of models for federated aggregation
with heterogeneous models [8.85591781936764]
We present a novel framework called CoDream, where clients collaboratively optimize randomly data.
Our key insight is that jointly optimizing this data can effectively capture the properties of the global data distribution.
We empirically validate CoDream on standard FL tasks, demonstrating competitive performance despite not sharing model parameters.
arXiv Detail & Related papers (2024-02-25T03:07:32Z) - FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH(Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm.
It outperforms state-of-the-art FL frameworks under extensive sources of Heterogeneities.
It achieves substantial and consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-13T20:04:39Z) - Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data.
One key challenge in federated learning is to handle non-identically distributed data across the clients.
We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z) - One-Shot Heterogeneous Federated Learning with Local Model-Guided Diffusion Models [40.83058938096914]
FedLMG is a one-shot Federated learning method with Local Model-Guided diffusion models.<n>Clients do not need access to any foundation models but only train and upload their local models.
arXiv Detail & Related papers (2023-11-15T11:11:25Z) - MLLM-DataEngine: An Iterative Refinement Approach for MLLM [62.30753425449056]
We propose a novel closed-loop system that bridges data generation, model training, and evaluation.
Within each loop, the MLLM-DataEngine first analyze the weakness of the model based on the evaluation results.
For targeting, we propose an Adaptive Bad-case Sampling module, which adjusts the ratio of different types of data.
For quality, we resort to GPT-4 to generate high-quality data with each given data type.
arXiv Detail & Related papers (2023-08-25T01:41:04Z) - FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal
Heterogeneous Federated Learning [37.96957782129352]
We propose a finetuning framework tailored to heterogeneous multi-modal foundation models, called Federated Dual-Aadapter Teacher (Fed DAT)
Fed DAT addresses data heterogeneity by regularizing the client local updates and applying Mutual Knowledge Distillation (MKD) for an efficient knowledge transfer.
To demonstrate its effectiveness, we conduct extensive experiments on four multi-modality FL benchmarks with different types of data heterogeneity.
arXiv Detail & Related papers (2023-08-21T21:57:01Z) - Towards Personalized Federated Learning via Heterogeneous Model
Reassembly [84.44268421053043]
pFedHR is a framework that leverages heterogeneous model reassembly to achieve personalized federated learning.
pFedHR dynamically generates diverse personalized models in an automated manner.
arXiv Detail & Related papers (2023-08-16T19:36:01Z) - Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models.
This creates a barrier to fusing knowledge across individual models to yield a better single model.
We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z) - FedDM: Iterative Distribution Matching for Communication-Efficient
Federated Learning [87.08902493524556]
Federated learning(FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z) - Adaptive Expert Models for Personalization in Federated Learning [0.9449650062296824]
Federated Learning (FL) is a promising framework for distributed learning when data is private and sensitive.
We propose a practical and robust approach to personalization in FL that adjusts to heterogeneous and non-IID data.
We show that our approach achieves an accuracy up to 29.78 % and up to 4.38 % better compared to a local model in a pathological non-IID setting.
arXiv Detail & Related papers (2022-06-15T22:05:36Z) - Federated Learning in Non-IID Settings Aided by Differentially Private
Synthetic Data [20.757477553095637]
Federated learning (FL) is a privacy-promoting framework that enables clients to collaboratively train machine learning models.
A major challenge in federated learning arises when the local data is heterogeneous.
We propose FedDPMS, an FL algorithm in which clients deploy variational auto-encoders to augment local datasets with data synthesized using differentially private means of latent data representations.
arXiv Detail & Related papers (2022-06-01T18:00:48Z) - Heterogeneous Ensemble Knowledge Transfer for Training Large Models in
Federated Learning [22.310090483499035]
Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server.
Most existing FL algorithms require models of identical architecture to be deployed across the clients and server.
We propose a novel ensemble knowledge transfer method named Fed-ET in which small models are trained on clients, and used to train a larger model at the server.
arXiv Detail & Related papers (2022-04-27T05:18:32Z) - Federated Mutual Learning [65.46254760557073]
Federated Mutual Leaning (FML) allows clients training a generalized model collaboratively and a personalized model independently.
The experiments show that FML can achieve better performance than alternatives in typical Federated learning setting.
arXiv Detail & Related papers (2020-06-27T09:35:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.