Related papers: Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures

Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures

URL: http://arxiv.org/abs/2411.19128v1
Date: Thu, 28 Nov 2024 13:20:38 GMT
Title: Personalized Federated Fine-Tuning for LLMs via Data-Driven Heterogeneous Model Architectures
Authors: Yicheng Zhang, Zhen Qin, Zhaomin Wu, Shuiguang Deng,
Abstract summary: A large amount of instructional text data is essential to enhance the performance of large language models.<n>FedAMoLE is a lightweight personalized federated fine-tuning framework.
Score: 16.334964586540178
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: A large amount of instructional text data is essential to enhance the performance of pre-trained large language models (LLMs) for downstream tasks. This data can contain sensitive information and therefore cannot be shared in practice, resulting in data silos that limit the effectiveness of LLMs on various tasks. Federated learning (FL) enables collaborative fine-tuning across different clients without sharing their data. Nonetheless, in practice, this instructional text data is highly heterogeneous in both quantity and distribution across clients, necessitating distinct model structures to best accommodate the variations. However, existing federated fine-tuning approaches either enforce the same model structure or rely on predefined ad-hoc architectures unaware of data distribution, resulting in suboptimal performance. To address this challenge, we propose FedAMoLE, a lightweight personalized federated fine-tuning framework that leverages data-driven heterogeneous model architectures. FedAMoLE introduces the Adaptive Mixture of LoRA Experts (AMoLE) module, which facilitates model heterogeneity with minimal communication overhead by allocating varying numbers of LoRA-based domain experts to each client. Furthermore, we develop a reverse selection-based expert assignment (RSEA) strategy, which enables data-driven model architecture adjustment during fine-tuning by allowing domain experts to select clients that best align with their knowledge domains. Extensive experiments across six different scenarios of data heterogeneity demonstrate that FedAMoLE significantly outperforms existing methods for federated LLM fine-tuning, achieving superior accuracy while maintaining good scalability.

Related papers

FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors [50.131271229165165]
Federated Learning (FL) has emerged as a promising framework for distributed machine learning. Data heterogeneity resulting from differences across user behaviors, preferences, and device characteristics poses a significant challenge for federated learning. We propose Adaptive Weight Aggregation (FedAWA), a novel method that adaptively adjusts aggregation weights based on client vectors during the learning process.
arXiv Detail & Related papers (2025-03-20T04:49:40Z)
Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes [50.544186914115045]
Large language models (LLMs) are increasingly embedded in everyday applications. Ensuring their alignment with the diverse preferences of individual users has become a critical challenge. We present a novel framework for few-shot steerable alignment.
arXiv Detail & Related papers (2024-12-18T16:14:59Z)
Model-GLUE: Democratized LLM Scaling for A Large Model Zoo in the Wild [84.57103623507082]
This paper introduces Model-GLUE, a holistic Large Language Models scaling guideline. We benchmark existing scaling techniques, especially selective merging, and variants of mixture. We then formulate an optimal strategy for the selection and aggregation of a heterogeneous model zoo. Our methodology involves the clustering of mergeable models and optimal merging strategy selection, and the integration of clusters.
arXiv Detail & Related papers (2024-10-07T15:55:55Z)
FedMoE: Personalized Federated Learning via Heterogeneous Mixture of Experts [4.412721048192925]
We present FedMoE, the efficient personalized Federated Learning framework to address data heterogeneity. FedMoE is composed of two fine-tuning stages. In the first stage, FedMoE simplifies the problem by conducting a search based on observed activation patterns. In the second stage, these submodels are distributed to clients for further training and returned for server aggregating.
arXiv Detail & Related papers (2024-08-21T03:16:12Z)
FedMAP: Unlocking Potential in Personalized Federated Learning through Bi-Level MAP Optimization [11.040916982022978]
Federated Learning (FL) enables collaborative training of machine learning models on decentralized data. Data across clients often differs significantly due to class imbalance, feature distribution skew, sample size imbalance, and other phenomena. We propose a novel Bayesian PFL framework using bi-level optimization to tackle the data heterogeneity challenges.
arXiv Detail & Related papers (2024-05-29T11:28:06Z)
An Aggregation-Free Federated Learning for Tackling Data Heterogeneity [50.44021981013037]
Federated Learning (FL) relies on the effectiveness of utilizing knowledge from distributed datasets. Traditional FL methods adopt an aggregate-then-adapt framework, where clients update local models based on a global model aggregated by the server from the previous training round. We introduce FedAF, a novel aggregation-free FL algorithm.
arXiv Detail & Related papers (2024-04-29T05:55:23Z)
CoDream: Exchanging dreams instead of models for federated aggregation with heterogeneous models [8.85591781936764]
We present a novel framework called CoDream, where clients collaboratively optimize randomly data. Our key insight is that jointly optimizing this data can effectively capture the properties of the global data distribution. We empirically validate CoDream on standard FL tasks, demonstrating competitive performance despite not sharing model parameters.
arXiv Detail & Related papers (2024-02-25T03:07:32Z)
FLASH: Federated Learning Across Simultaneous Heterogeneities [54.80435317208111]
FLASH(Federated Learning Across Simultaneous Heterogeneities) is a lightweight and flexible client selection algorithm. It outperforms state-of-the-art FL frameworks under extensive sources of Heterogeneities. It achieves substantial and consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2024-02-13T20:04:39Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling the data issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
MLLM-DataEngine: An Iterative Refinement Approach for MLLM [62.30753425449056]
We propose a novel closed-loop system that bridges data generation, model training, and evaluation. Within each loop, the MLLM-DataEngine first analyze the weakness of the model based on the evaluation results. For targeting, we propose an Adaptive Bad-case Sampling module, which adjusts the ratio of different types of data. For quality, we resort to GPT-4 to generate high-quality data with each given data type.
arXiv Detail & Related papers (2023-08-25T01:41:04Z)
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning [37.96957782129352]
We propose a finetuning framework tailored to heterogeneous multi-modal foundation models, called Federated Dual-Aadapter Teacher (Fed DAT) Fed DAT addresses data heterogeneity by regularizing the client local updates and applying Mutual Knowledge Distillation (MKD) for an efficient knowledge transfer. To demonstrate its effectiveness, we conduct extensive experiments on four multi-modality FL benchmarks with different types of data heterogeneity.
arXiv Detail & Related papers (2023-08-21T21:57:01Z)
Towards Personalized Federated Learning via Heterogeneous Model Reassembly [84.44268421053043]
pFedHR is a framework that leverages heterogeneous model reassembly to achieve personalized federated learning. pFedHR dynamically generates diverse personalized models in an automated manner.
arXiv Detail & Related papers (2023-08-16T19:36:01Z)
Dataless Knowledge Fusion by Merging Weights of Language Models [51.8162883997512]
Fine-tuning pre-trained language models has become the prevalent paradigm for building downstream NLP models. This creates a barrier to fusing knowledge across individual models to yield a better single model. We propose a dataless knowledge fusion method that merges models in their parameter space.
arXiv Detail & Related papers (2022-12-19T20:46:43Z)
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning(FL) has recently attracted increasing attention from academia and industry. We propose FedDM to build the global training objective from multiple local surrogate functions. In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
Adaptive Expert Models for Personalization in Federated Learning [0.9449650062296824]
Federated Learning (FL) is a promising framework for distributed learning when data is private and sensitive. We propose a practical and robust approach to personalization in FL that adjusts to heterogeneous and non-IID data. We show that our approach achieves an accuracy up to 29.78 % and up to 4.38 % better compared to a local model in a pathological non-IID setting.
arXiv Detail & Related papers (2022-06-15T22:05:36Z)
Federated Learning in Non-IID Settings Aided by Differentially Private Synthetic Data [20.757477553095637]
Federated learning (FL) is a privacy-promoting framework that enables clients to collaboratively train machine learning models. A major challenge in federated learning arises when the local data is heterogeneous. We propose FedDPMS, an FL algorithm in which clients deploy variational auto-encoders to augment local datasets with data synthesized using differentially private means of latent data representations.
arXiv Detail & Related papers (2022-06-01T18:00:48Z)
Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning [22.310090483499035]
Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server. Most existing FL algorithms require models of identical architecture to be deployed across the clients and server. We propose a novel ensemble knowledge transfer method named Fed-ET in which small models are trained on clients, and used to train a larger model at the server.
arXiv Detail & Related papers (2022-04-27T05:18:32Z)

This list is automatically generated from the titles and abstracts of the papers in this site.