GradualDiff-Fed: A Federated Learning Specialized Framework for Large Language Model
- URL: http://arxiv.org/abs/2506.19164v1
- Date: Mon, 23 Jun 2025 22:03:21 GMT
- Title: GradualDiff-Fed: A Federated Learning Specialized Framework for Large Language Model
- Authors: Amir Faiyaz, Tara Salman
- Abstract summary: We introduce GradualDiff-Fed, an FL framework designed explicitly for large language models (LLMs). GradualDiff-Fed reduces communication costs by transmitting only the difference of model weights rather than the entire model during training rounds. Our evaluation demonstrates that GradualDiff-Fed achieves performance on par with centralized training while drastically reducing communication overhead.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid proliferation of large language models (LLMs) has created an unprecedented demand for fine-tuning models for specialized domains, such as medical science. While federated learning (FL) offers a decentralized and privacy-preserving approach to collaboratively fine-tune LLMs without sharing raw data, it presents significant challenges, particularly in performance and in managing large model sizes efficiently. In this paper, we introduce GradualDiff-Fed, an FL framework designed explicitly for LLMs and the challenge of handling their high parameter count. GradualDiff-Fed reduces communication costs by transmitting only the difference of model weights rather than the entire model during training rounds. Such an approach significantly improves scalability and communication efficiency, making it more feasible to fine-tune LLMs across distributed clients without compromising performance. Our evaluation demonstrates that GradualDiff-Fed achieves performance on par with centralized training while drastically reducing communication overhead. These results highlight the potential of GradualDiff-Fed as an efficient solution for fine-tuning large models from distributed data in privacy-preserving settings without compromising performance.
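The round structure implied by the abstract is simple: each client fine-tunes the current global model locally and uploads only the weight difference, which the server averages and applies. Below is a minimal Python sketch of that idea; the function names and the plain FedAvg-style weighting are illustrative assumptions, not the authors' exact implementation.

```python
# Minimal sketch of weight-difference exchange in federated fine-tuning,
# in the spirit of GradualDiff-Fed. The helper names (client_update,
# server_aggregate) and the FedAvg-style weighting are assumptions.
import numpy as np

def client_update(global_weights, local_train_fn, local_data):
    """Fine-tune locally, then return only the weight difference (delta)."""
    local_weights = local_train_fn(dict(global_weights), local_data)
    # Transmit delta = w_local - w_global instead of the full model.
    return {k: local_weights[k] - global_weights[k] for k in global_weights}

def server_aggregate(global_weights, client_deltas, client_sizes):
    """Weighted average of the received deltas, applied to the global model."""
    total = float(sum(client_sizes))
    aggregated = {}
    for name, weight in global_weights.items():
        avg_delta = sum(d[name] * (n / total) for d, n in zip(client_deltas, client_sizes))
        aggregated[name] = weight + avg_delta
    return aggregated

# Toy round: two clients, one parameter tensor, a dummy local "trainer".
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    global_w = {"layer.weight": rng.standard_normal((4, 4))}
    def dummy_train(weights, _data):
        return {k: v - 0.01 * rng.standard_normal(v.shape) for k, v in weights.items()}
    deltas = [client_update(global_w, dummy_train, None) for _ in range(2)]
    global_w = server_aggregate(global_w, deltas, client_sizes=[100, 300])
```

The communication saving comes from the delta being as large as the model itself unless it is further compressed, so in practice such a scheme is typically paired with sparsification or quantization of the difference; the sketch above only shows the exchange pattern.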
Related papers
- Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models. Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z)
- Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models [54.02863371927658]
Large Language Models (LLMs) have become indispensable in numerous real-world applications. Ferret is the first first-order method with shared randomness to enable scalable full-parameter tuning of LLMs. Ferret achieves high computational efficiency, reduced communication overhead, and fast convergence.
arXiv Detail & Related papers (2024-09-10T07:28:13Z)
- Accelerating Large Language Model Pretraining via LFR Pedagogy: Learn, Focus, and Review [50.78587571704713]
Learn-Focus-Review (LFR) is a dynamic training approach that adapts to the model's learning progress. LFR tracks the model's learning performance across data blocks (sequences of tokens) and prioritizes revisiting challenging regions of the dataset. Compared to baseline models trained on the full datasets, LFR consistently achieved lower perplexity and higher accuracy.
arXiv Detail & Related papers (2024-09-10T00:59:18Z)
- R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models [83.77114091471822]
Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML).
A challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming.
This is particularly pronounced for word embedding parameters in large language models (LLMs), which are crucial for language understanding.
A physical layer framework is developed for resilient SFL with LLMs (R-SFLLM) over wireless networks.
arXiv Detail & Related papers (2024-07-16T12:21:29Z)
- One-Shot Sequential Federated Learning for Non-IID Data by Enhancing Local Model Diversity [26.09617693587105]
We improve one-shot sequential federated learning for non-IID data by proposing a local model diversity-enhancing strategy.
Our method exhibits superior performance to existing one-shot PFL methods and achieves better accuracy compared with state-of-the-art one-shot SFL methods.
arXiv Detail & Related papers (2024-04-18T12:31:48Z)
- Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data.
The training process of Large Language Models (LLMs) generally requires updating a significant number of parameters.
This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z)
- SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning.
We propose a method called SLoRA, which overcomes the key limitations of LoRA in highly heterogeneous data scenarios.
Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
arXiv Detail & Related papers (2023-08-12T10:33:57Z)
- Exploring Parameter-Efficient Fine-Tuning to Enable Foundation Models in Federated Learning [12.839398408791778]
Federated learning (FL) has emerged as a promising paradigm for enabling the collaborative training of models without centralized access to the raw data on local devices. Recent state-of-the-art pre-trained models, known as "Foundation Models", are getting more capable but also have more parameters. Can we find a solution to enable those strong and readily available pre-trained models in FL to achieve excellent performance while simultaneously reducing the communication burden? Specifically, we systematically evaluate the performance of FedPEFT across a variety of client stability, data distribution, and differential privacy settings.
arXiv Detail & Related papers (2022-10-04T16:08:54Z)
- FedOBD: Opportunistic Block Dropout for Efficiently Training Large-scale Neural Networks through Federated Learning [22.168178614211158]
We propose the Federated Opportunistic Block Dropout (FedOBD) approach to train large-scale neural networks. FedOBD decomposes large-scale models into semantic blocks so that FL participants can opportunistically upload quantized blocks (a rough sketch of this block-wise upload idea appears after the related-papers list). Experiments show that FedOBD reduces the overall communication overhead by more than 88% compared to the best performing baseline approach.
arXiv Detail & Related papers (2022-08-10T06:36:49Z)
- FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry.
We propose FedDM to build the global training objective from multiple local surrogate functions.
In detail, we construct synthetic sets of data on each client to locally match the loss landscape of the original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)
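As referenced in the FedOBD entry above, the block-wise, quantized-upload idea can be illustrated with a short Python sketch. The importance score (mean absolute update), the keep fraction, and the uniform 8-bit quantization below are illustrative assumptions, not the paper's exact procedure.

```python
# Rough sketch of opportunistic, block-wise quantized uploads in the spirit
# of FedOBD. Scoring, threshold, and quantization details are assumptions.
import numpy as np

def quantize_int8(block):
    """Uniform 8-bit quantization of one parameter block (values and scale)."""
    scale = float(np.max(np.abs(block))) / 127.0
    scale = scale if scale > 0 else 1.0
    return np.round(block / scale).astype(np.int8), scale

def select_and_upload(update_blocks, keep_fraction=0.5):
    """Keep only the most-changed blocks (by mean absolute update) and quantize them."""
    scores = {name: float(np.mean(np.abs(b))) for name, b in update_blocks.items()}
    n_keep = max(1, int(len(update_blocks) * keep_fraction))
    kept = sorted(scores, key=scores.get, reverse=True)[:n_keep]
    return {name: quantize_int8(update_blocks[name]) for name in kept}

# Toy example: four "semantic blocks" of update values; only the two
# largest-magnitude blocks are quantized and uploaded.
rng = np.random.default_rng(1)
updates = {f"block_{i}": rng.standard_normal(16) * (i + 1) for i in range(4)}
payload = select_and_upload(updates, keep_fraction=0.5)
print(sorted(payload))
```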