Related papers: SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models

URL: http://arxiv.org/abs/2308.06522v1
Date: Sat, 12 Aug 2023 10:33:57 GMT
Title: SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models
Authors: Sara Babakniya, Ahmed Roushdy Elkordy, Yahya H. Ezzeldin, Qingfeng Liu, Kee-Bong Song, Mostafa El-Khamy, Salman Avestimehr
Abstract summary: Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning. We propose a method called SLoRA, which overcomes the key limitations of LoRA in highly heterogeneous data scenarios. Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
Score: 28.764782216513037
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Transfer learning via fine-tuning pre-trained transformer models has gained significant success in delivering state-of-the-art results across various NLP tasks. In the absence of centralized data, Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning. However, due to the limited communication, computation, and storage capabilities of edge devices and the huge sizes of popular transformer models, efficient fine-tuning is crucial to make federated training feasible. This work explores the opportunities and challenges associated with applying parameter efficient fine-tuning (PEFT) methods in different FL settings for language tasks. Specifically, our investigation reveals that as the data across users becomes more diverse, the gap between fully fine-tuning the model and employing PEFT methods widens. To bridge this performance gap, we propose a method called SLoRA, which overcomes the key limitations of LoRA in highly heterogeneous data scenarios through a novel data-driven initialization technique. Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning, using sparse updates of approximately $1\%$ density while reducing training time by up to $90\%$.
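As a rough illustration of why LoRA-style PEFT reduces what federated clients must train and transmit, the sketch below counts trainable parameters for a single weight matrix and averages per-client adapter factors on a server. It is a generic LoRA-plus-weighted-averaging sketch under assumed settings, not the SLoRA method or its data-driven initialization. The function names, the 768-dimensional layer, rank 8, the simulated local updates, and the client sizes are all hypothetical.

```python
import numpy as np


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # every entry of W is trainable
    lora = rank * (d_out + d_in)   # only B (d_out x r) and A (r x d_in) are trainable
    return full, lora


def fedavg(client_arrays, client_sizes):
    """Weighted average of per-client arrays, weights proportional to local data size."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * a for w, a in zip(weights, client_arrays))


# Hypothetical layer shape and rank, chosen only for illustration.
d_out, d_in, rank = 768, 768, 8
full, lora = lora_param_counts(d_out, d_in, rank)
ratio = lora / full  # fraction of the layer's parameters that LoRA trains

rng = np.random.default_rng(0)
client_sizes = [100, 300, 600]  # hypothetical local dataset sizes

# Standard LoRA initialization: A random, B zero, so B @ A starts at zero.
A0 = rng.normal(scale=0.01, size=(rank, d_in))
B0 = np.zeros((d_out, rank))

# Stand-in for local training: each client perturbs its copy of the factors.
client_A = [A0 + rng.normal(scale=0.001, size=A0.shape) for _ in client_sizes]
client_B = [B0 + rng.normal(scale=0.001, size=B0.shape) for _ in client_sizes]

# Assumption: the server averages A and B separately. This is a common
# simplification and is not identical to averaging the products B @ A.
A_global = fedavg(client_A, client_sizes)
B_global = fedavg(client_B, client_sizes)
delta_W = B_global @ A_global  # low-rank update applied on top of the frozen weights
```

With these assumed dimensions each client exchanges 12,288 adapter values per layer instead of 589,824 full weights, which is the kind of saving that motivates PEFT in communication-constrained FL.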

Related papers

GradualDiff-Fed: A Federated Learning Specialized Framework for Large Language Model [0.0]
We introduce GradualDiff-Fed, an FL framework designed explicitly for large language models (LLMs). GradualDiff-Fed reduces communication costs by transmitting only the difference of model weights rather than the entire model during training rounds. Our evaluation demonstrates that GradualDiff-Fed achieves performance on par with centralized training while drastically reducing communication overhead.
arXiv Detail & Related papers (2025-06-23T22:03:21Z)
Communication-Efficient Wireless Federated Fine-Tuning for Large-Scale AI Models [13.742950928229078]
Low-Rank Adaptation (LoRA) addresses these issues by training compact, low-rank matrices instead of fully fine-tuning large models. This paper introduces a wireless federated LoRA fine-tuning framework that optimizes both learning performance and communication efficiency.
arXiv Detail & Related papers (2025-05-01T06:15:38Z)
Efficient Federated Fine-Tuning of Large Language Models with Layer Dropout [15.009864792277236]
Fine-tuning plays a crucial role in enabling pre-trained LLMs to evolve from general language comprehension to task-specific expertise. This work proposes DropPEFT, an innovative federated PEFT framework that employs a novel transformer dropout method. We show that DropPEFT can achieve a 1.3-6.3× speedup in model convergence and a 40%-67% reduction in memory footprint.
arXiv Detail & Related papers (2025-03-13T09:59:16Z)
Over-the-Air Fair Federated Learning via Multi-Objective Optimization [52.295563400314094]
We propose an over-the-air fair federated learning algorithm (OTA-FFL) to train fair FL models. Experiments demonstrate the superiority of OTA-FFL in achieving fairness and robust performance.
arXiv Detail & Related papers (2025-01-06T21:16:51Z)
Communication-Efficient and Tensorized Federated Fine-Tuning of Large Language Models [24.07770417615704]
We introduce FedTT and FedTT+, methods for adapting Large Language Models. FedTT is versatile and can be applied to both cross-silo FL and large-scale cross-device FL. Our proposed methods successfully address data heterogeneity challenges and perform on par with or even better than existing federated PEFT approaches.
arXiv Detail & Related papers (2024-10-16T23:50:39Z)
Fisher Information-based Efficient Curriculum Federated Learning with Large Language Models [43.26028399395612]
We propose a Fisher Information-based Efficient Curriculum Federated Learning framework (FibecFed) with two novel methods. First, we propose a Fisher information-based method to adaptively sample data within each device to improve the effectiveness of the FL fine-tuning process. Second, we dynamically select the proper layers for global aggregation and sparse parameters for local update with LoRA.
arXiv Detail & Related papers (2024-09-30T18:12:18Z)
Ferret: Federated Full-Parameter Tuning at Scale for Large Language Models [54.02863371927658]
Large Language Models (LLMs) have become indispensable in numerous real-world applications. Ferret is the first first-order method with shared randomness. It achieves high computational efficiency, reduced communication overhead, and fast convergence.
arXiv Detail & Related papers (2024-09-10T07:28:13Z)
FeDeRA:Efficient Fine-tuning of Language Models in Federated Learning Leveraging Weight Decomposition [7.229494183462913]
Despite exceptional performance after fine-tuning, pre-trained language models (PLMs) face significant challenges due to privacy concerns. We consider federated learning (FL) to fine-tune PLMs in this paper. One promising solution is to integrate parameter-efficient fine-tuning (PEFT) into FL, which trains a much smaller set of parameters than full parameter fine-tuning (FFT).
arXiv Detail & Related papers (2024-04-29T16:42:26Z)
Federated Learning with Projected Trajectory Regularization [65.6266768678291]
Federated learning enables joint training of machine learning models from distributed clients without sharing their local data. One key challenge in federated learning is to handle non-identically distributed data across the clients. We propose a novel federated learning framework with projected trajectory regularization (FedPTR) for tackling this data heterogeneity issue.
arXiv Detail & Related papers (2023-12-22T02:12:08Z)
Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions. FedKSeed employs zeroth-order optimization with a finite set of random seeds. It significantly reduces transmission requirements between the server and clients to just a few random seeds.
arXiv Detail & Related papers (2023-12-11T13:03:21Z)
Federated Learning of Large Language Models with Parameter-Efficient Prompt Tuning and Adaptive Optimization [71.87335804334616]
Federated learning (FL) is a promising paradigm to enable collaborative model training with decentralized data. The training process of Large Language Models (LLMs) generally requires updating a large number of parameters. This paper proposes an efficient partial prompt tuning approach to improve performance and efficiency simultaneously.
arXiv Detail & Related papers (2023-10-23T16:37:59Z)
FedLALR: Client-Specific Adaptive Learning Rates Achieve Linear Speedup for Non-IID Data [54.81695390763957]
Federated learning is an emerging distributed machine learning method. We propose a heterogeneous local variant of AMSGrad, named FedLALR, in which each client adjusts its learning rate. We show that our client-specified auto-tuned learning rate scheduling can converge and achieve linear speedup with respect to the number of clients.
arXiv Detail & Related papers (2023-09-18T12:35:05Z)
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning [37.96957782129352]
We propose a finetuning framework tailored to heterogeneous multi-modal foundation models, called Federated Dual-Adapter Teacher (FedDAT). FedDAT addresses data heterogeneity by regularizing the client local updates and applying Mutual Knowledge Distillation (MKD) for an efficient knowledge transfer. To demonstrate its effectiveness, we conduct extensive experiments on four multi-modality FL benchmarks with different types of data heterogeneity.
arXiv Detail & Related papers (2023-08-21T21:57:01Z)
When Federated Learning Meets Pre-trained Language Models' Parameter-Efficient Tuning Methods [22.16636947999123]
We introduce various parameter-efficient tuning (PETuning) methods into federated learning. Specifically, we provide a holistic empirical study of representative PLM tuning methods in FL. Overall communication overhead can be significantly reduced by locally tuning and globally aggregating lightweight model parameters.
arXiv Detail & Related papers (2022-12-20T06:44:32Z)
FedDM: Iterative Distribution Matching for Communication-Efficient Federated Learning [87.08902493524556]
Federated learning (FL) has recently attracted increasing attention from academia and industry. We propose FedDM to build the global training objective from multiple local surrogate functions. In detail, we construct synthetic sets of data on each client to locally match the loss landscape from original data.
arXiv Detail & Related papers (2022-07-20T04:55:18Z)

This list is automatically generated from the titles and abstracts of the papers on this site.