Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA
- URL: http://arxiv.org/abs/2409.02346v1
- Date: Wed, 4 Sep 2024 00:20:55 GMT
- Title: Robust Federated Finetuning of Foundation Models via Alternating Minimization of LoRA
- Authors: Shuangyi Chen, Yue Ju, Hardik Dalal, Zhongwen Zhu, Ashish Khisti,
- Abstract summary: RoLoRA is a robust federated fine-tuning framework that utilizes an alternating approach for LoRA.
Our results indicate that RoLoRA not only presents the communication benefits but also substantially enhances the robustness and effectiveness in multiple federated fine-tuning scenarios.
- Score: 14.789886179102425
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parameter-Efficient Fine-Tuning (PEFT) has risen as an innovative training strategy that updates only a select few model parameters, significantly lowering both computational and memory demands. PEFT also helps to decrease data transfer in federated learning settings, where communication depends on the size of updates. In this work, we explore the constraints of previous studies that integrate a well-known PEFT method named LoRA with federated fine-tuning, then introduce RoLoRA, a robust federated fine-tuning framework that utilizes an alternating minimization approach for LoRA, providing greater robustness against decreasing fine-tuning parameters and increasing data heterogeneity. Our results indicate that RoLoRA not only presents the communication benefits but also substantially enhances the robustness and effectiveness in multiple federated fine-tuning scenarios.
Related papers
- Fed-piLot: Optimizing LoRA Assignment for Efficient Federated Foundation Model Fine-Tuning [11.10244162253018]
We introduce Fed-piLot, an efficient FedFM fine-tuning framework with optimized local LoRA assignments for heterogeneous clients.
We design a Local-Global Information Gain Score (IG-Score) based value function to optimize LoRA assignment under clients' memory constraints.
Experimental results on three datasets under both IID and non-IID conditions demonstrate the effectiveness and efficiency of Fed-piLot.
arXiv Detail & Related papers (2024-10-14T06:36:41Z) - Exact Aggregation for Federated and Efficient Fine-Tuning of Foundation Models [5.1613368481802455]
Low-Rank Adaptation (LoRA) is a popular technique for efficient fine-tuning of foundation models.
We propose Federated Exact LoRA, or FedEx-LoRA, which adds a residual error term to the pretrained frozen weight matrix.
Our approach achieves exact updates with minimal computational and communication overhead, preserving LoRA's efficiency.
arXiv Detail & Related papers (2024-10-12T08:22:44Z) - Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often under performs when compared to full- parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z) - LoRTA: Low Rank Tensor Adaptation of Large Language Models [70.32218116940393]
Low Rank Adaptation (LoRA) is a popular Efficient Fine Tuning (PEFT) method that effectively adapts large pre-trained models for downstream tasks.
We propose a novel approach that employs a low rank tensor parametrization for model updates.
Our method is both efficient and effective for fine-tuning large language models, achieving a substantial reduction in the number of parameters while maintaining comparable performance.
arXiv Detail & Related papers (2024-10-05T06:59:50Z) - FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations [39.88985198467528]
We introduce a new approach called FLORA that enables federated fine-tuning on heterogeneous LoRA adapters.
Our approach is noise-free and seamlessly supports heterogeneous LoRA adapters.
arXiv Detail & Related papers (2024-09-09T18:21:23Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - SA-FedLora: Adaptive Parameter Allocation for Efficient Federated Learning with LoRA Tuning [6.125512669585788]
We propose a Simulated Annealing-based Federated Learning with LoRA tuning (SA-FedLoRA) approach by reducing trainable parameters.
Experimental results demonstrate that SA-FedLoRA is an efficient FL, achieving superior performance to FedAvg and significantly reducing communication parameters by up to 93.62%.
arXiv Detail & Related papers (2024-05-15T14:50:46Z) - DoRA: Weight-Decomposed Low-Rank Adaptation [57.68678247436207]
We introduce a novel weight decomposition analysis to investigate the inherent differences between FT and LoRA.
Aiming to resemble the learning capacity of FT from the findings, we propose Weight-Decomposed Low-Rank Adaptation (DoRA)
DoRA decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning.
arXiv Detail & Related papers (2024-02-14T17:59:34Z) - Sparse Low-rank Adaptation of Pre-trained Language Models [79.74094517030035]
We introduce sparse low-rank adaptation (SoRA) that enables dynamic adjustments to the intrinsic rank during the adaptation process.
Our approach strengthens the representation power of LoRA by initializing it with a higher rank, while efficiently taming a temporarily increased number of parameters.
Our experimental results demonstrate that SoRA can outperform other baselines even with 70% retained parameters and 70% training time.
arXiv Detail & Related papers (2023-11-20T11:56:25Z) - Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared
Pre-trained Language Models [109.06052781040916]
We introduce a technique to enhance the inference efficiency of parameter-shared language models.
We also propose a simple pre-training technique that leads to fully or partially shared models.
Results demonstrate the effectiveness of our methods on both autoregressive and autoencoding PLMs.
arXiv Detail & Related papers (2023-10-19T15:13:58Z) - SLoRA: Federated Parameter Efficient Fine-Tuning of Language Models [28.764782216513037]
Federated Learning (FL) can benefit from distributed and private data of the FL edge clients for fine-tuning.
We propose a method called SLoRA, which overcomes the key limitations of LoRA in high heterogeneous data scenarios.
Our experimental results demonstrate that SLoRA achieves performance comparable to full fine-tuning.
arXiv Detail & Related papers (2023-08-12T10:33:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.