Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices
- URL: http://arxiv.org/abs/2412.20004v1
- Date: Sat, 28 Dec 2024 04:00:42 GMT
- Title: Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices
- Authors: Jun Liu, Yunming Liao, Hongli Xu, Yang Xu, Jianchun Liu, Chen Qian
- Abstract summary: Federated fine-tuning (FedFT) has been proposed to fine-tune pre-trained language models in a distributed manner.
We propose a novel LoRA-based FedFT framework, termed LEGEND, which addresses the difficulty of determining the number of LoRA layers.
We analyze the coupled relationship between LoRA depth and rank distribution, and design an efficient LoRA configuration algorithm for heterogeneous devices.
- Score: 24.725928966071212
- License:
- Abstract: Federated fine-tuning (FedFT) has been proposed to fine-tune pre-trained language models in a distributed manner. However, there are two critical challenges for efficient FedFT in practical applications, i.e., resource constraints and system heterogeneity. Existing works rely on parameter-efficient fine-tuning methods, e.g., low-rank adaptation (LoRA), but have major limitations. Herein, based on the inherent characteristics of FedFT, we observe that LoRA layers with higher ranks added close to the output help to save resource consumption while achieving comparable fine-tuning performance. We then propose a novel LoRA-based FedFT framework, termed LEGEND, which addresses the difficulty of determining the number of LoRA layers (called LoRA depth) and the rank of each LoRA layer (called rank distribution). We analyze the coupled relationship between LoRA depth and rank distribution, and design an efficient LoRA configuration algorithm for heterogeneous devices, thereby promoting fine-tuning efficiency. Extensive experiments are conducted on a physical platform with 80 commercial devices. The results show that LEGEND achieves a speedup of 1.5-2.8$\times$ and saves communication costs by about 42.3% when reaching the target accuracy, compared to advanced solutions.
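The abstract frames FedFT configuration as two coupled choices per device: the LoRA depth (how many layers, counted from the output, receive adapters) and the rank distribution (the rank assigned to each of those layers). The paper's actual configuration algorithm is not given in the abstract, so the following is only a minimal Python sketch of the decision being made, under assumed names (`Device`, `configure_lora`) and an assumed capacity-scaled rule in which weaker devices adapt fewer layers and ranks grow toward the output; it is not LEGEND's algorithm.

```python
# Hypothetical illustration (not the paper's algorithm): pick a LoRA depth and
# a rank distribution for each device, scaled by its resource budget, with
# adapters attached to the last `depth` transformer layers and ranks growing
# toward the output, mirroring the observation quoted in the abstract.
from dataclasses import dataclass

@dataclass
class Device:
    name: str
    capacity: float  # relative compute/communication budget in (0, 1]

def configure_lora(device: Device, num_layers: int = 24,
                   max_depth: int = 12, max_rank: int = 16) -> dict[int, int]:
    """Return {layer_index: rank} for the LoRA adapters assigned to a device."""
    depth = max(1, round(max_depth * device.capacity))  # weaker device -> shallower
    first = num_layers - depth          # adapt only the last `depth` layers
    config = {}
    for i, layer in enumerate(range(first, num_layers)):
        frac = (i + 1) / depth          # position within the adapted block
        rank = max(1, round(max_rank * device.capacity * frac))
        config[layer] = rank            # rank increases toward the output
    return config

if __name__ == "__main__":
    for dev in [Device("strong-gpu", 1.0), Device("weak-phone", 0.25)]:
        print(dev.name, configure_lora(dev))
```

In a full federated system the server would additionally have to aggregate updates across devices whose depths and ranks differ; that part is omitted here and is the subject of the sketch at the end of the related-papers list below.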
Related papers
- LoRA-GGPO: Mitigating Double Descent in LoRA Fine-Tuning via Gradient-Guided Perturbation Optimization [12.504723188498]
Large Language Models (LLMs) have achieved remarkable success in natural language processing.
Low-Rank Adaptation (LoRA) has emerged as a practical solution by approximating parameter updates with low-rank matrices.
LoRA-GGPO is a novel method that leverages gradient and weight norms to generate targeted perturbations.
arXiv Detail & Related papers (2025-02-20T13:14:41Z) - BeamLoRA: Beam-Constraint Low-Rank Adaptation [51.52097743781401]
Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods.
We propose BeamLoRA, which conceptualizes each LoRA module as a beam where each rank naturally corresponds to a potential sub-solution.
arXiv Detail & Related papers (2025-02-19T10:33:22Z) - LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement [5.162783756846019]
Foundation models (FMs) achieve strong performance across diverse tasks with task-specific fine-tuning.
Parameter-efficient methods like Low-Rank Adaptation (LoRA) reduce this cost by introducing low-rank matrices to tune fewer parameters.
LoRA-FAIR maintains computational and communication efficiency, yielding superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-11-22T14:19:01Z) - LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization [78.93425154518705]
Low-rank adaptation (LoRA) is a widely used parameter-efficient fine-tuning method for LLMs that reduces memory requirements.
This paper introduces LoRA-RITE, a novel adaptive matrix preconditioning method for LoRA optimization.
arXiv Detail & Related papers (2024-10-27T22:57:12Z) - Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often underperforms compared to full-parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z) - LoRA-Pro: Are Low-Rank Adapters Properly Optimized? [121.0693322732454]
Low-rank adaptation, also known as LoRA, has emerged as a prominent method for parameter-efficient fine-tuning of foundation models.
Despite its computational efficiency, LoRA still yields inferior performance compared to full fine-tuning.
We introduce LoRA-Pro, a method that enhances LoRA's performance by strategically adjusting the gradients of low-rank matrices.
arXiv Detail & Related papers (2024-07-25T17:57:12Z) - Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z) - Improving LoRA in Privacy-preserving Federated Learning [44.47315926976059]
Low-rank adaptation (LoRA) is one of the most popular task-specific parameter-efficient fine-tuning (PEFT) methods on pre-trained language models.
This paper proposes an efficient and effective version of LoRA, Federated Freeze A LoRA (FFA-LoRA), to alleviate these challenges.
arXiv Detail & Related papers (2024-03-18T23:20:08Z) - DoRA: Weight-Decomposed Low-Rank Adaptation [57.68678247436207]
We introduce a novel weight decomposition analysis to investigate the inherent differences between FT and LoRA.
Building on these findings and aiming to match the learning capacity of FT, we propose Weight-Decomposed Low-Rank Adaptation (DoRA).
DoRA decomposes the pre-trained weight into two components, magnitude and direction, for fine-tuning.
arXiv Detail & Related papers (2024-02-14T17:59:34Z) - Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation Models [20.707283766914017]
HetLoRA allows heterogeneous ranks across client devices and efficiently aggregates and distributes these heterogeneous LoRA modules (a simplified view of such rank-heterogeneous aggregation is sketched after this list).
HetLoRA achieves improved convergence speed and final performance compared to homogeneous LoRA.
arXiv Detail & Related papers (2024-01-12T07:52:07Z)
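The HetLoRA entry above raises the question of how a server combines LoRA modules whose ranks differ across clients. The sketch below shows one simple zero-pad-then-truncate scheme as an illustration, not necessarily the exact HetLoRA procedure; `aggregate_hetero_lora` and `distribute` are hypothetical names, and the shapes follow the usual LoRA convention in which client k holds B_k of shape (d_out, r_k) and A_k of shape (r_k, d_in).

```python
# Simplified rank-heterogeneous LoRA aggregation: zero-pad every client's
# factors to the largest rank, average them, and truncate back to each
# client's local rank on distribution. Illustrative only.
import numpy as np

def aggregate_hetero_lora(Bs, As, weights=None):
    """Average per-client (B_k, A_k) pairs after zero-padding to the max rank."""
    r_max = max(B.shape[1] for B in Bs)
    if weights is None:
        weights = [1.0 / len(Bs)] * len(Bs)
    d_out, d_in = Bs[0].shape[0], As[0].shape[1]
    B_agg = np.zeros((d_out, r_max))
    A_agg = np.zeros((r_max, d_in))
    for w, B, A in zip(weights, Bs, As):
        r = B.shape[1]
        B_agg[:, :r] += w * B   # pad missing rank columns with zeros
        A_agg[:r, :] += w * A
    return B_agg, A_agg

def distribute(B_agg, A_agg, r_k):
    """Truncate the aggregated factors back to a client's local rank r_k."""
    return B_agg[:, :r_k], A_agg[:r_k, :]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d_out, d_in, ranks = 8, 6, [2, 4, 3]
    Bs = [rng.normal(size=(d_out, r)) for r in ranks]
    As = [rng.normal(size=(r, d_in)) for r in ranks]
    B_agg, A_agg = aggregate_hetero_lora(Bs, As)
    B_small, A_small = distribute(B_agg, A_agg, r_k=2)
    print(B_agg.shape, A_agg.shape, B_small.shape, A_small.shape)
```

Note that averaging B and A separately is not the same as averaging the products B_k A_k, which is one reason federated LoRA papers (including LoRA-FAIR and FFA-LoRA above) study more careful aggregation and initialization schemes.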