Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation
Models
- URL: http://arxiv.org/abs/2401.06432v2
- Date: Tue, 20 Feb 2024 21:15:59 GMT
- Title: Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation
Models
- Authors: Yae Jee Cho and Luyang Liu and Zheng Xu and Aldi Fahrezi and Gauri
Joshi
- Abstract summary: HetLoRA allows heterogeneous ranks across client devices and efficiently aggregates and distributes these heterogeneous LoRA modules.
HetLoRA achieves improved convergence speed and final performance compared to homogeneous LoRA.
- Score: 20.707283766914017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foundation models (FMs) adapt well to specific domains or tasks with
fine-tuning, and federated learning (FL) enables the potential for
privacy-preserving fine-tuning of the FMs with on-device local data. For
federated fine-tuning of FMs, we consider FMs with small to medium parameter sizes of single-digit billions at most, referred to as on-device FMs (ODFMs), which can be deployed on devices for inference but can only be fine-tuned with parameter-efficient methods. In our work, we tackle the data
and system heterogeneity problem of federated fine-tuning of ODFMs by proposing
a novel method using heterogeneous low-rank approximations (LoRAs), namely
HetLoRA. First, we show that the naive approach of using homogeneous LoRA ranks across devices faces a trade-off between overfitting and slow convergence, and
thus propose HetLoRA, which allows heterogeneous ranks across client devices
and efficiently aggregates and distributes these heterogeneous LoRA modules. By
applying rank self-pruning locally and sparsity-weighted aggregation at the
server, HetLoRA combines the advantages of high- and low-rank LoRAs, achieving improved convergence speed and final performance compared to homogeneous LoRA. Furthermore, HetLoRA offers enhanced computation efficiency
compared to full fine-tuning, making it suitable for federated fine-tuning
across heterogeneous devices.
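The abstract names HetLoRA's two mechanisms, local rank self-pruning and server-side sparsity-weighted aggregation, without giving formulas. The minimal Python sketch below illustrates one plausible reading of that flow; the magnitude-based pruning criterion, the norm-based aggregation weights, and the zero-pad/truncate scheme are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

# Hedged sketch of a HetLoRA-style round, based only on the abstract above.
rng = np.random.default_rng(0)
d_out, d_in = 64, 64

def self_prune(B, A, keep_frac=0.75):
    # Local rank self-pruning: keep the rank-1 components (columns of B
    # paired with rows of A) with the largest magnitude.
    norms = np.linalg.norm(B, axis=0) * np.linalg.norm(A, axis=1)
    keep = max(1, int(keep_frac * len(norms)))
    idx = np.sort(np.argsort(norms)[-keep:])
    return B[:, idx], A[idx, :]

# Clients hold LoRA modules of heterogeneous rank (set by device capability).
clients = []
for r in (2, 4, 8):
    B = rng.normal(scale=0.01, size=(d_out, r))
    A = rng.normal(scale=0.01, size=(r, d_in))
    clients.append(self_prune(B, A))

r_max = max(B.shape[1] for B, _ in clients)

def pad_to(M, r, axis):
    # Zero-pad a factor along its rank dimension up to rank r.
    widths = [(0, 0), (0, 0)]
    widths[axis] = (0, r - M.shape[axis])
    return np.pad(M, widths)

# Server: "sparsity-weighted" aggregation, read here as weights proportional
# to the magnitude of each client's update, so heavily pruned modules count less.
w = np.array([np.linalg.norm(B @ A) for B, A in clients])
w /= w.sum()
B_glob = sum(wk * pad_to(B, r_max, axis=1) for wk, (B, _) in zip(w, clients))
A_glob = sum(wk * pad_to(A, r_max, axis=0) for wk, (_, A) in zip(w, clients))

# Distribution: each client receives the global factors truncated to its rank.
for B, _ in clients:
    r_k = B.shape[1]
    B_k, A_k = B_glob[:, :r_k], A_glob[:r_k, :]
    print(r_k, B_k.shape, A_k.shape)
```

Note that the server averages the zero-padded B and A factors separately, so the reconstructed product only approximates the average of the client updates; the FLoRA entry below targets exactly this kind of aggregation noise.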
Related papers
- BeamLoRA: Beam-Constraint Low-Rank Adaptation [51.52097743781401]
Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods.
We propose BeamLoRA, which conceptualizes each LoRA module as a beam where each rank naturally corresponds to a potential sub-solution.
arXiv Detail & Related papers (2025-02-19T10:33:22Z)
- Federated Sketching LoRA: On-Device Collaborative Fine-Tuning of Large Language Models [18.782733798668122]
Fine-tuning large language models (LLMs) on devices is attracting increasing interest.
Recent works have fused low-rank adaptation (LoRA) techniques with federated fine-tuning to mitigate challenges associated with device model sizes and data scarcity.
We propose federated sketching LoRA, which leverages a sketching mechanism to enable devices to selectively update submatrices of global LoRA modules maintained by the server.
arXiv Detail & Related papers (2025-01-31T18:44:35Z)
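A minimal sketch of the sketching mechanism the entry above describes, under stated assumptions: the server keeps full-rank global LoRA factors, and each device downloads, trains, and returns only a randomly selected subset of rank indices sized to its capability. The uniform index sampling and overwrite-style merge below are illustrative guesses, not the paper's exact design.

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in, r = 32, 32, 16

B_global = np.zeros((d_out, r))                    # global LoRA factors
A_global = rng.normal(scale=0.01, size=(r, d_in))  # held by the server

def device_round(sketch_ratio):
    # Sample a subset of rank indices proportional to device capability.
    k = max(1, int(sketch_ratio * r))
    idx = rng.choice(r, size=k, replace=False)
    # The device trains only the selected submatrices of B and A.
    B_sub, A_sub = B_global[:, idx].copy(), A_global[idx, :].copy()
    B_sub += rng.normal(scale=0.01, size=B_sub.shape)  # stand-in for local SGD
    A_sub += rng.normal(scale=0.01, size=A_sub.shape)
    return idx, B_sub, A_sub

# Server merges each device's update back into the touched indices only.
for ratio in (0.25, 0.5, 1.0):
    idx, B_sub, A_sub = device_round(ratio)
    B_global[:, idx] = B_sub
    A_global[idx, :] = A_sub
```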
- Adaptive Parameter-Efficient Federated Fine-Tuning on Heterogeneous Devices [24.725928966071212]
Federated fine-tuning (FedFT) has been proposed to fine-tune pre-trained language models in a distributed manner.
We propose a novel LoRA-based FedFT framework, termed LEGEND, which addresses the difficulty of determining the number of LoRA layers and the rank of each layer.
We analyze the coupled relationship between LoRA depth and rank distribution, and design an efficient LoRA configuration algorithm for heterogeneous devices.
arXiv Detail & Related papers (2024-12-28T04:00:42Z)
- Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs [76.40876036912537]
Large Language Models (LLMs) demonstrate strong few-shot adaptability without requiring fine-tuning.
In contrast, current Visual Foundation Models (VFMs) require explicit fine-tuning with sufficient tuning data.
We propose a framework, LoRA Recycle, that distills a meta-LoRA from diverse pre-tuned LoRAs with a meta-learning objective.
arXiv Detail & Related papers (2024-12-03T07:25:30Z)
- LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement [5.162783756846019]
Foundation models (FMs) achieve strong performance across diverse tasks with task-specific fine-tuning.
Parameter-efficient methods like Low-Rank Adaptation (LoRA) reduce this cost by introducing low-rank matrices for tuning fewer parameters.
LoRA-FAIR maintains computational and communication efficiency, yielding superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-11-22T14:19:01Z)
- Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often underperforms when compared to full-parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z)
- FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations [39.88985198467528]
We introduce a new approach called FLoRA that enables federated fine-tuning on heterogeneous LoRA adapters.
Our approach is noise-free and seamlessly supports heterogeneous LoRA adapters.
arXiv Detail & Related papers (2024-09-09T18:21:23Z)
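The FLoRA entry above claims noise-free aggregation of heterogeneous adapters. One reading consistent with that claim, sketched below as an assumption, is stacking-based aggregation: concatenating the clients' scaled B factors column-wise and their A factors row-wise reproduces the exact weighted sum of client updates, which separate averaging of B and A cannot.

```python
import numpy as np

# Stacking heterogeneous-rank LoRA factors; weights are illustrative.
rng = np.random.default_rng(2)
d_out, d_in = 16, 16
ranks = (2, 4, 8)                      # heterogeneous client ranks

clients = [(rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in)))
           for r in ranks]
w = np.ones(len(clients)) / len(clients)

# Columns of the (scaled) B factors side by side, rows of A stacked on top
# of each other: the block-matrix product equals the sum of block products.
B_stack = np.hstack([wk * B for wk, (B, A) in zip(w, clients)])
A_stack = np.vstack([A for (B, A) in clients])

exact = sum(wk * B @ A for wk, (B, A) in zip(w, clients))
assert np.allclose(B_stack @ A_stack, exact)   # noise-free aggregation
```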
- LoRA-Pro: Are Low-Rank Adapters Properly Optimized? [121.0693322732454]
Low-rank adaptation, also known as LoRA, has emerged as a prominent method for parameter-efficient fine-tuning of foundation models.
Despite its computational efficiency, LoRA still yields inferior performance compared to full fine-tuning.
We introduce LoRA-Pro, a method that enhances LoRA's performance by strategically adjusting the gradients of low-rank matrices.
arXiv Detail & Related papers (2024-07-25T17:57:12Z)
- Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z)
- Improving LoRA in Privacy-preserving Federated Learning [44.47315926976059]
Low-rank adaptation (LoRA) is one of the most popular task-specific parameter-efficient fine-tuning (PEFT) methods on pre-trained language models.
This paper proposes an efficient and effective version of LoRA, Federated Freeze A LoRA (FFA-LoRA), to alleviate the challenges LoRA faces in privacy-preserving federated learning.
arXiv Detail & Related papers (2024-03-18T23:20:08Z)
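As its name suggests, FFA-LoRA freezes the A factor; the PyTorch sketch below illustrates that reading under assumed shapes, rank, and initialization. With a frozen A shared by all clients, only B is trained and communicated, and plain averaging of the B factors aggregates client updates exactly, since mean_k(B_k) A = mean_k(B_k A).

```python
import torch
import torch.nn as nn

class FFALoRALinear(nn.Module):
    """Hedged sketch of a freeze-A LoRA layer; not the paper's exact code."""
    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                # frozen pre-trained weights
        # A is randomly initialized once, frozen, and shared by all clients.
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01,
                              requires_grad=False)
        # B is the only trainable (and communicated) factor.
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.A.T @ self.B.T

# With a common frozen A, server aggregation is exact under simple averaging.
def aggregate(client_Bs: list[torch.Tensor]) -> torch.Tensor:
    return torch.stack(client_Bs).mean(dim=0)

layer = FFALoRALinear(nn.Linear(128, 128), rank=8)
out = layer(torch.randn(4, 128))                   # shape (4, 128)
```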
This list is automatically generated from the titles and abstracts of the papers on this site.