Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation
Models
- URL: http://arxiv.org/abs/2401.06432v2
- Date: Tue, 20 Feb 2024 21:15:59 GMT
- Title: Heterogeneous LoRA for Federated Fine-tuning of On-Device Foundation
Models
- Authors: Yae Jee Cho and Luyang Liu and Zheng Xu and Aldi Fahrezi and Gauri
Joshi
- Abstract summary: HetLoRA allows heterogeneous ranks across client devices and efficiently aggregates and distributes these heterogeneous LoRA modules.
HetLoRA achieves improved convergence speed and final performance compared to homogeneous LoRA.
- Score: 20.707283766914017
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Foundation models (FMs) adapt well to specific domains or tasks with
fine-tuning, and federated learning (FL) enables the potential for
privacy-preserving fine-tuning of the FMs with on-device local data. For
federated fine-tuning of FMs, we consider the FMs with small to medium
parameter sizes of single digit billion at maximum, referred to as on-device
FMs (ODFMs) that can be deployed on devices for inference but can only be
fine-tuned with parameter efficient methods. In our work, we tackle the data
and system heterogeneity problem of federated fine-tuning of ODFMs by proposing
a novel method using heterogeneous low-rank approximations (LoRAs), namely
HetLoRA. First, we show that the naive approach of using homogeneous LoRA ranks
across devices face a trade-off between overfitting and slow convergence, and
thus propose HetLoRA, which allows heterogeneous ranks across client devices
and efficiently aggregates and distributes these heterogeneous LoRA modules. By
applying rank self-pruning locally and sparsity-weighted aggregation at the
server, HetLoRA combines the advantages of high and low-rank LoRAs, which
achieves improved convergence speed and final performance compared to
homogeneous LoRA. Furthermore, HetLoRA offers enhanced computation efficiency
compared to full fine-tuning, making it suitable for federated fine-tuning
across heterogeneous devices.
Related papers
- Fed-piLot: Optimizing LoRA Assignment for Efficient Federated Foundation Model Fine-Tuning [11.10244162253018]
We introduce Fed-piLot, an efficient FedFM fine-tuning framework with optimized local LoRA assignments for heterogeneous clients.
We design a Local-Global Information Gain Score (IG-Score) based value function to optimize LoRA assignment under clients' memory constraints.
Experimental results on three datasets under both IID and non-IID conditions demonstrate the effectiveness and efficiency of Fed-piLot.
arXiv Detail & Related papers (2024-10-14T06:36:41Z) - Exact Aggregation for Federated and Efficient Fine-Tuning of Foundation Models [5.1613368481802455]
Low-Rank Adaptation (LoRA) is a popular technique for efficient fine-tuning of foundation models.
We propose Federated Exact LoRA, or FedEx-LoRA, which adds a residual error term to the pretrained frozen weight matrix.
Our approach achieves exact updates with minimal computational and communication overhead, preserving LoRA's efficiency.
arXiv Detail & Related papers (2024-10-12T08:22:44Z) - Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation [58.288682735160585]
Low-Rank Adaptation (LoRA) is a popular technique for finetuning models.
LoRA often under performs when compared to full- parameter fine-tuning.
We present a framework that rigorously analyzes the adaptation rates of LoRA methods.
arXiv Detail & Related papers (2024-10-10T18:51:53Z) - Flat-LoRA: Low-Rank Adaption over a Flat Loss Landscape [52.98187034726091]
Low-Rank Adaptation (LoRA) is an efficient way to fine-tune models by optimizing only a low-rank matrix.
A solution that appears flat in the LoRA space may exist sharp directions in the full parameter space, potentially harming generalization performance.
We propose Flat-LoRA, an efficient approach that seeks a low-rank adaptation located in a flat region of the full parameter space.
arXiv Detail & Related papers (2024-09-22T11:24:10Z) - FLoRA: Federated Fine-Tuning Large Language Models with Heterogeneous Low-Rank Adaptations [39.88985198467528]
We introduce a new approach called FLORA that enables federated fine-tuning on heterogeneous LoRA adapters.
Our approach is noise-free and seamlessly supports heterogeneous LoRA adapters.
arXiv Detail & Related papers (2024-09-09T18:21:23Z) - LoRA-Pro: Are Low-Rank Adapters Properly Optimized? [121.0693322732454]
Low-rank adaptation, also known as LoRA, has emerged as a prominent method for parameter-efficient fine-tuning of foundation models.
Despite its computational efficiency, LoRA still yields inferior performance compared to full fine-tuning.
We introduce LoRA-Pro, a method that enhances LoRA's performance by strategically adjusting the gradients of low-rank matrices.
arXiv Detail & Related papers (2024-07-25T17:57:12Z) - Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z) - Improving LoRA in Privacy-preserving Federated Learning [44.47315926976059]
Low-rank adaptation (LoRA) is one of the most popular task-specific parameter-efficient fine-tuning (PEFT) methods on pre-trained language models.
This paper proposes an efficient and effective version of LoRA, Federated Freeze A LoRA (FFA-LoRA), to alleviate these challenges.
arXiv Detail & Related papers (2024-03-18T23:20:08Z) - pFedLoRA: Model-Heterogeneous Personalized Federated Learning with LoRA
Tuning [35.59830784463706]
Federated learning (FL) is an emerging machine learning paradigm in which a central server coordinates multiple participants (clients) collaboratively to train on decentralized data.
We propose a novel and efficient model-heterogeneous personalized Federated learning framework based on LoRA tuning (pFedLoRA)
Experiments on two benchmark datasets demonstrate that pFedLoRA outperforms six state-of-the-art baselines.
arXiv Detail & Related papers (2023-10-20T05:24:28Z) - Beyond ADMM: A Unified Client-variance-reduced Adaptive Federated
Learning Framework [82.36466358313025]
We propose a primal-dual FL algorithm, termed FedVRA, that allows one to adaptively control the variance-reduction level and biasness of the global model.
Experiments based on (semi-supervised) image classification tasks demonstrate superiority of FedVRA over the existing schemes.
arXiv Detail & Related papers (2022-12-03T03:27:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.