Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning
- URL: http://arxiv.org/abs/2506.05568v1
- Date: Thu, 05 Jun 2025 20:28:02 GMT
- Title: Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning
- Authors: Arian Raje, Baris Askin, Divyansh Jhunjhunwala, Gauri Joshi
- Abstract summary: We present Ravan, an adaptive multi-head LoRA method that balances parameter efficiency and model expressivity. Experiments on vision and language benchmarks show that Ravan improves test accuracy by 2-8% over prior parameter-efficient baselines.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) have not yet effectively leveraged the vast amounts of edge-device data, and federated learning (FL) offers a promising paradigm to collaboratively fine-tune LLMs without transferring private edge data to the cloud. To operate within the computation and communication constraints of edge devices, recent literature on federated fine-tuning of LLMs proposes the use of low-rank adaptation (LoRA) and similar parameter-efficient methods. However, LoRA-based methods suffer from accuracy degradation in FL settings, primarily because of data and computational heterogeneity across clients. We propose \textsc{Ravan}, an adaptive multi-head LoRA method that balances parameter efficiency and model expressivity by reparameterizing the weight updates as the sum of multiple LoRA heads $s_i\textbf{B}_i\textbf{H}_i\textbf{A}_i$ in which only the core matrices $\textbf{H}_i$ and their lightweight scaling factors $s_i$ are trained. These trainable scaling factors let the optimization focus on the most useful heads, recovering a higher-rank approximation of the full update without increasing the number of communicated parameters since clients upload $s_i\textbf{H}_i$ directly. Experiments on vision and language benchmarks show that \textsc{Ravan} improves test accuracy by 2-8\% over prior parameter-efficient baselines, making it a robust and scalable solution for federated fine-tuning of LLMs.
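The update above is concrete enough to sketch. Below is a minimal PyTorch illustration of a linear layer with Ravan-style heads, assuming frozen random outer factors, zero-initialized cores, and unit initial scales; the head count and rank are arbitrary choices, and this is not the authors' released implementation.

```python
import torch
import torch.nn as nn

class RavanLinear(nn.Module):
    """Sketch of a frozen linear layer whose update is sum_i s_i * B_i H_i A_i;
    only the r x r cores H_i and the scales s_i are trained (and communicated)."""
    def __init__(self, base: nn.Linear, num_heads: int = 4, rank: int = 8):
        super().__init__()
        self.base = base.requires_grad_(False)  # backbone stays frozen
        d_out, d_in = base.out_features, base.in_features
        # Frozen outer factors B_i (d_out x r) and A_i (r x d_in).
        self.B = nn.ParameterList(
            [nn.Parameter(torch.randn(d_out, rank) / rank ** 0.5, requires_grad=False)
             for _ in range(num_heads)])
        self.A = nn.ParameterList(
            [nn.Parameter(torch.randn(rank, d_in) / rank ** 0.5, requires_grad=False)
             for _ in range(num_heads)])
        # Trainable cores, zero-initialized so training starts from the base model,
        # plus per-head scales that let optimization favor the most useful heads.
        self.H = nn.ParameterList(
            [nn.Parameter(torch.zeros(rank, rank)) for _ in range(num_heads)])
        self.s = nn.Parameter(torch.ones(num_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)
        for i, H in enumerate(self.H):
            # (x A_i^T) H_i^T B_i^T == x (B_i H_i A_i)^T, the head-i update
            out = out + self.s[i] * (x @ self.A[i].T @ H.T @ self.B[i].T)
        return out
```

In a federated round, a client holding such a layer would upload only the trained $s_i$ and $\textbf{H}_i$, i.e. num_heads x (r^2 + 1) scalars per layer, which is the communication saving the abstract describes.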
Related papers
- Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights [75.83625828306839]
Drag-and-Drop LLMs (DnD) eliminates per-task training by mapping a handful of unlabeled task prompts directly to LoRA weight updates. A lightweight text encoder distills each prompt batch into condition embeddings, which are then transformed by a cascaded hyper-convolutional decoder into the full set of LoRA matrices.
arXiv Detail & Related papers (2025-06-19T15:38:21Z)
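As a rough illustration of the prompt-to-weights idea, here is a generic hypernetwork sketch in PyTorch that decodes a single condition embedding into one LoRA pair; the shapes, the two-layer MLP decoder, and the module name are hypothetical stand-ins for the paper's cascaded hyper-convolutional decoder.

```python
import torch
import torch.nn as nn

class PromptToLoRA(nn.Module):
    """Hypernetwork sketch: map a prompt-derived condition embedding to one
    LoRA pair (B, A), so a new task needs no gradient-based training."""
    def __init__(self, embed_dim: int, d_out: int, d_in: int, rank: int = 8):
        super().__init__()
        self.rank, self.d_out, self.d_in = rank, d_out, d_in
        self.decoder = nn.Sequential(          # stand-in for the paper's decoder
            nn.Linear(embed_dim, 512), nn.GELU(),
            nn.Linear(512, rank * (d_out + d_in)))

    def forward(self, cond: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # cond: (embed_dim,) embedding distilled from a batch of task prompts
        flat = self.decoder(cond)
        B = flat[: self.rank * self.d_out].view(self.d_out, self.rank)
        A = flat[self.rank * self.d_out:].view(self.rank, self.d_in)
        return B, A  # zero-shot adapter weights for one target layer
```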
- WeightLoRA: Keep Only Necessary Adapters [79.89637596855]
Low-rank adaptation (LoRA) adds trainable adapters to selected layers. We propose a novel method, WeightLoRA, which adaptively selects only the most critical LoRA heads. We conduct experiments on a series of competitive benchmarks with DeBERTa, BART, and Llama models, comparing our method with different adaptive approaches.
arXiv Detail & Related papers (2025-06-03T10:33:16Z)
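The abstract does not spell out the selection criterion, so the following sketch scores each LoRA head by the Frobenius norm of its update and keeps the top-k; both the criterion and k are assumptions for illustration.

```python
import torch

def keep_top_lora_heads(adapters: dict[str, tuple[torch.Tensor, torch.Tensor]],
                        k: int) -> dict[str, tuple[torch.Tensor, torch.Tensor]]:
    """Keep the k adapters (name -> (B, A)) whose update B @ A has the largest
    Frobenius norm; the remaining heads would be dropped or frozen."""
    scores = {name: torch.linalg.matrix_norm(B @ A).item()  # ||delta W||_F
              for name, (B, A) in adapters.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return {name: adapters[name] for name in top}
```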
- A Sensitivity-Driven Expert Allocation Method in LoRA-MoE for Efficient Fine-Tuning [0.6906005491572401]
We propose LoRA-SMoE, a method that allocates expert numbers based on parameter sensitivity. Experimental results demonstrate that our LoRA-SMoE approach can enhance model performance while reducing the number of trainable parameters.
arXiv Detail & Related papers (2025-05-06T13:22:46Z)
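A sketch of sensitivity-driven budgeting under a simple proportional rule; the first-order saliency proxy and the rounding are assumptions, not the paper's exact allocation scheme.

```python
import torch
import torch.nn as nn

def first_order_sensitivity(model: nn.Module) -> dict[str, float]:
    """Accumulate |grad * param| per tensor after a backward pass,
    a common first-order saliency proxy."""
    return {name: (p.grad * p).abs().sum().item()
            for name, p in model.named_parameters() if p.grad is not None}

def allocate_experts(sensitivity: dict[str, float], budget: int,
                     floor: int = 1) -> dict[str, int]:
    """Split a global expert budget across layers proportionally to sensitivity,
    guaranteeing at least `floor` experts per layer."""
    total = sum(sensitivity.values())
    return {name: max(floor, round(budget * s / total))
            for name, s in sensitivity.items()}
```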
- HSplitLoRA: A Heterogeneous Split Parameter-Efficient Fine-Tuning Framework for Large Language Models [30.345920952847752]
Large language models (LLMs) have achieved remarkable breakthroughs, revolutionizing the natural language processing domain and beyond. Fine-tuning these models with private data for diverse downstream tasks has become mainstream, but their immense parameter sizes make this costly on client devices. We propose HSplitLoRA, a framework built on split learning (SL) and low-rank adaptation (LoRA) fine-tuning, for efficiently fine-tuning LLMs on heterogeneous client devices.
arXiv Detail & Related papers (2025-05-05T17:09:19Z)
- PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning [54.99373314906667]
Self-supervised representation learning for point clouds has demonstrated effectiveness in improving pre-trained model performance across diverse tasks. As pre-trained models grow in complexity, fully fine-tuning them for downstream applications demands substantial computational and storage resources. We propose PointLoRA, a simple yet effective method that combines low-rank adaptation (LoRA) with multi-scale token selection to efficiently fine-tune point cloud models.
arXiv Detail & Related papers (2025-04-22T16:41:21Z)
- Communication-Efficient and Personalized Federated Foundation Model Fine-Tuning via Tri-Matrix Adaptation [47.82423317739088]
This paper introduces communication-efficient federated LoRA adaptation (CE-LoRA), a method that employs a tri-factorization low-rank adaptation approach with personalized model parameter aggregation. Experiments on various LLM and VLM fine-tuning tasks demonstrate that CE-LoRA not only significantly reduces communication overhead but also improves performance under non-independent and identically distributed (non-IID) data conditions.
arXiv Detail & Related papers (2025-03-31T09:18:42Z)
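The abstract mentions tri-factorized updates (three matrices per adapter) with personalized aggregation; the sketch below assumes the two outer factors are averaged at the server while each client's small core stays local. That split is a guess for illustration, not the paper's specification.

```python
import torch

def aggregate_tri_lora(client_factors: list[dict[str, torch.Tensor]]) -> dict:
    """Federated averaging of the shared outer factors of delta W = B @ C @ A,
    with the per-client cores C_k kept on-device for personalization."""
    B = torch.stack([f["B"] for f in client_factors]).mean(dim=0)
    A = torch.stack([f["A"] for f in client_factors]).mean(dim=0)
    return {"B": B, "A": A}  # broadcast back; each client keeps its own C_k
```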
- MSPLoRA: A Multi-Scale Pyramid Low-Rank Adaptation for Efficient Model Fine-Tuning [5.412348391086257]
We propose MSPLoRA, which introduces Global Shared LoRA, Mid-Level Shared LoRA, and Layer-Specific LoRA to capture global patterns, mid-level features, and fine-grained information. Experiments on various NLP tasks demonstrate that MSPLoRA achieves more efficient adaptation and better performance while significantly reducing the number of trainable parameters.
arXiv Detail & Related papers (2025-03-27T07:01:50Z)
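A sketch of the three-level pyramid: one adapter shared by every layer, one shared within a layer group, and one private per layer. The decreasing ranks and the way shared adapters are passed in are illustrative assumptions.

```python
import torch
import torch.nn as nn

def lora_pair(d_out: int, d_in: int, rank: int) -> nn.ParameterDict:
    """One adapter delta W = B @ A; B is zero-initialized so delta W starts at 0."""
    return nn.ParameterDict({
        "A": nn.Parameter(torch.randn(rank, d_in) / rank ** 0.5),
        "B": nn.Parameter(torch.zeros(d_out, rank))})

class PyramidLoRALinear(nn.Module):
    """Sums a globally shared, a group-shared, and a layer-private adapter;
    the shared adapters are constructed once and handed to every layer."""
    def __init__(self, base: nn.Linear, shared_global: nn.ParameterDict,
                 shared_mid: nn.ParameterDict, local_rank: int = 2):
        super().__init__()
        self.base = base.requires_grad_(False)
        self.levels = nn.ModuleList([
            shared_global, shared_mid,
            lora_pair(base.out_features, base.in_features, local_rank)])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.base(x)
        for ad in self.levels:
            out = out + x @ ad["A"].T @ ad["B"].T
        return out
```

Because every layer reuses `shared_global` and each group reuses `shared_mid`, the trainable parameter count shrinks relative to independent per-layer adapters of equal total rank.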
- LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement [5.162783756846019]
Foundation models (FMs) achieve strong performance across diverse tasks with task-specific fine-tuning, but full fine-tuning is costly. Parameter-efficient methods such as Low-Rank Adaptation (LoRA) reduce this cost by introducing low-rank matrices that tune fewer parameters. LoRA-FAIR maintains computational and communication efficiency, yielding superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2024-11-22T14:19:01Z)
- HAFLQ: Heterogeneous Adaptive Federated LoRA Fine-tuned LLM with Quantization [55.972018549438964]
Federated fine-tuning of pre-trained Large Language Models (LLMs) enables task-specific adaptation across diverse datasets while preserving privacy. We propose HAFLQ (Heterogeneous Adaptive Federated Low-Rank Adaptation Fine-tuned LLM with Quantization), a novel framework for efficient and scalable fine-tuning of LLMs in heterogeneous environments. Experimental results on the text classification task demonstrate that HAFLQ reduces memory usage by 31%, lowers communication cost by 49%, improves accuracy by 50%, and achieves faster convergence compared to the baseline method.
arXiv Detail & Related papers (2024-11-10T19:59:54Z)
- Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs [75.11449420928139]
Fine-tuning Large Language Models (LLMs) has become a crucial technique for adapting pre-trained models to downstream tasks.
Low-Rank Adaptation (LoRA) has emerged as a promising solution, but there exists a gap between the practical performance of low-rank adaptation and its theoretical optimum.
We propose eXtreme Gradient Boosting LoRA, a novel framework that bridges this gap by leveraging the power of ensemble learning.
arXiv Detail & Related papers (2024-10-25T17:07:13Z)
- Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning [31.036465632204663]
We introduce Chain of LoRA (COLA), an iterative optimization framework inspired by the Frank-Wolfe algorithm.
We demonstrate that COLA can consistently outperform LoRA without additional computational or memory costs.
arXiv Detail & Related papers (2024-01-08T14:26:49Z)
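A sketch of the residual chain for a single weight matrix: each link trains a fresh low-rank pair against the weights produced so far, then folds it in, so later links fit what earlier links left unexplained. The `train_steps` callback stands in for a few epochs of tuning, and this keeps only the train-merge-reinitialize loop, not the Frank-Wolfe analysis.

```python
import torch

def chain_of_lora(W: torch.Tensor, links: int, rank: int, train_steps) -> None:
    """Iteratively train and merge LoRA pairs into a frozen weight W in place."""
    d_out, d_in = W.shape
    for _ in range(links):
        B = torch.zeros(d_out, rank, requires_grad=True)  # zero-init: delta W = 0
        A = torch.randn(rank, d_in, requires_grad=True)
        train_steps(W, B, A)           # optimize only (B, A) on the task loss
        with torch.no_grad():
            W += B @ A                 # merge the link, then start the next one
```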
- Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes [53.4856038354195]
Pre-trained large language models (LLMs) need fine-tuning to improve their responsiveness to natural language instructions.
FedKSeed employs zeroth-order optimization with a finite set of random seeds.
It significantly reduces transmission requirements between the server and clients to just a few random seeds and scalar gradients.
arXiv Detail & Related papers (2023-12-11T13:03:21Z)
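A sketch of the seed trick with a two-point zeroth-order estimate: perturbations are regenerated from a shared seed pool, so a client only uploads (seed, scalar) pairs. The pool size, eps, and learning rate are illustrative; FedKSeed's full seed-sampling and accumulation scheme is more involved.

```python
import torch

SEED_POOL = list(range(4096))  # finite candidate seeds shared by all parties

def zo_grad(params: torch.Tensor, loss_fn, seed: int, eps: float = 1e-3):
    """Two-point zeroth-order estimate along a seed-regenerated direction z;
    only (seed, scalar) ever needs to leave the device."""
    z = torch.randn(params.shape, generator=torch.Generator().manual_seed(seed))
    with torch.no_grad():
        g = (loss_fn(params + eps * z) - loss_fn(params - eps * z)) / (2 * eps)
    return seed, float(g)

def apply_update(params: torch.Tensor, seed: int, g: float, lr: float = 1e-4):
    """Any party replays the seed to rebuild z and take the identical step."""
    z = torch.randn(params.shape, generator=torch.Generator().manual_seed(seed))
    with torch.no_grad():
        params -= lr * g * z
```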
This list is automatically generated from the titles and abstracts of the papers on this site.