LoRA on the Go: Instance-level Dynamic LoRA Selection and Merging
- URL: http://arxiv.org/abs/2511.07129v1
- Date: Mon, 10 Nov 2025 14:13:10 GMT
- Title: LoRA on the Go: Instance-level Dynamic LoRA Selection and Merging
- Authors: Seungeon Lee, Soumi Das, Manish Gupta, Krishna P. Gummadi,
- Abstract summary: Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient approach for fine-tuning large language models. LoGo is a training-free framework that dynamically selects and merges adapters at the instance level without any additional requirements. LoGo outperforms training-based baselines on some tasks by up to 3.6% while remaining competitive on other tasks.
- Score: 9.68092924064735
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Low-Rank Adaptation (LoRA) has emerged as a parameter-efficient approach for fine-tuning large language models. However, conventional LoRA adapters are typically trained for a single task, limiting their applicability in real-world settings where inputs may span diverse and unpredictable domains. At inference time, existing approaches combine multiple LoRAs to improve performance on diverse tasks, but they usually require labeled data or additional task-specific training, which is expensive at scale. In this work, we introduce LoRA on the Go (LoGo), a training-free framework that dynamically selects and merges adapters at the instance level without any additional requirements. LoGo leverages signals extracted from a single forward pass through LoRA adapters to identify the most relevant adapters and determine their contributions on-the-fly. Across 5 NLP benchmarks, 27 datasets, and 3 model families, LoGo outperforms training-based baselines on some tasks by up to 3.6% while remaining competitive on other tasks and maintaining inference throughput, highlighting its effectiveness and practicality.
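The abstract describes the mechanism only at a high level: per-adapter signals from a single forward pass are used to pick the most relevant adapters and weight their contributions for each input. The sketch below is a minimal illustration of that instance-level select-and-merge loop; the concrete choice of signal (here, the norm of each adapter's low-rank update on the instance representation), the top-k selection, and the softmax merging weights are assumptions for illustration, not the paper's actual formulation.

```python
# Minimal sketch of instance-level dynamic LoRA selection and merging.
# Assumptions (not from the paper): the relevance "signal" is the norm of each
# adapter's low-rank update B @ A @ x on the instance representation x, and the
# merged update is a convex combination of the top-k adapter deltas.
import numpy as np

def logo_style_merge(x, adapters, k=2):
    """Select and merge LoRA adapters for a single instance.

    x        : (d,) input representation from one forward pass
    adapters : list of (A, B) pairs with A of shape (r, d) and B of shape (d, r)
    k        : number of adapters to keep for this instance
    Returns the merged weight update delta_W of shape (d, d).
    """
    # Signal: magnitude of each adapter's contribution on this instance.
    scores = np.array([np.linalg.norm(B @ (A @ x)) for A, B in adapters])

    # Keep the k most relevant adapters for this input.
    top = np.argsort(scores)[::-1][:k]

    # Turn their scores into merging weights (softmax over the selected set).
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()

    # Instance-level merge: weighted sum of the selected low-rank updates.
    delta_W = sum(wi * (adapters[i][1] @ adapters[i][0]) for wi, i in zip(w, top))
    return delta_W

# Toy usage: 3 adapters of rank 4 on a 16-dimensional representation.
rng = np.random.default_rng(0)
d, r = 16, 4
adapters = [(rng.normal(size=(r, d)), rng.normal(size=(d, r))) for _ in range(3)]
x = rng.normal(size=d)
print(logo_style_merge(x, adapters, k=2).shape)  # (16, 16)
```

In a real serving stack the merge would be applied per LoRA-adapted layer rather than as a single matrix, and the signal would be read from the model's own activations, but the per-instance selection and weighting loop has the same shape.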
Related papers
- tLoRA: Efficient Multi-LoRA Training with Elastic Shared Super-Models [8.42285475305854]
tLoRA is a framework that enables efficient batch training of multiple LoRA jobs. Evaluations using real-world cluster traces demonstrate that tLoRA improves training by 1.2--1.8x, job training completion time by 2.3--5.4x, and GPU utilization by 37%.
arXiv Detail & Related papers (2026-02-06T23:26:02Z)
- SEQR: Secure and Efficient QR-based LoRA Routing [53.52716967527183]
Low-Rank Adaptation (LoRA) has become a standard technique for parameter-efficient fine-tuning of large language models. Efficiently selecting the correct LoRA adapter for a given input remains a challenge. We introduce SEQR, an unsupervised LoRA routing algorithm designed to maximize efficiency while providing strict routing guarantees.
arXiv Detail & Related papers (2025-09-22T17:59:38Z)
- In-Context Meta LoRA Generation [61.690065588534296]
Low-rank Adaptation (LoRA) has demonstrated remarkable capabilities for task-specific fine-tuning. We propose In-Context Meta LoRA (ICM-LoRA), a novel approach that efficiently achieves task-specific customization of large language models. ICM-LoRA enables more accurate LoRA parameter reconstruction than current parameter reconstruction methods.
arXiv Detail & Related papers (2025-01-29T13:12:01Z)
- Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs [76.40876036912537]
Large Language Models (LLMs) demonstrate strong few-shot adaptability without requiring fine-tuning. Current Visual Foundation Models (VFMs) require explicit fine-tuning with sufficient tuning data. We propose a framework, LoRA Recycle, that distills a meta-LoRA from diverse pre-tuned LoRAs with a meta-learning objective.
arXiv Detail & Related papers (2024-12-03T07:25:30Z)
- DLP-LoRA: Efficient Task-Specific LoRA Fusion with a Dynamic, Lightweight Plugin for Large Language Models [10.179598253424103]
Large Language Models (LLMs) have achieved robust performance across diverse tasks, but fine-tuning these models for specific domains remains resource-intensive. We propose a mini-MLP module with only 5M parameters to dynamically fuse multiple LoRAs at the sentence level using top-p sampling strategies. This approach reduces inference time to less than twice that of single-LoRA inference by leveraging parallel computation.
arXiv Detail & Related papers (2024-10-02T12:45:52Z)
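The DLP-LoRA summary above names two concrete ingredients: a small gating MLP and top-p (nucleus-style) selection over candidate adapters at the sentence level. A minimal sketch of such a gate follows; the two-layer MLP shape, the pooled sentence embedding, and the renormalized fusion weights are assumptions for illustration, only the mini-MLP-plus-top-p idea comes from the summary.

```python
# Sketch of sentence-level LoRA fusion gated by a small MLP with top-p selection.
# The gate architecture and fusion rule here are illustrative assumptions.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def top_p_fusion_weights(sentence_emb, W1, b1, W2, b2, p=0.9):
    """Return per-adapter fusion weights for one sentence.

    sentence_emb   : (d,) pooled sentence embedding
    W1, b1, W2, b2 : parameters of a 2-layer gating MLP (hypothetical shapes)
    p              : cumulative-probability threshold (top-p / nucleus)
    """
    h = np.maximum(0.0, W1 @ sentence_emb + b1)       # hidden layer (ReLU)
    probs = softmax(W2 @ h + b2)                      # distribution over adapters

    # Top-p truncation: keep the smallest set of adapters whose mass >= p.
    order = np.argsort(probs)[::-1]
    keep = order[: np.searchsorted(np.cumsum(probs[order]), p) + 1]

    weights = np.zeros_like(probs)
    weights[keep] = probs[keep] / probs[keep].sum()   # renormalize over kept adapters
    return weights                                    # zero weight = adapter skipped

# Toy usage: 8 candidate adapters, 32-dim sentence embedding, 64 hidden units.
rng = np.random.default_rng(1)
d, h_dim, n_adapters = 32, 64, 8
w = top_p_fusion_weights(rng.normal(size=d),
                         rng.normal(size=(h_dim, d)), np.zeros(h_dim),
                         rng.normal(size=(n_adapters, h_dim)), np.zeros(n_adapters))
print(np.nonzero(w)[0], w.sum())  # selected adapter ids; kept weights sum to 1
```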
- Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs).
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z)
- MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models [4.978361907192563]
MeteoRA is a scalable and efficient framework that embeds multiple task-specific LoRA adapters into the base LLM.
MeteoRA achieves superior performance in handling composite tasks, effectively solving ten sequential problems in a single inference pass.
arXiv Detail & Related papers (2024-05-19T20:46:07Z)
- LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks [72.88244322513039]
LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain.
We propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs.
Experiments across six generative tasks demonstrate that our method consistently outperforms baselines with task-level fusion weights.
arXiv Detail & Related papers (2024-02-18T04:41:25Z)
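The LoRA-Flow summary contrasts dynamic fusion weights with fixed task-level weights. A hedged sketch of what per-step dynamic fusion can look like is given below: a small learned gate maps the current hidden state to fusion weights over the available LoRA outputs at each generation step. The gate form, its placement, and the softmax weighting are assumptions, not the paper's exact design.

```python
# Sketch of dynamic (per-step) LoRA fusion: a learned gate maps the current hidden
# state to fusion weights over several LoRA outputs, instead of one fixed
# task-level weight vector. The gate shape and placement are illustrative assumptions.
import numpy as np

def dynamic_lora_output(h, adapters, gate_W, gate_b):
    """Compute the fused LoRA contribution for one hidden state.

    h        : (d,) hidden state at the current generation step
    adapters : list of (A, B) pairs, A of shape (r, d) and B of shape (d_out, r)
    gate_W   : (n_adapters, d) gate projection; gate_b : (n_adapters,)
    """
    logits = gate_W @ h + gate_b
    logits = logits - logits.max()
    weights = np.exp(logits) / np.exp(logits).sum()   # per-step fusion weights

    # Weighted sum of each adapter's low-rank output for this step.
    outputs = [B @ (A @ h) for A, B in adapters]
    return sum(w * o for w, o in zip(weights, outputs)), weights

# Toy usage: the fusion weights change as the hidden state changes per step.
rng = np.random.default_rng(2)
d, d_out, r = 16, 16, 4
adapters = [(rng.normal(size=(r, d)), rng.normal(size=(d_out, r))) for _ in range(3)]
gate_W, gate_b = rng.normal(size=(3, d)), np.zeros(3)
for step in range(2):
    _, w = dynamic_lora_output(rng.normal(size=d), adapters, gate_W, gate_b)
    print(f"step {step}: fusion weights = {np.round(w, 3)}")
```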
- LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild [76.67343971195267]
Low-Rank Adaptation (LoRA) provides an efficient solution for fine-tuning large language models (LLMs).
LoraRetriever is a retrieve-then-compose framework that adaptively retrieves and composes multiple LoRAs according to the input prompts.
Experimental results indicate that LoraRetriever consistently outperforms the baselines.
arXiv Detail & Related papers (2024-02-15T15:02:46Z)
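Both LoraRetriever and the retrieval-augmented mixture framework above follow a retrieve-then-compose pattern: embed the input prompt, retrieve the most relevant adapters, then compose them. The sketch below illustrates that pattern; the cosine-similarity retrieval against per-adapter task embeddings and the uniform averaging used for composition are assumptions for illustration, not either paper's exact procedure.

```python
# Sketch of an input-aware retrieve-then-compose flow: embed the prompt, retrieve
# the most similar adapters by cosine similarity against per-adapter task
# embeddings, then average their low-rank updates. Embedding source and the
# uniform-averaging composition rule are illustrative assumptions.
import numpy as np

def retrieve_then_compose(prompt_emb, task_embs, adapters, top_k=2):
    """prompt_emb : (d,) embedding of the input prompt
    task_embs  : (n, d) one embedding per registered LoRA adapter
    adapters   : list of (A, B) pairs, A of shape (r, d_model), B of shape (d_model, r)
    Returns (retrieved indices, composed delta_W)."""
    # Cosine similarity between the prompt and each adapter's task embedding.
    sims = (task_embs @ prompt_emb) / (
        np.linalg.norm(task_embs, axis=1) * np.linalg.norm(prompt_emb) + 1e-8
    )
    retrieved = np.argsort(sims)[::-1][:top_k]

    # Compose: simple average of the retrieved adapters' updates.
    delta_W = sum(adapters[i][1] @ adapters[i][0] for i in retrieved) / top_k
    return retrieved, delta_W

# Toy usage with 4 registered adapters.
rng = np.random.default_rng(3)
d_emb, d_model, r = 32, 16, 4
task_embs = rng.normal(size=(4, d_emb))
adapters = [(rng.normal(size=(r, d_model)), rng.normal(size=(d_model, r))) for _ in range(4)]
idx, delta = retrieve_then_compose(rng.normal(size=d_emb), task_embs, adapters)
print(idx, delta.shape)  # retrieved adapter ids, composed update of shape (16, 16)
```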