LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed
Tasks in the Wild
- URL: http://arxiv.org/abs/2402.09997v1
- Date: Thu, 15 Feb 2024 15:02:46 GMT
- Title: LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed
Tasks in the Wild
- Authors: Ziyu Zhao, Leilei Gan, Guoyin Wang, Wangchunshu Zhou, Hongxia Yang,
Kun Kuang, Fei Wu
- Abstract summary: Low-Rank Adaptation (LoRA) provides an efficient solution for fine-tuning large language models (LLMs).
LoraRetriever is a retrieve-then-compose framework that adaptively retrieves and composes multiple LoRAs according to the input prompts.
Experimental results indicate that LoraRetriever consistently outperforms the baselines.
- Score: 76.67343971195267
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Low-Rank Adaptation (LoRA) provides an effective yet efficient solution for
fine-tuning large language models (LLMs). The modular and plug-and-play nature
of LoRA enables the integration of diverse domain-specific LoRAs to enhance the
capabilities of LLMs. Previous research on exploiting multiple LoRAs either
focuses on specific isolated downstream tasks or fixes the selection of LoRAs
during training. However, in real-world scenarios, LLMs receive diverse prompts
covering different tasks, and the pool of candidate LoRAs is often dynamically
updated. To bridge this gap, we propose LoraRetriever, a retrieve-then-compose
framework that adaptively retrieves and composes multiple LoRAs according to
the input prompts. LoraRetriever contains three main components: firstly,
identifying and retrieving LoRAs relevant to the given input; secondly,
formulating strategies for effectively integrating the retrieved LoRAs; and
thirdly, developing efficient batch inference to accommodate heterogeneous
requests. Experimental results indicate that LoraRetriever consistently
outperforms the baselines, highlighting its practical effectiveness and
versatility.
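The abstract above outlines three steps: retrieving LoRAs relevant to an input, composing the retrieved LoRAs, and batching heterogeneous requests. Below is a minimal sketch of the first two steps, assuming embedding-similarity retrieval and simple averaging of the low-rank updates; the class and function names, shapes, and the averaging rule are illustrative assumptions, not the authors' implementation.

```python
# Sketch: input-aware LoRA retrieval and composition (illustrative only).
import numpy as np

class LoRAEntry:
    def __init__(self, name, A, B, embedding):
        self.name = name            # adapter identifier
        self.A = A                  # (r, d_in) low-rank factor
        self.B = B                  # (d_out, r) low-rank factor
        self.embedding = embedding  # vector describing the adapter's task/domain

def retrieve(input_emb, pool, top_k=2):
    """Rank adapters by cosine similarity between input and adapter embeddings."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
    return sorted(pool, key=lambda e: cos(input_emb, e.embedding), reverse=True)[:top_k]

def compose(retrieved):
    """Average the low-rank updates Delta W = B @ A of the retrieved adapters."""
    return sum(e.B @ e.A for e in retrieved) / len(retrieved)

# Toy usage: a dynamically updated pool of adapters and one incoming request.
rng = np.random.default_rng(0)
d_in, d_out, r, d_emb = 16, 16, 4, 8
pool = [LoRAEntry(f"task_{i}",
                  rng.normal(size=(r, d_in)),
                  rng.normal(size=(d_out, r)),
                  rng.normal(size=d_emb)) for i in range(5)]
input_emb = pool[3].embedding + 0.1 * rng.normal(size=d_emb)  # input close to task_3
delta_W = compose(retrieve(input_emb, pool, top_k=2))          # applied as W + delta_W
print(delta_W.shape)  # (16, 16)
```

Batched inference over heterogeneous requests would additionally keep per-example adapter assignments so that each row of a batch only activates its own retrieved LoRAs.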
Related papers
- BeamLoRA: Beam-Constraint Low-Rank Adaptation [51.52097743781401]
Low-Rank Adaptation (LoRA) has been widely adopted as one of the most effective parameter-efficient fine-tuning methods.
We propose BeamLoRA, which conceptualizes each LoRA module as a beam where each rank naturally corresponds to a potential sub-solution.
arXiv Detail & Related papers (2025-02-19T10:33:22Z)
- Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning [53.98941571078398]
Low-Rank Adaptation (LoRA) is widely used for adapting large language models (LLMs) to specific domains due to its efficiency and modularity.
Recent works adopt Mixture of Experts (MoE) by treating each LoRA module as an expert, thereby mitigating task interference through multiple specialized LoRA modules.
While effective, these methods often isolate knowledge within individual tasks, failing to fully exploit the shared knowledge across related tasks.
We propose Single-ranked Mixture of Experts LoRA (SMoRA), which embeds MoE into LoRA by treating each rank as an independent expert.
arXiv Detail & Related papers (2025-01-25T06:56:39Z)
- Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering [35.54018186415654]
Low-Rank Adaptation (LoRA) has emerged as a popular technique for fine-tuning large language models (LLMs) to various domains.
Existing methods for LoRA composition primarily focus on task-specific adaptations that require additional training.
We introduce the concept of Minimal Semantic Units (MSUs), where the parameters corresponding to each rank in LoRA function as independent units.
We propose the LoRA-LEGO framework, which conducts rank-wise parameter clustering by grouping MSUs from different LoRAs into k clusters; a minimal clustering sketch follows this entry.
arXiv Detail & Related papers (2024-09-24T15:08:41Z)
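The LoRA-LEGO entry above treats each rank of a LoRA (one row of A paired with one column of B) as a Minimal Semantic Unit and merges adapters by clustering these units. Below is a minimal sketch of that idea using k-means from scikit-learn; the feature construction, the averaging rule, and all names are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: rank-wise clustering of LoRA units into a merged rank-k adapter.
import numpy as np
from sklearn.cluster import KMeans

def rank_units(A, B):
    """Split a LoRA (A: r x d_in, B: d_out x r) into its rank-1 units."""
    return [(A[i], B[:, i]) for i in range(A.shape[0])]

def merge_by_clustering(loras, k):
    units = [u for A, B in loras for u in rank_units(A, B)]
    feats = np.stack([np.concatenate([a, b]) for a, b in units])  # one vector per unit
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(feats)
    d_in, d_out = loras[0][0].shape[1], loras[0][1].shape[0]
    A_new, B_new = np.zeros((k, d_in)), np.zeros((d_out, k))
    for c in range(k):
        members = [units[i] for i in range(len(units)) if labels[i] == c]
        A_new[c] = np.mean([a for a, _ in members], axis=0)     # averaged A rows in cluster c
        B_new[:, c] = np.mean([b for _, b in members], axis=0)  # averaged B columns in cluster c
    return A_new, B_new  # merged rank-k adapter

rng = np.random.default_rng(1)
loras = [(rng.normal(size=(4, 16)), rng.normal(size=(16, 4))) for _ in range(3)]
A_m, B_m = merge_by_clustering(loras, k=4)
print(A_m.shape, B_m.shape)  # (4, 16) (16, 4)
```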
- Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs).
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z)
- MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models [4.978361907192563]
MeteoRA is a scalable and efficient framework that embeds multiple reusable task-specific LoRA adapters into the base LLM.
MeteoRA achieves superior performance in handling composite tasks, effectively solving ten sequential problems in a single inference pass.
arXiv Detail & Related papers (2024-05-19T20:46:07Z)
- Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z)
- LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks [72.88244322513039]
LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain.
We propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs.
Experiments across six generative tasks demonstrate that our method consistently outperforms baselines with task-level fusion weights.
arXiv Detail & Related papers (2024-02-18T04:41:25Z)
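Several entries above (LoRA-Flow, MoLE) fuse multiple LoRAs with input-dependent weights rather than a fixed average. The sketch below illustrates that general idea with a simple softmax gate over per-adapter outputs; the gate, shapes, and names are illustrative assumptions and do not reproduce any specific paper's architecture.

```python
# Sketch: gated fusion of multiple LoRAs with input-dependent weights.
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def gated_lora_forward(x, W, loras, gate_W):
    """x: (d_in,) input, W: (d_out, d_in) frozen weight, loras: list of (A, B),
    gate_W: (n_loras, d_in) linear gate producing one weight per adapter."""
    weights = softmax(gate_W @ x)  # fusion weights conditioned on this input
    lora_out = sum(w * (B @ (A @ x)) for w, (A, B) in zip(weights, loras))
    return W @ x + lora_out

rng = np.random.default_rng(2)
d_in, d_out, r, n = 16, 16, 4, 3
W = rng.normal(size=(d_out, d_in))
loras = [(rng.normal(size=(r, d_in)), rng.normal(size=(d_out, r))) for _ in range(n)]
gate_W = rng.normal(size=(n, d_in))
y = gated_lora_forward(rng.normal(size=d_in), W, loras, gate_W)
print(y.shape)  # (16,)
```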