LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed
Tasks in the Wild
- URL: http://arxiv.org/abs/2402.09997v1
- Date: Thu, 15 Feb 2024 15:02:46 GMT
- Title: LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed
Tasks in the Wild
- Authors: Ziyu Zhao, Leilei Gan, Guoyin Wang, Wangchunshu Zhou, Hongxia Yang,
Kun Kuang, Fei Wu
- Abstract summary: Low-Rank Adaptation (LoRA) provides an efficient solution for fine-tuning large language models (LLMs).
LoraRetriever is a retrieve-then-compose framework that adaptively retrieves and composes multiple LoRAs according to the input prompts.
Experimental results indicate that LoraRetriever consistently outperforms the baselines.
- Score: 76.67343971195267
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Low-Rank Adaptation (LoRA) provides an effective yet efficient solution for
fine-tuning large language models (LLMs). The modular and plug-and-play nature
of LoRA enables the integration of diverse domain-specific LoRAs to enhance the
capabilities of LLMs. Previous research on exploiting multiple LoRAs either
focuses on specific isolated downstream tasks or fixes the selection of LoRAs
during training. However, in real-world scenarios, LLMs receive diverse prompts
covering different tasks, and the pool of candidate LoRAs is often dynamically
updated. To bridge this gap, we propose LoraRetriever, a retrieve-then-compose
framework that adaptively retrieves and composes multiple LoRAs according to
the input prompts. LoraRetriever contains three main components: firstly,
identifying and retrieving LoRAs relevant to the given input; secondly,
formulating strategies for effectively integrating the retrieved LoRAs; and
thirdly, developing efficient batch inference to accommodate heterogeneous
requests. Experimental results indicate that LoraRetriever consistently
outperforms the baselines, highlighting its practical effectiveness and
versatility.
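The abstract names the three components but does not spell out their mechanics, so the following is a minimal, assumption-heavy sketch of the retrieve-then-compose idea: each adapter is indexed by an embedding of a few task samples, retrieval ranks adapters by cosine similarity to the input prompt, and composition here is plain averaging of the retrieved low-rank updates. The `embed` stand-in, the `LoraPool` class, and the averaging strategy are illustrative choices, not the paper's actual retriever, embedding model, or composition method.

```python
import numpy as np

def embed(text, dim=64):
    """Stand-in for a sentence encoder; a real system would use a trained
    embedding model here (this toy version just hashes the text)."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=dim)
    return v / np.linalg.norm(v)

class LoraPool:
    """Registry of LoRA adapters, each indexed by an embedding of its task."""
    def __init__(self):
        self.names, self.embs, self.weights = [], [], []

    def register(self, name, task_samples, delta_w):
        # Index each adapter by the mean embedding of a few task samples.
        emb = np.mean([embed(s) for s in task_samples], axis=0)
        self.names.append(name)
        self.embs.append(emb / np.linalg.norm(emb))
        self.weights.append(delta_w)

    def retrieve(self, prompt, top_k=2):
        # Input-aware retrieval: rank adapters by cosine similarity to the prompt.
        scores = np.stack(self.embs) @ embed(prompt)
        idx = np.argsort(-scores)[:top_k]
        return [(self.names[i], self.weights[i]) for i in idx]

def compose(retrieved, mode="average"):
    # One simple composition strategy: average the retrieved LoRA updates.
    if mode == "average":
        return np.mean([w for _, w in retrieved], axis=0)
    raise ValueError(f"unknown mode: {mode}")

# Usage: register two toy adapters, then retrieve-and-compose for a prompt.
pool = LoraPool()
pool.register("nli", ["Does the premise entail the hypothesis?"], np.full((4, 4), 0.1))
pool.register("sql", ["Translate the question into SQL."], np.full((4, 4), 0.2))
delta = compose(pool.retrieve("Write an SQL query that counts users."))
print(delta.shape)
```

The third component mentioned in the abstract, batched inference over heterogeneous requests, would additionally require routing each sample in a batch to its own retrieved adapters; that part is omitted from this sketch.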
Related papers
- MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning [74.43869839954168]
We propose MTL-LoRA, which retains the advantages of low-rank adaptation while significantly enhancing multi-task learning capabilities.
MTL-LoRA augments LoRA by incorporating additional task-adaptive parameters that differentiate task-specific information.
This approach enables large language models (LLMs) pre-trained on a general corpus to adapt to different target task domains with a limited number of trainable parameters.
arXiv Detail & Related papers (2024-10-12T08:32:26Z)
- Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering [35.54018186415654]
Low-Rank Adaptation (LoRA) has emerged as a popular technique for fine-tuning large language models (LLMs) to various domains.
Existing methods for LoRA composition primarily focus on task-specific adaptations that require additional training.
We introduce the concept of Minimal Semantic Units (MSUs), where the parameters corresponding to each rank in LoRA function as independent units.
We propose the LoRA-LEGO framework, which conducts rank-wise parameter clustering by grouping MSUs from different LoRAs into $k$ clusters.
arXiv Detail & Related papers (2024-09-24T15:08:41Z)
- LoraMap: Harnessing the Power of LoRA Connections [2.890453474800439]
This paper investigates methods to establish connections among multiple Low-Rank Adaptations (LoRAs).
We create three reasoning datasets tailored to fact-checking and fine-tune individual LoRAs.
We introduce LoraMap, an approach to map connections between them.
arXiv Detail & Related papers (2024-08-29T05:02:52Z)
- Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs).
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z)
- MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models [4.978361907192563]
MeteoRA is a scalable and efficient framework that embeds multiple task-specific LoRA adapters into the base LLM for reuse.
MeteoRA achieves superior performance in handling composite tasks, effectively solving ten sequential problems in a single inference pass.
arXiv Detail & Related papers (2024-05-19T20:46:07Z)
- Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z)
- LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks [72.88244322513039]
LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain.
We propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs.
Experiments across six generative tasks demonstrate that our method consistently outperforms baselines that use task-level fusion weights (a minimal sketch of this dynamic-weighting idea appears after this list).
arXiv Detail & Related papers (2024-02-18T04:41:25Z)
- CA-LoRA: Adapting Existing LoRA for Compressed LLMs to Enable Efficient Multi-Tasking on Personal Devices [78.16679232748196]
We introduce a Compression-Aware LoRA (CA-LoRA) framework for transferring compressed Large Language Models (LLMs) to downstream tasks.
Experiment results demonstrate that CA-LoRA outperforms the vanilla LoRA methods applied to a compressed LLM.
The source code of CA-LoRA is available at https://github.com/thunlp/CA-LoRA.
arXiv Detail & Related papers (2023-07-15T04:37:11Z)
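Several of the entries above (LoRA-Flow's dynamic fusion weights, MoLE's gated branch selection) revolve around weighting the outputs of multiple LoRAs per input rather than merging their parameters once. As a rough illustration only, here is a small numpy sketch of that gating idea; the gate's form, its placement, and all shapes are assumptions made for this example and do not reproduce any of the papers' actual architectures.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def gated_lora_fusion(h, loras, gate_w):
    """Weight the outputs of several LoRA modules for a single hidden state h.

    Each LoRA is a pair (A, B) implementing the low-rank update B @ (A @ h);
    the gate produces one fusion weight per LoRA, conditioned on h itself.
    """
    weights = softmax(gate_w @ h)              # shape: (num_loras,)
    updates = [B @ (A @ h) for A, B in loras]  # each has the same shape as h
    return sum(w * u for w, u in zip(weights, updates))

# Toy usage: three rank-4 LoRAs acting on a 16-dimensional hidden state.
rng = np.random.default_rng(0)
h = rng.normal(size=16)
loras = [(rng.normal(size=(4, 16)), rng.normal(size=(16, 4))) for _ in range(3)]
gate_w = rng.normal(size=(3, 16))              # would be learned in practice
print(gated_lora_fusion(h, loras, gate_w).shape)  # -> (16,)
```

By contrast, parameter-level approaches such as LoRA-LEGO merge the adapters themselves (e.g., by clustering rank-wise units), so no per-input gate is needed at inference time.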
This list is automatically generated from the titles and abstracts of the papers in this site.