LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
- URL: http://arxiv.org/abs/2307.13269v2
- Date: Thu, 18 Jan 2024 06:53:39 GMT
- Title: LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition
- Authors: Chengsong Huang, Qian Liu, Bill Yuchen Lin, Tianyu Pang, Chao Du, Min Lin
- Abstract summary: Low-rank adaptations (LoRA) are often employed to fine-tune large language models (LLMs) for new tasks.
This paper introduces LoraHub, a framework devised for the purposive assembly of LoRA modules trained on diverse given tasks.
With just a few examples from a new task, LoraHub can fluidly combine multiple LoRA modules, eliminating the need for human expertise and assumptions.
- Score: 46.770388457085936
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Low-rank adaptations (LoRA) are often employed to fine-tune large language
models (LLMs) for new tasks. This paper investigates LoRA composability for
cross-task generalization and introduces LoraHub, a simple framework devised
for the purposive assembly of LoRA modules trained on diverse given tasks, with
the objective of achieving adaptable performance on unseen tasks. With just a
few examples from a new task, LoraHub can fluidly combine multiple LoRA
modules, eliminating the need for human expertise and assumptions. Notably, the
composition requires neither additional model parameters nor gradients.
Empirical results on the Big-Bench Hard benchmark suggest that LoraHub, while
not surpassing the performance of in-context learning, offers a notable
performance-efficiency trade-off in few-shot scenarios by employing a
significantly reduced number of tokens per example during inference. Notably,
LoraHub establishes a better upper bound compared to in-context learning when
paired with different demonstration examples, demonstrating its potential for
future development. Our vision is to establish a platform for LoRA modules,
empowering users to share their trained LoRA modules. This collaborative
approach facilitates the seamless application of LoRA modules to novel tasks,
contributing to an adaptive ecosystem. Our code is available at
https://github.com/sail-sg/lorahub, and all the pre-trained LoRA modules are
released at https://huggingface.co/lorahub.
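The abstract describes composing pre-trained LoRA modules for an unseen task using only a few examples and no gradients. A minimal sketch of that idea follows, assuming the modules are represented as weight-delta matrices and using a plain random search as a stand-in for the gradient-free optimizer a real implementation might use; all names here are illustrative, not the authors' API.

```python
import numpy as np

def compose_loras(lora_deltas, weights):
    """Weighted sum of per-task LoRA weight deltas."""
    return sum(w * d for w, d in zip(weights, lora_deltas))

def few_shot_loss(delta, X, Y, W0):
    """Squared error of the merged model on a handful of labeled examples."""
    preds = X @ (W0 + delta)
    return float(np.mean((preds - Y) ** 2))

def gradient_free_search(lora_deltas, X, Y, W0, n_trials=500, seed=0):
    """Search composition weights without gradients: sample candidate
    weight vectors and keep the one with the lowest few-shot loss."""
    rng = np.random.default_rng(seed)
    best_w, best_loss = None, np.inf
    for _ in range(n_trials):
        w = rng.uniform(-1.5, 1.5, size=len(lora_deltas))
        loss = few_shot_loss(compose_loras(lora_deltas, w), X, Y, W0)
        if loss < best_loss:
            best_w, best_loss = w, loss
    return best_w, best_loss
```

Because only the scalar composition weights are searched, the procedure adds no trainable parameters and needs no backward pass, matching the "neither additional model parameters nor gradients" claim above.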
Related papers
- Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs).
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z)
- LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks [72.88244322513039]
LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain.
We propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs.
Experiments across six generative tasks demonstrate that our method consistently outperforms baselines with task-level fusion weights.
arXiv Detail & Related papers (2024-02-18T04:41:25Z)
- LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild [76.67343971195267]
Low-Rank Adaptation (LoRA) provides an efficient solution for fine-tuning large language models (LLMs).
LoraRetriever is a retrieve-then-compose framework that adaptively retrieves and composes multiple LoRAs according to the input prompts.
Experimental results indicate that LoraRetriever consistently outperforms the baselines.
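The retrieve-then-compose pattern described above can be sketched as follows. The embedding vectors, bank layout, and uniform averaging are simplifying assumptions for illustration; the paper's actual retriever and composition strategy may differ.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

def retrieve_and_compose(prompt_emb, lora_bank, top_k=2):
    """lora_bank: list of (task_embedding, lora_delta) pairs.
    Retrieve the top-k LoRAs whose task embeddings are most similar
    to the input prompt, then average their deltas (uniform weights
    for simplicity)."""
    ranked = sorted(lora_bank, key=lambda e: cosine(prompt_emb, e[0]),
                    reverse=True)
    chosen = [delta for _, delta in ranked[:top_k]]
    return sum(chosen) / len(chosen)
```

The key design point is that composition is input-aware: a different prompt embedding retrieves a different subset of the bank, so the merged adapter changes per request rather than being fixed per task.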
arXiv Detail & Related papers (2024-02-15T15:02:46Z)
- Chain of LoRA: Efficient Fine-tuning of Language Models via Residual Learning [31.036465632204663]
We introduce Chain of LoRA, an iterative optimization framework inspired by the Frank-Wolfe algorithm.
We demonstrate that COLA can consistently outperform LoRA without additional computational or memory costs.
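The iterative residual-learning loop behind Chain of LoRA (COLA) can be sketched as below. The inner training step is a stand-in callback, not the paper's actual optimizer; the point is the merge-and-restart structure, where each learned low-rank residual is folded into the backbone before a fresh LoRA block is trained.

```python
import numpy as np

def chain_of_lora(W0, train_delta, n_chains=3):
    """Iteratively learn a low-rank residual, merge it into the
    weights, and restart from a fresh LoRA block.
    `train_delta(W)` stands in for a LoRA training run returning
    low-rank factors (B, A) for the current merged weights W."""
    W = W0.copy()
    for _ in range(n_chains):
        B, A = train_delta(W)   # learn a low-rank residual
        W = W + B @ A           # fold it into the backbone
    return W
```

Since each residual is merged before the next one is learned, serving cost stays that of a single LoRA-merged model, which is consistent with the "without additional computational or memory costs" claim above.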
arXiv Detail & Related papers (2024-01-08T14:26:49Z)
- MultiLoRA: Democratizing LoRA for Better Multi-Task Learning [20.750808913757396]
LoRA achieves remarkable resource efficiency and comparable performance when adapting LLMs for specific tasks.
The weight updates in LoRA are dominated by a small number of top singular vectors, while those of full fine-tuning decompose into a set of less important unitary transforms.
We propose MultiLoRA for better multi-task adaptation by reducing the dominance of top singular vectors observed in LoRA.
arXiv Detail & Related papers (2023-11-20T02:59:18Z)
- S-LoRA: Serving Thousands of Concurrent LoRA Adapters [59.490751234925206]
Low-Rank Adaptation (LoRA), a parameter-efficient fine-tuning method, is often employed to adapt a base model to a multitude of tasks.
We present S-LoRA, a system designed for the scalable serving of many LoRA adapters.
arXiv Detail & Related papers (2023-11-06T17:26:17Z)
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models [67.58275666573496]
LongLoRA is an efficient fine-tuning approach that extends the context sizes of pre-trained large language models.
We demonstrate strong empirical results on various tasks on Llama2 models from 7B/13B to 70B.
arXiv Detail & Related papers (2023-09-21T17:59:11Z)
- DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation [18.922066770467914]
Low-rank adapters (LoRA) keep the main pretrained weights of the model frozen and just introduce some learnable truncated SVD modules to the model.
While LoRA blocks are parameter-efficient, they suffer from two major problems: first, the size of these blocks is fixed and cannot be modified after training; second, optimizing their rank requires an exhaustive search.
We introduce a dynamic low-rank adaptation (DyLoRA) technique to address these two problems together.
arXiv Detail & Related papers (2022-10-14T06:29:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.