LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks
- URL: http://arxiv.org/abs/2402.11455v1
- Date: Sun, 18 Feb 2024 04:41:25 GMT
- Title: LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks
- Authors: Hanqing Wang, Bowen Ping, Shuo Wang, Xu Han, Yun Chen, Zhiyuan Liu,
Maosong Sun
- Abstract summary: LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain.
We propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs.
Experiments across six generative tasks demonstrate that our method consistently outperforms baselines with task-level fusion weights.
- Score: 72.88244322513039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LoRA employs lightweight modules to customize large language models (LLMs)
for each downstream task or domain, where different learned additional modules
represent diverse skills. Combining existing LoRAs to address new tasks can
enhance the reusability of learned LoRAs, particularly beneficial for tasks
with limited annotated data. Most prior works on LoRA combination primarily
rely on task-level weights for each involved LoRA, making different examples
and tokens share the same LoRA weights. However, in generative tasks, different
tokens may require different skills to handle. Taking the Chinese math task
as an example, understanding the problem description may depend more on the
Chinese LoRA, while the calculation part may rely more on the math LoRA. To
this end, we propose LoRA-Flow, which utilizes dynamic weights to adjust the
impact of different LoRAs. The weights at each step are determined by a fusion
gate with extremely few parameters, which can be learned with only 200 training
examples. Experiments across six generative tasks demonstrate that our method
consistently outperforms baselines with task-level fusion weights. This
underscores the necessity of introducing dynamic fusion weights for LoRA
combination.
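At its core, the method attaches a small fusion gate on top of a set of reused LoRA modules and lets that gate emit a fresh mixture weight for every token. Below is a minimal PyTorch sketch of this idea; the class names, gate placement, and tensor shapes are illustrative assumptions rather than the authors' released implementation.

```python
import torch
import torch.nn as nn

class LoRAModule(nn.Module):
    """A standard LoRA adapter: a low-rank update B(A(x)) alongside a frozen layer."""
    def __init__(self, d_model, rank):
        super().__init__()
        self.A = nn.Linear(d_model, rank, bias=False)
        self.B = nn.Linear(rank, d_model, bias=False)
        nn.init.zeros_(self.B.weight)  # the update starts at zero

    def forward(self, x):
        return self.B(self.A(x))

class DynamicLoRAFusion(nn.Module):
    """Combines several reused LoRA modules with per-token weights from a tiny gate."""
    def __init__(self, loras, d_model):
        super().__init__()
        self.loras = nn.ModuleList(loras)
        for p in self.loras.parameters():
            p.requires_grad_(False)                  # the LoRAs themselves stay frozen
        self.gate = nn.Linear(d_model, len(loras))   # the only trainable part

    def forward(self, hidden):
        # hidden: (batch, seq_len, d_model); one mixture weight vector per token
        weights = torch.softmax(self.gate(hidden), dim=-1)                     # (b, s, n)
        outputs = torch.stack([lora(hidden) for lora in self.loras], dim=-1)   # (b, s, d, n)
        return (outputs * weights.unsqueeze(-2)).sum(dim=-1)                   # (b, s, d)

# Usage: the fused update is added to the output of the corresponding frozen base layer.
fusion = DynamicLoRAFusion([LoRAModule(768, 8) for _ in range(2)], d_model=768)
delta = fusion(torch.randn(1, 16, 768))  # same shape as the hidden states
```

Because the gate here is a single linear layer, its parameter count is only d_model x n_loras plus a bias, which is consistent with the abstract's point that the gate has extremely few parameters and can be learned from very little data.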
Related papers
- LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks [73.09643674975591]
Low-Rank Adaptation (LoRA) is a technique for parameter-efficient fine-tuning of Large Language Models (LLMs).
We study how different LoRA modules can be merged to achieve skill composition.
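Task-level combination, the baseline that LoRA-Flow argues against, boils down to merging the low-rank updates with fixed coefficients shared by every example and token. A generic sketch of such static merging follows (a weighted sum of updates, not necessarily the specific scheme studied in this paper):

```python
import torch

def merge_lora_updates(adapters, coeffs):
    """Merge several LoRA adapters into one weight delta using fixed coefficients.

    adapters: list of (A, B) pairs, A: (rank, d_in), B: (d_out, rank)
    coeffs:   one scalar per adapter, shared by every example and token
    """
    assert len(adapters) == len(coeffs)
    delta = None
    for (A, B), c in zip(adapters, coeffs):
        update = c * (B @ A)                         # (d_out, d_in)
        delta = update if delta is None else delta + update
    return delta

# Example: two rank-8 adapters on a 768x768 projection, merged with equal weights.
adapters = [(torch.randn(8, 768), torch.randn(768, 8)) for _ in range(2)]
merged = merge_lora_updates(adapters, coeffs=[0.5, 0.5])
# merged is added once to the frozen base weight: W_new = W + merged
```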
arXiv Detail & Related papers (2024-10-16T20:33:06Z)
- Learning Attentional Mixture of LoRAs for Language Model Continual Learning [5.405488709294211]
Fine-tuning large language models (LLMs) with Low-Rank Adaptation (LoRA) is widely acknowledged as an effective approach to continual learning on new tasks.
We propose Attentional Mixture of LoRAs (AM-LoRA), a continual learning approach tailored for LLMs.
arXiv Detail & Related papers (2024-09-29T08:34:54Z)
- Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs).
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z)
- Mixture of LoRA Experts [87.50120181861362]
This paper introduces the Mixture of LoRA Experts (MoLE) approach, which harnesses hierarchical control and unfettered branch selection.
The MoLE approach achieves superior LoRA fusion performance in comparison to direct arithmetic merging.
arXiv Detail & Related papers (2024-04-21T11:59:53Z)
- Multi-LoRA Composition for Image Generation [107.83002438126832]
We study multi-LoRA composition through a decoding-centric perspective.
We present two training-free methods: LoRA Switch, which alternates between different LoRAs at each denoising step, and LoRA Composite, which simultaneously incorporates all LoRAs to guide more cohesive image synthesis.
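The two training-free methods can be pictured without diffusion-specific code: LoRA Switch keeps exactly one LoRA active and rotates through the set across denoising steps, while LoRA Composite keeps all of them active at every step. A rough sketch of the switching schedule (the function, adapter names, and step loop are illustrative, not the paper's code):

```python
def active_lora_for_step(step, lora_names):
    """LoRA Switch: round-robin over the available LoRAs, one per denoising step."""
    return lora_names[step % len(lora_names)]

loras = ["character_lora", "style_lora", "background_lora"]  # hypothetical adapter names
for step in range(8):                                        # e.g. 8 denoising steps
    active = active_lora_for_step(step, loras)
    # Enable only `active` on the diffusion model for this step, then run the usual
    # denoising update; LoRA Composite would instead keep all adapters enabled here.
    print(step, active)
```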
arXiv Detail & Related papers (2024-02-26T18:59:18Z)
- LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild [76.67343971195267]
Low-Rank Adaptation (LoRA) provides an efficient solution for fine-tuning large language models (LLMs).
LoraRetriever is a retrieve-then-compose framework that adaptively retrieves and composes multiple LoRAs according to the input prompts.
Experimental results indicate that LoraRetriever consistently outperforms the baselines.
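The retrieve-then-compose pattern can be sketched independently of the paper's concrete retriever: embed the incoming prompt, score it against one stored embedding per LoRA, keep the top-k matches, and compose those adapters. Everything in the snippet below (embedding size, softmax weighting, the averaging hint) is an illustrative assumption:

```python
import torch
import torch.nn.functional as F

def retrieve_top_k(prompt_emb, lora_embs, k=2):
    """Score a prompt embedding against one stored description embedding per LoRA."""
    sims = F.cosine_similarity(prompt_emb.unsqueeze(0), lora_embs, dim=-1)  # (n_loras,)
    scores, indices = sims.topk(k)
    weights = torch.softmax(scores, dim=-1)   # turn similarities into mixture weights
    return indices.tolist(), weights

# Hypothetical library of 4 LoRAs, each indexed by a 384-dim description embedding.
lora_embs = torch.randn(4, 384)
prompt_emb = torch.randn(384)
chosen, weights = retrieve_top_k(prompt_emb, lora_embs, k=2)
# The chosen adapters are then composed, e.g. delta = sum(w_i * B_i @ A_i).
```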
arXiv Detail & Related papers (2024-02-15T15:02:46Z)
- MultiLoRA: Democratizing LoRA for Better Multi-Task Learning [20.750808913757396]
LoRA achieves remarkable resource efficiency and comparable performance when adapting LLMs for specific tasks.
The weight update learned by LoRA is dominated by a small number of top singular vectors, while full fine-tuning decomposes into a set of less important unitary transforms.
We propose MultiLoRA for better multi-task adaptation by reducing the dominance of top singular vectors observed in LoRA.
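The claim about top singular vectors is straightforward to probe: form the learned update Delta W = B A of an adapter and measure how much of its spectral energy sits in the leading singular values. A small diagnostic sketch, with random placeholder tensors standing in for real LoRA weights:

```python
import torch

def top_singular_energy(A, B, top=2):
    """Fraction of squared singular-value mass captured by the `top` largest values."""
    delta = B @ A                         # the learned rank-r update, (d_out, d_in)
    s = torch.linalg.svdvals(delta)       # singular values in descending order
    return float((s[:top] ** 2).sum() / (s ** 2).sum())

# Placeholder rank-8 adapter on a 768x768 layer; real LoRA weights would be loaded here.
A, B = torch.randn(8, 768), torch.randn(768, 8)
print(top_singular_energy(A, B, top=2))   # a value near 1.0 indicates strong dominance
```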
arXiv Detail & Related papers (2023-11-20T02:59:18Z)
- DyLoRA: Parameter Efficient Tuning of Pre-trained Models using Dynamic Search-Free Low-Rank Adaptation [18.922066770467914]
Low-rank adapters (LoRA) keep the main pretrained weights of the model frozen and just introduce some learnable truncated SVD modules to the model.
While LoRA blocks are parameter-efficient, they suffer from two major problems: first, the size of these blocks is fixed and cannot be modified after training; second, optimizing their rank requires an exhaustive search.
We introduce a dynamic low-rank adaptation (DyLoRA) technique to address these two problems together.
arXiv Detail & Related papers (2022-10-14T06:29:22Z)
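The dynamic part can be sketched as rank sampling: at each training step a rank b is drawn and only the first b rows of A and the first b columns of B participate, so a single trained block can later be deployed at any rank up to the maximum. The snippet below is a simplified illustration of that idea, not the released DyLoRA implementation:

```python
import random
import torch
import torch.nn as nn

class DyLoRALinear(nn.Module):
    """A LoRA block trained across a range of ranks by truncating A and B on the fly."""
    def __init__(self, d_model, max_rank):
        super().__init__()
        self.A = nn.Parameter(torch.randn(max_rank, d_model) * 0.01)  # (r_max, d)
        self.B = nn.Parameter(torch.zeros(d_model, max_rank))         # (d, r_max)
        self.max_rank = max_rank

    def forward(self, x, rank=None):
        # Sample a rank during training; pick any rank <= max_rank at deployment.
        b = rank if rank is not None else random.randint(1, self.max_rank)
        return x @ self.A[:b].T @ self.B[:, :b].T

layer = DyLoRALinear(d_model=768, max_rank=8)
x = torch.randn(2, 16, 768)
out_train = layer(x)           # a random rank is sampled for this training step
out_small = layer(x, rank=2)   # the same block deployed at rank 2, no retraining
```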