LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative
Tasks
- URL: http://arxiv.org/abs/2402.11455v1
- Date: Sun, 18 Feb 2024 04:41:25 GMT
- Title: LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative
Tasks
- Authors: Hanqing Wang, Bowen Ping, Shuo Wang, Xu Han, Yun Chen, Zhiyuan Liu,
Maosong Sun
- Abstract summary: LoRA employs lightweight modules to customize large language models (LLMs) for each downstream task or domain.
We propose LoRA-Flow, which utilizes dynamic weights to adjust the impact of different LoRAs.
Experiments across six generative tasks demonstrate that our method consistently outperforms baselines with task-level fusion weights.
- Score: 72.88244322513039
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LoRA employs lightweight modules to customize large language models (LLMs)
for each downstream task or domain, where different learned additional modules
represent diverse skills. Combining existing LoRAs to address new tasks can
enhance the reusability of learned LoRAs, particularly beneficial for tasks
with limited annotated data. Most prior works on LoRA combination primarily
rely on task-level weights for each involved LoRA, making different examples
and tokens share the same LoRA weights. However, in generative tasks, different
tokens may necessitate diverse skills to manage. Taking the Chinese math task
as an example, understanding the problem description may depend more on the
Chinese LoRA, while the calculation part may rely more on the math LoRA. To
this end, we propose LoRA-Flow, which utilizes dynamic weights to adjust the
impact of different LoRAs. The weights at each step are determined by a fusion
gate with extremely few parameters, which can be learned with only 200 training
examples. Experiments across six generative tasks demonstrate that our method
consistently outperforms baselines with task-level fusion weights. This
underscores the necessity of introducing dynamic fusion weights for LoRA
combination.
Related papers
- In-Context Meta LoRA Generation [61.690065588534296]
Low-rank Adaptation (LoRA) has demonstrated remarkable capabilities for task specific fine-tuning.
We propose In-Context Meta LoRA (ICM-LoRA), a novel approach that efficiently achieves task-specific customization of large language models.
ICM-LoRA enables more accurate LoRA parameter reconstruction than current parameter reconstruction methods.
arXiv Detail & Related papers (2025-01-29T13:12:01Z) - Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning [53.98941571078398]
Low-Rank Adaptation (LoRA) is widely used for adapting large language models (LLMs) to specific domains due to its efficiency and modularity.
Recent works adopt Mixture of Experts (MoE) by treating each LoRA module as an expert, thereby mitigating task interference through multiple specialized LoRA modules.
While effective, these methods often isolate knowledge within individual tasks, failing to fully exploit the shared knowledge across related tasks.
We propose Single-ranked Mixture of Experts LoRA (textbfSMoRA), which embeds MoE into LoRA by textittreating each rank as an
arXiv Detail & Related papers (2025-01-25T06:56:39Z) - LoRA Soups: Merging LoRAs for Practical Skill Composition Tasks [73.09643674975591]
Low-Rank Adaptation (LoRA) is a technique for parameter-efficient fine-tuning of Large Language Models (LLMs)
We study how different LoRA modules can be merged to achieve skill composition.
arXiv Detail & Related papers (2024-10-16T20:33:06Z) - Learning Attentional Mixture of LoRAs for Language Model Continual Learning [5.405488709294211]
Fine-tuning large language models (LLMs) with Low-Rank adaption (LoRA) is widely acknowledged as an effective approach for continual learning for new tasks.
We propose Attentional Mixture of LoRAs (AM-LoRA), a continual learning approach tailored for LLMs.
arXiv Detail & Related papers (2024-09-29T08:34:54Z) - Retrieval-Augmented Mixture of LoRA Experts for Uploadable Machine Learning [57.36978335727009]
Low-Rank Adaptation (LoRA) offers an efficient way to fine-tune large language models (LLMs)
In this paper, we propose a framework that adaptively retrieves and composes multiple LoRAs based on input prompts.
arXiv Detail & Related papers (2024-06-24T05:24:41Z) - LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed
Tasks in the Wild [76.67343971195267]
Low-Rank Adaptation (LoRA) provides an efficient solution for fine-tuning large language models (LLM)
LoraRetriever is a retrieve-then-compose framework that adaptively retrieves and composes multiple LoRAs according to the input prompts.
Experimental results indicate that LoraRetriever consistently outperforms the baselines.
arXiv Detail & Related papers (2024-02-15T15:02:46Z) - MultiLoRA: Democratizing LoRA for Better Multi-Task Learning [20.750808913757396]
LoRA achieves remarkable resource efficiency and comparable performance when adapting LLMs for specific tasks.
LoRA is dominated by a small number of top singular vectors while fine-tuning decomposes into a set of less important unitary transforms.
We propose MultiLoRA for better multi-task adaptation by reducing the dominance of top singular vectors observed in LoRA.
arXiv Detail & Related papers (2023-11-20T02:59:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.