LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models
- URL: http://arxiv.org/abs/2503.16843v1
- Date: Fri, 21 Mar 2025 04:31:09 GMT
- Title: LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models
- Authors: Jian Liang, Wenke Huang, Guancheng Wan, Qu Yang, Mang Ye
- Abstract summary: Low-Rank Adaptation (LoRA) is widely used to efficiently acquire specialized knowledge in Multimodal Large Language Models (MLLMs). LoRA introduces substantial harmful redundancy during visual instruction tuning, which exacerbates the forgetting of general knowledge and degrades downstream task performance. We propose LoRASculpt to eliminate harmful redundant parameters, thereby harmonizing general and specialized knowledge.
- Score: 61.96237184081951
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While Multimodal Large Language Models (MLLMs) excel at generalizing across modalities and tasks, effectively adapting them to specific downstream tasks while simultaneously retaining both general and specialized knowledge remains challenging. Although Low-Rank Adaptation (LoRA) is widely used to efficiently acquire specialized knowledge in MLLMs, it introduces substantial harmful redundancy during visual instruction tuning, which exacerbates the forgetting of general knowledge and degrades downstream task performance. To address this issue, we propose LoRASculpt to eliminate harmful redundant parameters, thereby harmonizing general and specialized knowledge. Specifically, under theoretical guarantees, we introduce sparse updates into LoRA to discard redundant parameters effectively. Furthermore, we propose a Conflict Mitigation Regularizer to refine the update trajectory of LoRA, mitigating knowledge conflicts with the pretrained weights. Extensive experimental results demonstrate that even at a very high degree of sparsity ($\le$ 5%), our method simultaneously enhances generalization and downstream task performance. This confirms that our approach effectively mitigates the catastrophic forgetting issue and further promotes knowledge harmonization in MLLMs.
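The sketch below illustrates, in simplified form, the two ideas described in the abstract: a sparsity mask applied to the merged LoRA update, and a regularizer that discourages conflict with the frozen pretrained weights. It is a minimal PyTorch illustration under assumed details (magnitude-based pruning, the particular penalty form, and the `SparseLoRALinear` and `keep_ratio` names are all assumptions), not the authors' implementation.

```python
# Illustrative sketch only: magnitude-based sparsity on the merged LoRA update
# plus a simple conflict penalty against the frozen pretrained weight.
import torch
import torch.nn as nn


class SparseLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, keep_ratio: float = 0.05):
        super().__init__()
        self.base = base                       # frozen pretrained weight W0
        for p in self.base.parameters():
            p.requires_grad_(False)
        out_f, in_f = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_f) * 0.01)   # LoRA down-projection
        self.B = nn.Parameter(torch.zeros(out_f, rank))         # LoRA up-projection
        self.keep_ratio = keep_ratio           # e.g. keep only 5% of update entries

    def sparse_delta(self) -> torch.Tensor:
        delta = self.B @ self.A                # dense low-rank update B A
        k = max(1, int(self.keep_ratio * delta.numel()))
        threshold = delta.abs().flatten().kthvalue(delta.numel() - k + 1).values
        mask = (delta.abs() >= threshold).float()
        return delta * mask                    # discard low-magnitude entries

    def conflict_penalty(self) -> torch.Tensor:
        # One plausible notion of "knowledge conflict" (an assumption here):
        # penalize large updates where the pretrained weight is already large.
        delta = self.sparse_delta()
        return (self.base.weight.abs() * delta.abs()).mean()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + x @ self.sparse_delta().T


# Usage: wrap a linear layer and add the penalty to the task loss.
layer = SparseLoRALinear(nn.Linear(64, 64), rank=4, keep_ratio=0.05)
x = torch.randn(2, 64)
task_loss = layer(x).pow(2).mean()            # placeholder task loss
loss = task_loss + 0.1 * layer.conflict_penalty()
loss.backward()
```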
Related papers
- Each Rank Could be an Expert: Single-Ranked Mixture of Experts LoRA for Multi-Task Learning [53.98941571078398]
Low-Rank Adaptation (LoRA) is widely used for adapting large language models (LLMs) to specific domains due to its efficiency and modularity.
Recent works adopt Mixture of Experts (MoE) by treating each LoRA module as an expert, thereby mitigating task interference through multiple specialized LoRA modules.
While effective, these methods often isolate knowledge within individual tasks, failing to fully exploit the shared knowledge across related tasks.
We propose Single-ranked Mixture of Experts LoRA (SMoRA), which embeds MoE into LoRA by treating each rank as an independent expert.
arXiv Detail & Related papers (2025-01-25T06:56:39Z) - MoSLD: An Extremely Parameter-Efficient Mixture-of-Shared LoRAs for Multi-Task Learning [8.868481107848185]
MoSLD is a mixture-of-shared-LoRAs model with a dropout strategy.
MoSLD addresses challenges by sharing the upper projection matrix in LoRA among different experts.
Our model exhibits excellent performance in both single-task and multi-task scenarios.
arXiv Detail & Related papers (2024-12-12T05:22:49Z) - Learn from Downstream and Be Yourself in Multimodal Large Language Model Fine-Tuning [104.27224674122313]
Fine-tuning MLLM has become a common practice to improve performance on specific downstream tasks.
To balance the trade-off between generalization and specialization, we propose measuring the parameter importance for both pre-trained and fine-tuning distributions.
arXiv Detail & Related papers (2024-11-17T01:16:37Z) - MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning [29.957620178740186]
In multi-task scenarios, challenges such as training imbalance and the seesaw effect frequently emerge.
We propose Mixture of Asymmetric Low-Rank Adaptation (MALoRA) as a flexible fine-tuning framework.
MALoRA reduces the number of trainable parameters by 30% to 48%, increases training speed by 1.2x, and matches the computational efficiency of single-task LoRA models.
arXiv Detail & Related papers (2024-10-30T07:53:52Z) - MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning [74.43869839954168]
We propose MTL-LoRA, which retains the advantages of low-rank adaptation while significantly enhancing multi-task learning capabilities.
MTL-LoRA augments LoRA by incorporating additional task-adaptive parameters that differentiate task-specific information.
This approach enables large language models (LLMs) pre-trained on general corpus to adapt to different target task domains with a limited number of trainable parameters.
arXiv Detail & Related papers (2024-10-12T08:32:26Z) - Learning Attentional Mixture of LoRAs for Language Model Continual Learning [5.405488709294211]
Fine-tuning large language models (LLMs) with Low-Rank Adaptation (LoRA) is widely acknowledged as an effective approach for continual learning on new tasks.
We propose Attentional Mixture of LoRAs (AM-LoRA), a continual learning approach tailored for LLMs.
arXiv Detail & Related papers (2024-09-29T08:34:54Z) - Multimodal Instruction Tuning with Conditional Mixture of LoRA [51.58020580970644]
This paper introduces a novel approach, MixLoRA, that integrates multimodal instruction tuning with Low-Rank Adaptation (LoRA).
It innovates upon LoRA by dynamically constructing low-rank adaptation matrices tailored to the unique demands of each input instance.
Experimental results on various multimodal evaluation datasets indicate that MixLoRA outperforms conventional LoRA at the same or even higher ranks.
arXiv Detail & Related papers (2024-02-24T20:15:31Z) - LoRAMoE: Alleviate World Knowledge Forgetting in Large Language Models
via MoE-Style Plugin [85.16356890023582]
We propose LoRAMoE, a novel framework that introduces several low-rank adapters (LoRAs) and integrates them using a router network (a generic sketch of this mixture-of-LoRA-experts pattern appears after this list).
It freezes the backbone model and forces a portion of LoRAs to focus on leveraging world knowledge to solve downstream tasks.
Experimental results show that, as the amount of instruction data increases, LoRAMoE significantly improves performance on downstream tasks.
arXiv Detail & Related papers (2023-12-15T17:45:06Z) - Pay Attention to What You Need [8.369701050186867]
Large language models (LLMs) struggle with long-context comprehension.
We propose a method called Scaled ReAttention (SRA) to strengthen LLMs' ability to interpret and retrieve information.
arXiv Detail & Related papers (2023-07-25T09:34:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.