A Framework to Implement 1+N Multi-task Fine-tuning Pattern in LLMs
Using the CGC-LORA Algorithm
- URL: http://arxiv.org/abs/2402.01684v1
- Date: Mon, 22 Jan 2024 07:58:31 GMT
- Title: A Framework to Implement 1+N Multi-task Fine-tuning Pattern in LLMs
Using the CGC-LORA Algorithm
- Authors: Chao Song and Zhihao Ye and Qiqiang Lin and Qiuying Peng and Jun Wang
- Abstract summary: We propose a unified framework that implements a 1 + N multi-task fine-tuning pattern in large language models (LLMs).
Our work aims to take advantage of both the MTL (i.e., CGC) and PEFT (i.e., LoRA) schemes.
- Score: 7.521690071464451
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the productive evolution of large language models (LLMs) in the field of
natural language processing (NLP), considerable effort has been made to effectively
fine-tune common pre-trained LLMs to fulfill a variety of tasks in one or
multiple specific domains. In practice, there are two prevailing ways in which
the adaptation can be achieved: (i) Multiple Independent Models: Pre-trained
LLMs are fine-tuned independently, once per task, using the corresponding training
samples from each task. (ii) An Integrated Model: Samples from all tasks are
employed to fine-tune a pre-trained LLM jointly. To address the high computing
cost and the seesaw issue simultaneously, we propose a unified framework that
implements a 1 + N multi-task fine-tuning pattern in LLMs using a novel
Customized Gate Control (CGC) Low-rank Adaptation (LoRA) algorithm. Our work
aims to take advantage of both the MTL (i.e., CGC) and PEFT (i.e., LoRA) schemes.
For a given cluster of tasks, we design an innovative layer that contains two
types of experts as additional trainable parameters, making LoRA compatible
with MTL. To comprehensively evaluate the proposed framework, we conduct
well-designed experiments on two public datasets. The experimental results
demonstrate that the unified framework with CGC-LoRA modules achieves higher
evaluation scores than all benchmark methods on both datasets.
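To make the architecture concrete, the following is a minimal, hypothetical PyTorch sketch of a CGC-LoRA-style layer, not the authors' implementation: the class name, expert counts, gating input, and initialization are all assumptions. It only illustrates the stated design of combining shared (task-common) and task-specific LoRA experts through a per-task gate on top of a frozen base projection.

```python
# Hypothetical sketch of a CGC-LoRA-style layer (not the paper's code).
# Each expert is a LoRA pair (A, B); task t mixes the shared experts and
# its own task-specific experts through a learned gate, and the combined
# low-rank update is added to the frozen base projection W0.
import torch
import torch.nn as nn


class CGCLoRALinear(nn.Module):
    def __init__(self, d_in, d_out, n_tasks, n_shared=2, n_specific=1,
                 rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)       # frozen pre-trained W0
        self.n_specific = n_specific
        self.scaling = alpha / rank

        def experts(n):
            # A: down-projections (n, rank, d_in); B: up-projections
            # (n, d_out, rank), zero-initialized so each update starts at 0.
            return (nn.Parameter(torch.randn(n, rank, d_in) * 0.01),
                    nn.Parameter(torch.zeros(n, d_out, rank)))

        self.A_shared, self.B_shared = experts(n_shared)
        self.A_task, self.B_task = experts(n_tasks * n_specific)
        # One gate per task over its (shared + task-specific) experts.
        self.gates = nn.ModuleList(
            [nn.Linear(d_in, n_shared + n_specific) for _ in range(n_tasks)]
        )

    def forward(self, x, task_id):
        # x: (batch, d_in); task_id: integer index of the current task.
        lo = task_id * self.n_specific
        A = torch.cat([self.A_shared, self.A_task[lo:lo + self.n_specific]])
        B = torch.cat([self.B_shared, self.B_task[lo:lo + self.n_specific]])
        h = torch.einsum("bi,eri->ber", x, A)   # per-expert down-projection
        y = torch.einsum("ber,eor->beo", h, B)  # per-expert up-projection
        w = torch.softmax(self.gates[task_id](x), dim=-1)   # gate weights
        delta = torch.einsum("be,beo->bo", w, y)            # weighted mix
        return self.base(x) + self.scaling * delta


# Usage: one layer shared by three tasks; only experts and gates train.
layer = CGCLoRALinear(d_in=768, d_out=768, n_tasks=3)
out = layer(torch.randn(4, 768), task_id=1)     # shape: (4, 768)
```

In this form the layer realizes the 1 + N pattern: one frozen backbone is shared by N tasks, the shared experts can absorb common knowledge, and the per-task gates and experts keep task-specific signals separated, which is how the abstract frames the defense against the seesaw issue.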
Related papers
- MIRA: A Method of Federated MultI-Task Learning for LaRge LAnguage Models [29.655807841018497]
We introduce a method for fine-tuning Large Language Models (LLMs) in a federated multi-task setting.
Our approach leverages the structure of each client's model and enables a learning scheme that considers other clients' tasks and data distribution.
Experimental results, with different datasets and models, demonstrate the proposed method's effectiveness.
arXiv Detail & Related papers (2024-10-20T22:24:40Z)
- MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning [74.43869839954168]
We propose MTL-LoRA, which retains the advantages of low-rank adaptation while significantly enhancing multi-task learning capabilities.
MTL-LoRA augments LoRA by incorporating additional task-adaptive parameters that differentiate task-specific information (see the sketch after this list).
This approach enables large language models (LLMs) pre-trained on general corpus to adapt to different target task domains with a limited number of trainable parameters.
arXiv Detail & Related papers (2024-10-12T08:32:26Z)
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
- MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic [6.46176287368784]
We propose Model Exclusive Task Arithmetic (MetaGPT) for merging GPT-scale models.
Our proposed MetaGPT is data-agnostic and bypasses the heavy search process, making it cost-effective and easy to implement for LLMs (see the merging sketch after this list).
arXiv Detail & Related papers (2024-06-17T10:12:45Z)
- CALRec: Contrastive Alignment of Generative LLMs for Sequential Recommendation [18.986613405565514]
Large Language Models (LLMs) pretrained on vast corpora of text can be adapted for sequential recommendation.
We propose a two-stage LLM finetuning framework that finetunes a pretrained LLM in a two-tower fashion using a mixture of two contrastive losses and a language modeling loss.
Our model significantly outperforms many state-of-the-art baselines.
arXiv Detail & Related papers (2024-05-03T18:51:19Z)
- Multimodal Instruction Tuning with Conditional Mixture of LoRA [54.65520214291653]
This paper introduces a novel approach that integrates multimodal instruction tuning with Low-Rank Adaptation (LoRA).
It innovates upon LoRA by dynamically constructing low-rank adaptation matrices tailored to the unique demands of each input instance.
Experimental results on various multimodal evaluation datasets indicate that MixLoRA outperforms conventional LoRA at the same or even higher ranks.
arXiv Detail & Related papers (2024-02-24T20:15:31Z)
- Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs.
We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer.
This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z)
- Large Language Models aren't all that you need [0.0]
This paper describes the architecture and systems built towards solving the SemEval 2023 Task 2: MultiCoNER II.
We evaluate and compare two approaches: (a) a traditional Conditional Random Fields model and (b) a Large Language Model (LLM) fine-tuned with a customized head.
arXiv Detail & Related papers (2024-01-01T08:32:50Z)
- FederatedScope-LLM: A Comprehensive Package for Fine-tuning Large Language Models in Federated Learning [70.38817963253034]
This paper first discusses the challenges of federated fine-tuning of LLMs and introduces our package, FS-LLM, as a main contribution.
We provide comprehensive federated parameter-efficient fine-tuning algorithm implementations and versatile programming interfaces for future extension in FL scenarios.
We conduct extensive experiments to validate the effectiveness of FS-LLM and benchmark advanced LLMs with state-of-the-art parameter-efficient fine-tuning algorithms in FL settings.
arXiv Detail & Related papers (2023-09-01T09:40:36Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
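As a contrast to the CGC-LoRA sketch above, here is a hypothetical sketch of the task-adaptive idea summarized in the MTL-LoRA entry; the class name, the placement of the per-task transform, and the initialization are assumptions rather than the paper's implementation.

```python
# Hypothetical sketch of task-adaptive LoRA in the spirit of MTL-LoRA
# (not the paper's code): a shared LoRA pair (A, B) plus one small
# per-task transform in the low-rank space to differentiate tasks.
import torch
import torch.nn as nn


class TaskAdaptiveLoRALinear(nn.Module):
    def __init__(self, d_in, d_out, n_tasks, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)    # frozen pre-trained W0
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # shared
        self.B = nn.Parameter(torch.zeros(d_out, rank))        # shared
        # One rank x rank transform per task, identity-initialized so all
        # tasks start from the same shared adapter.
        self.task_mix = nn.Parameter(torch.eye(rank).repeat(n_tasks, 1, 1))
        self.scaling = alpha / rank

    def forward(self, x, task_id):
        h = x @ self.A.T                   # shared low-rank subspace
        h = h @ self.task_mix[task_id].T   # task-specific differentiation
        return self.base(x) + self.scaling * (h @ self.B.T)
```

Compared with the gated-expert design above, this keeps the parameter count almost flat in the number of tasks, trading expert capacity for a single per-task mixing matrix.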
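The MetaGPT entry above refers to task-arithmetic merging; the following is a tiny, hypothetical illustration of that general recipe. MetaGPT's actual contribution is computing the merging coefficients in a data-agnostic, model-exclusive way, which this sketch deliberately leaves as plain inputs.

```python
# Hypothetical sketch of task-arithmetic model merging (the general
# recipe, not MetaGPT's coefficient derivation): each fine-tuned model
# contributes a weight delta ("task vector"), and the merged model is
# the base plus a weighted sum of those deltas.
import torch


def merge_task_vectors(base_state, task_states, coeffs):
    """base_state / task_states: dicts of parameter tensors keyed by name;
    coeffs: one scalar weight per fine-tuned model."""
    merged = {}
    for name, w0 in base_state.items():
        delta = sum(c * (s[name] - w0) for c, s in zip(coeffs, task_states))
        merged[name] = w0 + delta
    return merged


# Toy check: deltas of +1 and -1 merged with weights 0.7 / 0.3 give +0.4.
base = {"w": torch.zeros(2, 2)}
tasks = [{"w": torch.ones(2, 2)}, {"w": -torch.ones(2, 2)}]
merged = merge_task_vectors(base, tasks, coeffs=[0.7, 0.3])
```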
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.