CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large
Language Model
- URL: http://arxiv.org/abs/2403.08350v1
- Date: Wed, 13 Mar 2024 08:54:31 GMT
- Title: CoIN: A Benchmark of Continual Instruction tuNing for Multimodel Large
Language Model
- Authors: Cheng Chen and Junchen Zhu and Xu Luo and Hengtao Shen and Lianli Gao
and Jingkuan Song
- Abstract summary: We present a benchmark, namely Continual Instruction tuNing (CoIN), to assess existing MLLMs in the sequential instruction tuning paradigm.
Experiments on CoIN demonstrate that current powerful MLLMs still suffer from catastrophic forgetting.
We introduce MoELoRA to MLLMs, which is effective at retaining the previous instruction alignment.
- Score: 128.46104068327435
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Instruction tuning represents a prevalent strategy employed by Multimodal
Large Language Models (MLLMs) to align with human instructions and adapt to new
tasks. Nevertheless, MLLMs encounter the challenge of adapting to users'
evolving knowledge and demands. Therefore, how to retain existing skills while
acquiring new knowledge needs to be investigated. In this paper, we present a
comprehensive benchmark, namely Continual Instruction tuNing (CoIN), to assess
existing MLLMs in the sequential instruction tuning paradigm. CoIN comprises 10
commonly used datasets spanning 8 task categories, ensuring a diverse range of
instructions and tasks. Besides, the trained model is evaluated from two
aspects: Instruction Following and General Knowledge, which assess the
alignment with human intention and knowledge preserved for reasoning,
respectively. Experiments on CoIN demonstrate that current powerful MLLMs still
suffer from catastrophic forgetting, and that the failure of intention alignment,
rather than knowledge forgetting, is primarily responsible. To this end, we
introduce MoELoRA to MLLMs, which is effective at retaining the previous
instruction alignment. Experimental results consistently show that this method
reduces forgetting on CoIN.
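The sequential evaluation described in the abstract can be illustrated with a minimal sketch. It assumes a generic forgetting measure that is common in continual learning: the score on a task right after the model was tuned on it, minus the score on that task after the final tuning stage. This is not necessarily the metric CoIN reports, and the function names and the example scores are hypothetical.

```python
from typing import List


def forgetting_per_task(scores: List[List[float]]) -> List[float]:
    """Return the score drop for every task except the last one.

    scores[i][j] is the score on task j measured after sequential
    tuning stage i, so row i holds i + 1 entries (tasks 0..i).
    This layout is an assumption made for the sketch.
    """
    final = scores[-1]
    # Drop = score right after learning task j minus score at the end.
    return [scores[j][j] - final[j] for j in range(len(scores) - 1)]


def mean_forgetting(scores: List[List[float]]) -> float:
    """Average the per-task drops; 0.0 when there is only one task."""
    drops = forgetting_per_task(scores)
    return sum(drops) / len(drops) if drops else 0.0


if __name__ == "__main__":
    # Hypothetical scores for three tasks tuned one after another.
    example = [
        [80.0],
        [60.0, 70.0],
        [50.0, 65.0, 90.0],
    ]
    print(forgetting_per_task(example))
    print(mean_forgetting(example))
```

A positive value means the model lost performance on an earlier task after later tuning stages. The last task is excluded because nothing is trained after it.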
Related papers
- MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs [47.94710556156627]
MIA-Bench is a benchmark designed to evaluate multimodal large language models (MLLMs) on their ability to strictly adhere to complex instructions.
Our benchmark comprises a diverse set of 400 image-prompt pairs, each crafted to challenge the models' compliance with layered instructions.
arXiv Detail & Related papers (2024-07-01T17:53:35Z)
- The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models [48.455388608863785]
We introduce a benchmark designed to evaluate models' abilities to follow multiple instructions through sequential instruction following tasks.
Our benchmark evaluates instruction following using four tasks (text modification, question answering, mathematics, and security rule following), each assessing different aspects of sequential instruction following.
Our evaluation of popular LLMs, both closed-source and open-source, shows that more recent and larger models significantly outperform their older and smaller counterparts on the SIFo tasks.
arXiv Detail & Related papers (2024-06-28T15:34:26Z)
- Distilling Instruction-following Abilities of Large Language Models with Task-aware Curriculum Planning [12.651588927599441]
We introduce Task-Aware Curriculum Planning for Instruction Refinement (TAPIR).
TAPIR is a multi-round distillation framework with balanced task distributions and dynamic difficulty adjustment.
We rigorously evaluate TAPIR using two widely recognized benchmarks, AlpacaEval 2.0 and MT-Bench.
arXiv Detail & Related papers (2024-05-22T08:38:26Z)
- Continual Learning for Large Language Models: A Survey [95.79977915131145]
Large language models (LLMs) are not amenable to frequent re-training, due to high training costs arising from their massive scale.
This paper surveys recent works on continual learning for LLMs.
arXiv Detail & Related papers (2024-02-02T12:34:09Z)
- Continual Instruction Tuning for Large Multimodal Models [30.438442723421556]
Multi-task joint instruction tuning can facilitate the model's continual learning ability and mitigate forgetting.
We propose task-similarity-informed regularization and model expansion methods for continual instruction tuning of LMMs.
arXiv Detail & Related papers (2023-11-27T15:04:48Z)
- TRACE: A Comprehensive Benchmark for Continual Learning in Large Language Models [52.734140807634624]
Aligned large language models (LLMs) demonstrate exceptional capabilities in task-solving, following instructions, and ensuring safety.
Existing continual learning benchmarks lack sufficient challenge for leading aligned LLMs.
We introduce TRACE, a novel benchmark designed to evaluate continual learning in LLMs.
arXiv Detail & Related papers (2023-10-10T16:38:49Z)
- CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented Dialog Systems [56.302581679816775]
This paper proposes Comprehensive Instruction (CINS) that exploits PLMs with task-specific instructions.
We design a schema (definition, constraint, prompt) of instructions and their customized realizations for three important downstream tasks in ToD.
Experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data.
arXiv Detail & Related papers (2021-09-10T03:23:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.