Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing
- URL: http://arxiv.org/abs/2502.17282v1
- Date: Mon, 24 Feb 2025 16:10:53 GMT
- Title: Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing
- Authors: Yi-Kai Zhang, De-Chuan Zhan, Han-Jia Ye
- Abstract summary: Large Language Models (LLMs) have demonstrated human-like instruction-following abilities. In this work, we explore how to route the best-performing LLM for each instruction to achieve better overall performance. We develop a new paradigm, constructing capability instructions with model capability representation, user instruction, and performance inquiry prompts to assess the performance.
- Score: 64.38277118982698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large Language Models (LLMs) have demonstrated human-like instruction-following abilities, particularly those exceeding 100 billion parameters. The combined capability of some smaller, resource-friendly LLMs can address most of the instructions that larger LLMs excel at. In this work, we explore how to route the best-performing LLM for each instruction to achieve better overall performance. We develop a new paradigm, constructing capability instructions with model capability representation, user instruction, and performance inquiry prompts to assess the performance. To learn from capability instructions, we introduce a new end-to-end framework called Model Selection with Aptitude Test (Model-SAT), which generates positive and negative samples based on what different models perform well or struggle with. Model-SAT uses a model capability encoder that extends its model representation to a lightweight LLM. Our experiments show that Model-SAT understands the performance dimensions of candidate models and provides the probabilities of their capability to handle various instructions. Additionally, during deployment, a new model can quickly infer its aptitude test results across 50 tasks, each with 20 shots. Model-SAT performs state-of-the-art model routing without candidate inference and in real-world new model-released scenarios. The code is available at https://github.com/Now-Join-Us/CIT-LLM-Routing
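The routing idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' Model-SAT implementation: the names `build_capability_instruction`, `route`, and `toy_score_fn`, the prompt wording, and the candidate profiles are all hypothetical. In particular, `toy_score_fn` is only a stand-in for the tuned lightweight LLM that would output a probability of success.

```python
from typing import Callable, Dict, Tuple


def build_capability_instruction(capability: Dict[str, float], instruction: str) -> str:
    """Combine a capability representation, the user instruction, and a
    performance inquiry prompt into one text (the wording is an assumption)."""
    profile = ", ".join(f"{task}: {acc:.2f}" for task, acc in sorted(capability.items()))
    return (
        f"Model aptitude test results: {profile}\n"
        f"Instruction: {instruction}\n"
        "Can this model answer the instruction correctly? Answer yes or no."
    )


def route(
    candidates: Dict[str, Dict[str, float]],
    instruction: str,
    score_fn: Callable[[str], float],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate without running it and return the best one."""
    scores = {
        name: score_fn(build_capability_instruction(capability, instruction))
        for name, capability in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


def toy_score_fn(prompt: str) -> float:
    """Hypothetical scorer: returns the listed accuracy of the first task
    whose name appears in the instruction line, else 0.0."""
    lines = prompt.splitlines()
    profile = lines[0].split(": ", 1)[1]
    instruction = lines[1].lower()
    for item in profile.split(", "):
        task, acc = item.rsplit(": ", 1)
        if task in instruction:
            return float(acc)
    return 0.0


# Made-up aptitude profiles for two hypothetical candidate models.
CANDIDATES = {
    "small-math": {"math": 0.80, "code": 0.40},
    "small-code": {"math": 0.35, "code": 0.85},
}

if __name__ == "__main__":
    best, scores = route(CANDIDATES, "Solve this math problem: 12 * 7", toy_score_fn)
    print(best, scores)
```

The point of the sketch is that no candidate model is run at routing time: only the scorer sees the combined text, which is what the abstract means by routing "without candidate inference".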
Related papers
- EmbedLLM: Learning Compact Representations of Large Language Models [28.49433308281983]
We propose EmbedLLM, a framework designed to learn compact vector representations of Large Language Models.
We introduce an encoder-decoder approach for learning such embeddings, along with a systematic framework to evaluate their effectiveness.
Empirical results show that EmbedLLM outperforms prior methods in model routing both in accuracy and latency.
arXiv Detail & Related papers (2024-10-03T05:43:24Z) - Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities.
In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks.
We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z) - Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation [56.75665429851673]
This paper introduces a novel instruction curation algorithm, derived from two unique perspectives, human and LLM preference alignment. Experiments demonstrate that we can maintain or even improve model performance by compressing synthetic multimodal instructions by up to 90%.
arXiv Detail & Related papers (2024-09-27T08:20:59Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - LLM Augmented LLMs: Expanding Capabilities through Composition [56.40953749310957]
CALM -- Composition to Augment Language Models -- introduces cross-attention between models to compose their representations and enable new capabilities.
We illustrate that augmenting PaLM2-S with a smaller model trained on low-resource languages results in an absolute improvement of up to 13% on tasks like translation into English.
When PaLM2-S is augmented with a code-specific model, we see a relative improvement of 40% over the base model for code generation and explanation tasks.
arXiv Detail & Related papers (2024-01-04T18:53:01Z) - Large Language Model Routing with Benchmark Datasets [40.42044096089315]
No single model typically achieves the best accuracy in all tasks and use cases.
We propose a new formulation for the problem, in which benchmark datasets are repurposed to learn a "router" model for this selection.
We show that this problem can be reduced to a collection of binary classification tasks.
arXiv Detail & Related papers (2023-09-27T17:08:40Z) - Prompt2Model: Generating Deployable Models from Natural Language Instructions [74.19816829003729]
Large language models (LLMs) enable system builders to create competent NLP systems through prompting.
In other ways, LLMs are a step backward from traditional special-purpose NLP models.
We propose Prompt2Model, a general-purpose method that takes a natural language task description like the prompts provided to LLMs.
arXiv Detail & Related papers (2023-08-23T17:28:21Z) - Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation [4.310519298899164]
In this work, we evaluate 10 open-source instructed LLMs on four representative code comprehension and generation tasks.
For the zero-shot setting, instructed LLMs are very competitive on code comprehension and generation tasks.
For the few-shot setting, we find that adding demonstration examples substantially helps instructed LLMs perform better.
arXiv Detail & Related papers (2023-08-02T15:54:22Z) - Scaling Sentence Embeddings with Large Language Models [43.19994568210206]
In this work, we propose an in-context learning-based method aimed at improving sentence embeddings performance.
Our approach involves adapting the previous prompt-based representation method for autoregressive models.
We find that scaling model size beyond tens of billions of parameters harms performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2023-07-31T13:26:03Z) - TART: A plug-and-play Transformer module for task-agnostic reasoning [38.84903599406189]
Large language models (LLMs) exhibit in-context learning abilities which enable the same model to perform several tasks without any task-specific training.
Traditional adaptation approaches, such as fine-tuning, modify the underlying models for each specific task.
We propose TART which generically improves an LLM's reasoning abilities using a synthetically trained Transformer-based reasoning module.
arXiv Detail & Related papers (2023-06-13T04:37:00Z)
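The formulation summarized in the "Large Language Model Routing with Benchmark Datasets" entry above, where routing reduces to a collection of binary classification tasks, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the word-overlap k-nearest-neighbour classifier is a toy stand-in rather than the classifier used in that paper, and the benchmark prompts, labels, and model names are invented.

```python
from typing import Dict, List, Tuple


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two prompts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


class CorrectnessClassifier:
    """One binary task per candidate model: was the model correct (1) or
    not (0) on benchmark prompts similar to the new one?"""

    def __init__(self, examples: List[Tuple[str, int]], k: int = 2) -> None:
        self.examples = examples
        self.k = k

    def predict_proba(self, prompt: str) -> float:
        # Average the correctness labels of the k most similar benchmark prompts.
        nearest = sorted(
            self.examples, key=lambda ex: jaccard(prompt, ex[0]), reverse=True
        )[: self.k]
        return sum(label for _, label in nearest) / len(nearest)


def route(prompt: str, classifiers: Dict[str, CorrectnessClassifier]) -> str:
    """Pick the model whose classifier predicts the highest chance of success."""
    return max(classifiers, key=lambda name: classifiers[name].predict_proba(prompt))


# Invented benchmark prompts with per-model correctness labels.
PROMPTS = [
    "compute the sum of two numbers",
    "solve the equation for x",
    "translate the sentence into french",
    "translate the word into german",
]
CLASSIFIERS = {
    "model-a": CorrectnessClassifier(list(zip(PROMPTS, [1, 1, 0, 0]))),
    "model-b": CorrectnessClassifier(list(zip(PROMPTS, [0, 0, 1, 1]))),
}

if __name__ == "__main__":
    print(route("translate the phrase into spanish", CLASSIFIERS))
```

Each candidate model gets its own classifier, so adding a new model only requires its correctness labels on the benchmark prompts, not retraining the others.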
This list is automatically generated from the titles and abstracts of the papers in this site.