Related papers: Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing

URL: http://arxiv.org/abs/2502.17282v1
Date: Mon, 24 Feb 2025 16:10:53 GMT
Title: Capability Instruction Tuning: A New Paradigm for Dynamic LLM Routing
Authors: Yi-Kai Zhang, De-Chuan Zhan, Han-Jia Ye
Abstract summary: Large Language Models (LLMs) have demonstrated human-like instruction-following abilities. In this work, we explore how to route the best-performing LLM for each instruction to achieve better overall performance. We develop a new paradigm, constructing capability instructions with model capability representation, user instruction, and performance inquiry prompts to assess the performance.
Score: 64.38277118982698
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Models (LLMs) have demonstrated human-like instruction-following abilities, particularly those exceeding 100 billion parameters. The combined capability of some smaller, resource-friendly LLMs can address most of the instructions that larger LLMs excel at. In this work, we explore how to route the best-performing LLM for each instruction to achieve better overall performance. We develop a new paradigm, constructing capability instructions with model capability representation, user instruction, and performance inquiry prompts to assess the performance. To learn from capability instructions, we introduce a new end-to-end framework called Model Selection with Aptitude Test (Model-SAT), which generates positive and negative samples based on what different models perform well or struggle with. Model-SAT uses a model capability encoder that extends its model representation to a lightweight LLM. Our experiments show that Model-SAT understands the performance dimensions of candidate models and provides the probabilities of their capability to handle various instructions. Additionally, during deployment, a new model can quickly infer its aptitude test results across 50 tasks, each with 20 shots. Model-SAT performs state-of-the-art model routing without candidate inference and in real-world new model-released scenarios. The code is available at https://github.com/Now-Join-Us/CIT-LLM-Routing
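
The routing idea in the abstract can be illustrated with a minimal sketch. It is not the Model-SAT implementation: the prompt template, the function names (`build_capability_instruction`, `route`, `toy_scorer`), and the keyword-based scorer are all assumptions made for illustration. In the paper, the score would come from the trained router rather than a heuristic.

```python
import re
from typing import Callable, Dict, Tuple


def build_capability_instruction(capability: Dict[str, float], instruction: str) -> str:
    """Join a capability representation, a user instruction, and a performance inquiry.

    The wording of this template is hypothetical; the source does not give the real prompt.
    """
    profile = ", ".join(f"{task}: {acc:.2f}" for task, acc in sorted(capability.items()))
    return (
        f"Model aptitude-test accuracy by task: {profile}\n"
        f"User instruction: {instruction}\n"
        "Can this model answer the instruction correctly? Answer yes or no."
    )


def route(
    candidates: Dict[str, Dict[str, float]],
    instruction: str,
    scorer: Callable[[str], float],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate without running it, then pick the highest score."""
    scores = {
        name: scorer(build_capability_instruction(capability, instruction))
        for name, capability in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


def toy_scorer(prompt: str) -> float:
    """Stand-in for a trained router (assumption): a keyword lookup, not a learned model.

    Returns the accuracy of any task whose name appears in the instruction,
    otherwise the mean accuracy over the profile.
    """
    profile_line, instruction_line = prompt.splitlines()[:2]
    accuracies = {
        task: float(acc) for task, acc in re.findall(r"(\w+): (\d\.\d+)", profile_line)
    }
    matched = [acc for task, acc in accuracies.items() if task in instruction_line.lower()]
    return max(matched) if matched else sum(accuracies.values()) / len(accuracies)


# Hypothetical candidates with made-up aptitude-test accuracies.
CANDIDATES = {
    "model_a": {"math": 0.80, "code": 0.30},
    "model_b": {"math": 0.40, "code": 0.70},
}
```

Calling `route(CANDIDATES, "Solve this math problem: 12 * 7", toy_scorer)` returns the selected model name together with the per-model scores. The point of the design is that only the scorer runs at routing time, so no candidate model has to generate a response first.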

Related papers

EmbedLLM: Learning Compact Representations of Large Language Models [28.49433308281983]
We propose EmbedLLM, a framework designed to learn compact vector representations of Large Language Models. We introduce an encoder-decoder approach for learning such embeddings, along with a systematic framework to evaluate their effectiveness. Empirical results show that EmbedLLM outperforms prior methods in model routing both in accuracy and latency.
arXiv Detail & Related papers (2024-10-03T05:43:24Z)
Reference Trustable Decoding: A Training-Free Augmentation Paradigm for Large Language Models [79.41139393080736]
Large language models (LLMs) have rapidly advanced and demonstrated impressive capabilities. In-Context Learning (ICL) and Parameter-Efficient Fine-Tuning (PEFT) are currently two mainstream methods for augmenting LLMs to downstream tasks. We propose Reference Trustable Decoding (RTD), a paradigm that allows models to quickly adapt to new tasks without fine-tuning.
arXiv Detail & Related papers (2024-09-30T10:48:20Z)
Align$^2$LLaVA: Cascaded Human and Large Language Model Preference Alignment for Multi-modal Instruction Curation [56.75665429851673]
This paper introduces a novel instruction curation algorithm, derived from two unique perspectives, human and LLM preference alignment. Experiments demonstrate that we can maintain or even improve model performance by compressing synthetic multimodal instructions by up to 90%.
arXiv Detail & Related papers (2024-09-27T08:20:59Z)
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts. We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM. We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
LLM Augmented LLMs: Expanding Capabilities through Composition [56.40953749310957]
CALM -- Composition to Augment Language Models -- introduces cross-attention between models to compose their representations and enable new capabilities. We illustrate that augmenting PaLM2-S with a smaller model trained on low-resource languages results in an absolute improvement of up to 13% on tasks like translation into English. When PaLM2-S is augmented with a code-specific model, we see a relative improvement of 40% over the base model for code generation and explanation tasks.
arXiv Detail & Related papers (2024-01-04T18:53:01Z)
Large Language Model Routing with Benchmark Datasets [40.42044096089315]
No single model typically achieves the best accuracy in all tasks and use cases. We propose a new formulation for the problem, in which benchmark datasets are repurposed to learn a "router" model for this selection. We show that this problem can be reduced to a collection of binary classification tasks.
arXiv Detail & Related papers (2023-09-27T17:08:40Z)
Prompt2Model: Generating Deployable Models from Natural Language Instructions [74.19816829003729]
Large language models (LLMs) enable system builders to create competent NLP systems through prompting. In other ways, LLMs are a step backward from traditional special-purpose NLP models. We propose Prompt2Model, a general-purpose method that takes a natural language task description like the prompts provided to LLMs.
arXiv Detail & Related papers (2023-08-23T17:28:21Z)
Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation [4.310519298899164]
In this work, we evaluate 10 open-source instructed LLMs on four representative code comprehension and generation tasks. For the zero-shot setting, instructed LLMs are very competitive on code comprehension and generation tasks. For the few-shot setting, we find that adding demonstration examples substantially helps instructed LLMs perform better.
arXiv Detail & Related papers (2023-08-02T15:54:22Z)
Scaling Sentence Embeddings with Large Language Models [43.19994568210206]
In this work, we propose an in-context learning-based method aimed at improving sentence embedding performance. Our approach involves adapting the previous prompt-based representation method for autoregressive models. By scaling model size, we find that scaling to more than tens of billions of parameters harms performance on semantic textual similarity tasks.
arXiv Detail & Related papers (2023-07-31T13:26:03Z)
TART: A plug-and-play Transformer module for task-agnostic reasoning [38.84903599406189]
Large language models (LLMs) exhibit in-context learning abilities which enable the same model to perform several tasks without any task-specific training. Traditional adaptation approaches, such as fine-tuning, modify the underlying models for each specific task. We propose TART which generically improves an LLM's reasoning abilities using a synthetically trained Transformer-based reasoning module.
arXiv Detail & Related papers (2023-06-13T04:37:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.