Prompt2Model: Generating Deployable Models from Natural Language Instructions
- URL: http://arxiv.org/abs/2308.12261v1
- Date: Wed, 23 Aug 2023 17:28:21 GMT
- Title: Prompt2Model: Generating Deployable Models from Natural Language Instructions
- Authors: Vijay Viswanathan, Chenyang Zhao, Amanda Bertsch, Tongshuang Wu, Graham Neubig
- Abstract summary: Large language models (LLMs) enable system builders to create competent NLP systems through prompting.
However, in other ways, LLMs are a step backward from traditional special-purpose NLP models.
We propose Prompt2Model, a general-purpose method that takes a natural language task description, like the prompts provided to LLMs, and uses it to train a special-purpose model that is conducive to deployment.
- Score: 74.19816829003729
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) enable system builders today to create competent
NLP systems through prompting, where they only need to describe the task in
natural language and provide a few examples. However, in other ways, LLMs are a
step backward from traditional special-purpose NLP models; they require
extensive computational resources for deployment and can be gated behind APIs.
In this paper, we propose Prompt2Model, a general-purpose method that takes a
natural language task description like the prompts provided to LLMs, and uses
it to train a special-purpose model that is conducive to deployment. This is
done through a multi-step process of retrieval of existing datasets and
pretrained models, dataset generation using LLMs, and supervised fine-tuning on
these retrieved and generated datasets. Over three tasks, we demonstrate that
given the same few-shot prompt as input, Prompt2Model trains models that
outperform the results of a strong LLM, gpt-3.5-turbo, by an average of 20%
while being up to 700 times smaller. We also show that this data can be used to
obtain reliable estimates of model performance, enabling model
developers to assess model reliability before deployment. Prompt2Model is
available open-source at https://github.com/neulab/prompt2model.
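
The abstract describes a three-stage pipeline: retrieve existing datasets and pretrained models, generate additional data with an LLM, and fine-tune on the combined data. The sketch below illustrates that flow in Python; the function names, prompts, and the `finetune` callable are illustrative placeholders, not the actual prompt2model API.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Example = Tuple[str, str]  # (input text, output text)

@dataclass
class TaskSpec:
    """A natural-language task description plus a few demonstrations."""
    instruction: str
    demos: List[Example]

def retrieve_datasets(spec: TaskSpec) -> List[Example]:
    """Placeholder for dataset retrieval: query an index of existing
    datasets with the instruction and return matching examples."""
    return []

def retrieve_pretrained_model(spec: TaskSpec) -> str:
    """Placeholder for model retrieval: pick a small base model by task type."""
    return "small-seq2seq-base"

def generate_dataset(spec: TaskSpec, llm: Callable[[str], str],
                     n_inputs: int = 50) -> List[Example]:
    """Dataset generation: ask an LLM for new inputs, then label each one."""
    synthetic = []
    for i in range(n_inputs):
        new_input = llm(f"{spec.instruction}\nWrite one new example input (#{i}).")
        label = llm(f"{spec.instruction}\nInput: {new_input}\nOutput:")
        synthetic.append((new_input, label))
    return synthetic

def prompt2model_pipeline(spec: TaskSpec, llm: Callable[[str], str],
                          finetune: Callable[[str, List[Example]], object]):
    """Retrieval -> generation -> supervised fine-tuning, as in the abstract."""
    train_data = retrieve_datasets(spec) + generate_dataset(spec, llm)
    base_model = retrieve_pretrained_model(spec)
    return finetune(base_model, train_data)  # a small, deployable model
```
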
Related papers
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z)
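
A minimal sketch of the self-synthetic finetuning loop summarized above, assuming the student LLM is exposed as a simple text-in/text-out callable; the prompts and the filtering rule are illustrative, not taken from the paper.

```python
from typing import Callable, List, Tuple

def self_guide(student: Callable[[str], str],
               task_prompt: str,
               finetune: Callable[[List[Tuple[str, str]]], None],
               n_pairs: int = 100,
               keep: Callable[[str, str], bool] = lambda x, y: bool(y.strip())):
    """Self-synthetic finetuning: the student model both proposes inputs and
    labels them, the pairs are filtered, and the same student is finetuned
    on the surviving synthetic data."""
    pairs = []
    for i in range(n_pairs):
        x = student(f"{task_prompt}\nGenerate one new input example (#{i}).")
        y = student(f"{task_prompt}\nInput: {x}\nOutput:")
        if keep(x, y):       # simple quality filter on the synthetic pair
            pairs.append((x, y))
    finetune(pairs)          # update the student on its own synthetic data
    return pairs
```
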
- Large Language Models Can Automatically Engineer Features for Few-Shot Tabular Learning [35.03338699349037]
We propose a novel in-context learning framework, FeatLLM, which employs Large Language Models as feature engineers.
FeatLLM generates high-quality rules, significantly (10% on average) outperforming alternatives such as TabLLM and STUNT.
arXiv Detail & Related papers (2024-04-15T06:26:08Z)
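
A rough sketch of the "LLM as feature engineer" idea summarized above: the LLM proposes boolean rules over the table's columns, which are then turned into binary features for a simple downstream classifier. The prompt and rule format are assumptions, not FeatLLM's actual interface.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]

def llm_engineered_rules(llm: Callable[[str], str],
                         columns: List[str],
                         task_description: str,
                         n_rules: int = 5) -> List[str]:
    """Ask an LLM to propose simple boolean rules over the table's columns."""
    prompt = (f"Task: {task_description}\nColumns: {', '.join(columns)}\n"
              f"Write {n_rules} boolean rules, one per line, as Python "
              f"expressions over the column names, e.g. age > 40.")
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]

def apply_rules(rows: List[Row], rules: List[str]) -> List[List[int]]:
    """Turn each rule into a 0/1 feature per row (eval is for demo only)."""
    features = []
    for row in rows:
        features.append([int(bool(eval(rule, {}, dict(row)))) for rule in rules])
    return features  # feed these binary features to any simple classifier
```
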
- LLM Augmented LLMs: Expanding Capabilities through Composition [56.40953749310957]
CALM -- Composition to Augment Language Models -- introduces cross-attention between models to compose their representations and enable new capabilities.
We illustrate that augmenting PaLM2-S with a smaller model trained on low-resource languages results in an absolute improvement of up to 13% on tasks like translation into English.
When PaLM2-S is augmented with a code-specific model, we see a relative improvement of 40% over the base model for code generation and explanation tasks.
arXiv Detail & Related papers (2024-01-04T18:53:01Z)
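
A minimal PyTorch sketch of the cross-attention composition described above: hidden states of the anchor model attend to hidden states of the augmenting model through a small trainable block, while both base models stay frozen. The dimensions and the residual/LayerNorm choices here are assumptions, not the paper's exact architecture.

```python
import torch
from torch import nn

class CrossModelComposer(nn.Module):
    """Composes two frozen models by letting the anchor model's hidden states
    cross-attend to the augmenting model's hidden states; only this block is
    trained."""

    def __init__(self, anchor_dim: int, aug_dim: int, num_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(
            embed_dim=anchor_dim, num_heads=num_heads,
            kdim=aug_dim, vdim=aug_dim, batch_first=True)
        self.norm = nn.LayerNorm(anchor_dim)

    def forward(self, anchor_hidden: torch.Tensor,
                aug_hidden: torch.Tensor) -> torch.Tensor:
        # anchor_hidden: (batch, seq, anchor_dim); aug_hidden: (batch, seq, aug_dim)
        attended, _ = self.cross_attn(anchor_hidden, aug_hidden, aug_hidden)
        return self.norm(anchor_hidden + attended)  # residual composition

# usage: composed = CrossModelComposer(4096, 1024)(h_anchor, h_augmenting)
```
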
- ArthModel: Enhance Arithmetic Skills to Large Language Model [0.0]
This work provides different ways of thinking, training and using a language model.
The code and models will be released at https://www.eteced.com/eteced/arithmetic_finetuning_v1.
arXiv Detail & Related papers (2023-11-30T15:06:50Z)
- LLM-Pruner: On the Structural Pruning of Large Language Models [65.02607075556742]
Large language models (LLMs) have shown remarkable capabilities in language understanding and generation.
We tackle the compression of LLMs within the bound of two constraints: being task-agnostic and minimizing the reliance on the original training dataset.
Our method, named LLM-Pruner, adopts structural pruning that selectively removes non-critical coupled structures.
arXiv Detail & Related papers (2023-05-19T12:10:53Z)
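
A minimal sketch of pruning "coupled structures": removing a hidden unit of an MLP deletes a row of the first projection and the matching column of the second projection together. The weight-norm importance score below is a stand-in for LLM-Pruner's actual criterion, and the two-layer MLP is only an illustration.

```python
import torch
from torch import nn
from typing import Tuple

def prune_mlp_hidden_units(fc1: nn.Linear, fc2: nn.Linear,
                           keep_ratio: float = 0.5) -> Tuple[nn.Linear, nn.Linear]:
    """Structurally prune the hidden units of fc2(act(fc1(x))): hidden unit i
    couples row i of fc1 with column i of fc2, so both are removed together."""
    importance = fc1.weight.norm(dim=1) + fc2.weight.norm(dim=0)  # per hidden unit
    k = max(1, int(keep_ratio * fc1.out_features))
    keep = torch.topk(importance, k).indices.sort().values

    new_fc1 = nn.Linear(fc1.in_features, k, bias=fc1.bias is not None)
    new_fc2 = nn.Linear(k, fc2.out_features, bias=fc2.bias is not None)
    with torch.no_grad():
        new_fc1.weight.copy_(fc1.weight[keep])
        if fc1.bias is not None:
            new_fc1.bias.copy_(fc1.bias[keep])
        new_fc2.weight.copy_(fc2.weight[:, keep])
        if fc2.bias is not None:
            new_fc2.bias.copy_(fc2.bias)
    return new_fc1, new_fc2
```
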
- CodeGen2: Lessons for Training LLMs on Programming and Natural Languages [116.74407069443895]
We unify encoder- and decoder-based models into a single prefix-LM.
For learning methods, we explore the claim of a "free lunch" hypothesis.
For data distributions, the effect of a mixture distribution and multi-epoch training of programming and natural languages on model performance is explored.
arXiv Detail & Related papers (2023-05-03T17:55:25Z)
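
A small sketch of the attention pattern that defines a prefix-LM, the formulation used above to unify encoder- and decoder-style models: prefix tokens attend bidirectionally, while the remaining tokens attend causally. The mask construction below is a generic illustration, not CodeGen2's training code.

```python
import torch

def prefix_lm_mask(seq_len: int, prefix_len: int) -> torch.Tensor:
    """Return a boolean attention mask where True means 'may attend':
    full attention within the prefix, causal attention elsewhere."""
    causal = torch.ones(seq_len, seq_len).tril().bool()
    mask = causal.clone()
    mask[:prefix_len, :prefix_len] = True  # bidirectional within the prefix
    return mask

# e.g. prefix_lm_mask(6, 3): positions 0-2 see each other fully,
# while positions 3-5 only see earlier positions.
```
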
- AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators [98.11286353828525]
GPT-3.5 series models have demonstrated remarkable few-shot and zero-shot ability across various NLP tasks.
We propose AnnoLLM, which adopts a two-step approach, explain-then-annotate.
We build the first conversation-based information retrieval dataset employing AnnoLLM.
arXiv Detail & Related papers (2023-03-29T17:03:21Z)
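
A minimal sketch of the two-step explain-then-annotate recipe described above, assuming the LLM is a plain text-in/text-out callable; the prompt wording is an assumption, not the paper's template.

```python
from typing import Callable, List, Tuple

def explain_then_annotate(llm: Callable[[str], str],
                          task: str,
                          demos: List[Tuple[str, str]],
                          items: List[str]) -> List[str]:
    """Step 1: ask the LLM to explain the gold label of each demonstration.
    Step 2: reuse those explanations as context when annotating new items."""
    explained = []
    for text, label in demos:
        why = llm(f"{task}\nText: {text}\nLabel: {label}\n"
                  f"Explain briefly why this label is correct.")
        explained.append(f"Text: {text}\nLabel: {label}\nReason: {why}")
    context = f"{task}\n\n" + "\n\n".join(explained)
    return [llm(f"{context}\n\nText: {item}\nLabel:") for item in items]
```
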