Functionality learning through specification instructions
- URL: http://arxiv.org/abs/2311.08481v2
- Date: Wed, 09 Oct 2024 11:54:38 GMT
- Title: Functionality learning through specification instructions
- Authors: Pedro Henrique Luz de Araujo, Benjamin Roth
- Abstract summary: Test suites assess natural language processing models' performance on specific functionalities.
This paper introduces specification instructions: text descriptions specifying fine-grained task-specific behaviors.
We combine the specification instructions to create specification-augmented prompts, which we feed to language models pre-trained on natural instruction data.
- Score: 2.4095382017500464
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Test suites assess natural language processing models' performance on specific functionalities: cases of interest involving model robustness, fairness, or particular linguistic capabilities. This paper introduces specification instructions: text descriptions specifying fine-grained task-specific behaviors. For each functionality in a suite, we generate an instruction that describes it. We combine the specification instructions to create specification-augmented prompts, which we feed to language models pre-trained on natural instruction data. We conduct experiments to measure how optimizing for some functionalities may negatively impact functionalities that are not covered by the specification set. Our analyses across four tasks and models of diverse sizes and families show that smaller models struggle to follow specification instructions. However, larger models (roughly 3B parameters or more) can benefit from specifications and -- surprisingly -- even generalize certain desirable behaviors across functionalities.
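The prompt construction described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the function name, specification wording, and sentiment-classification example are all hypothetical.

```python
# Hypothetical sketch of a specification-augmented prompt. The helper name,
# specification texts, and task example are illustrative assumptions, not
# reproduced from the paper.

def build_specification_prompt(specifications, task_instruction, task_input):
    """Concatenate per-functionality specification instructions with the task prompt."""
    spec_block = "\n".join(f"- {spec}" for spec in specifications)
    return (
        f"Follow these behavioral specifications:\n{spec_block}\n\n"
        f"{task_instruction}\n"
        f"Input: {task_input}\n"
        f"Answer:"
    )

# One instruction per functionality in the suite (illustrative examples).
specs = [
    "Negated positive statements express negative sentiment.",
    "Predictions must not change when person names are swapped.",
]
prompt = build_specification_prompt(
    specs,
    "Classify the sentiment of the input as positive or negative.",
    "The movie was not good at all.",
)
```

The resulting string would then be sent to an instruction-tuned language model as a single prompt; the paper's experiments vary which functionality descriptions are included in the specification set.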
Related papers
- Evaluating the Instruction-following Abilities of Language Models using Knowledge Tasks [4.945902994386117]
We focus on developing a benchmark for instruction-following where it is easy to verify both task performance as well as instruction-following capabilities.
We adapt existing knowledge benchmarks and augment them with instructions that are a) conditional on correctly answering the knowledge task or b) use the space of candidate options in multiple-choice knowledge-answering tasks.
We find that even large-scale instruction-tuned LLMs fail to follow simple instructions in zero-shot settings.
arXiv Detail & Related papers (2024-10-16T19:07:37Z)
- Eliciting Instruction-tuned Code Language Models' Capabilities to Utilize Auxiliary Function for Code Generation [25.434546255499242]
We study the code generation behavior of instruction-tuned models built on top of code pre-trained language models.
We design several ways to provide auxiliary functions to the models by adding them to the query or providing a response prefix.
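The two placements described above can be sketched as follows. The helper function, query text, and function names are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of two ways to provide an auxiliary function to a
# code-generation model: inside the query itself, or as a forced response
# prefix the model continues from. All strings here are illustrative.

AUX_FUNCTION = (
    "def is_even(n: int) -> bool:\n"
    "    return n % 2 == 0"
)

QUERY = "Write a function that returns all even numbers in a list."

def prompt_with_auxiliary_in_query(query, aux):
    """Place the helper source directly inside the instruction."""
    return f"{query}\nYou may use this helper function:\n{aux}"

def prompt_with_response_prefix(query, aux):
    """Return the query plus a forced response prefix; the model's
    generation continues after the helper code."""
    return query, aux + "\n"

q1 = prompt_with_auxiliary_in_query(QUERY, AUX_FUNCTION)
q2, prefix = prompt_with_response_prefix(QUERY, AUX_FUNCTION)
```

In the first variant the model sees the helper as part of the instruction; in the second, decoding starts after the helper, so the model treats it as code it has already written.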
arXiv Detail & Related papers (2024-09-20T22:28:20Z)
- Third-Party Language Model Performance Prediction from Instruction [59.574169249307054]
Language model-based instruction-following systems have lately shown increasing performance on many benchmark tasks.
A user may prompt a model with an instruction without knowing whether the response can be expected to be accurate.
We propose a third party performance prediction framework, where a separate model is trained to predict the metric resulting from evaluating an instruction-following system on a task.
arXiv Detail & Related papers (2024-03-19T03:53:47Z)
- Specialist or Generalist? Instruction Tuning for Specific NLP Tasks [58.422495509760154]
We investigate whether incorporating broad-coverage generalist instruction tuning can contribute to building a specialist model.
Our experiments assess four target tasks with distinct coverage levels.
The effect is particularly pronounced when the amount of task-specific training data is limited.
arXiv Detail & Related papers (2023-10-23T19:46:48Z)
- UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions [64.50935101415776]
We build a single model that jointly performs various spoken language understanding (SLU) tasks.
We demonstrate the efficacy of our single multi-task learning model "UniverSLU" for 12 speech classification and sequence generation task types spanning 17 datasets and 9 languages.
arXiv Detail & Related papers (2023-10-04T17:10:23Z)
- Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learning [74.70157466822612]
We systematically study the role of task definitions in instruction learning.
We find that model performance drops substantially when removing contents describing the task output.
We propose two strategies to help models better leverage task instructions.
arXiv Detail & Related papers (2023-06-01T21:11:24Z)
- Instruction Induction: From Few Examples to Natural Language Task Descriptions [55.139554327372934]
We show that language models can explicitly infer an underlying task from a few demonstrations by prompting them to generate a natural language instruction that fits the examples.
InstructGPT achieves 65.7% of human performance in our execution-based metric, while the original GPT-3 model reaches only 9.8% of human performance.
arXiv Detail & Related papers (2022-05-22T09:22:37Z)
- Multi Task Learning For Zero Shot Performance Prediction of Multilingual Models [12.759281077118567]
Massively multilingual Transformer-based language models have been observed to be surprisingly effective at zero-shot cross-lingual transfer.
We build upon some of the existing techniques for predicting the zero-shot performance on a task, by modeling it as a multi-task learning problem.
arXiv Detail & Related papers (2022-05-12T14:47:03Z)
- Quantifying Adaptability in Pre-trained Language Models with 500 Tasks [60.0364822929442]
We present a large-scale empirical study of the features and limits of LM adaptability using a new benchmark, TaskBench500.
We evaluate three facets of adaptability, finding that adaptation procedures differ dramatically in their ability to memorize small datasets.
Our experiments show that adaptability to new tasks, like generalization to new examples, can be systematically described and understood.
arXiv Detail & Related papers (2021-12-06T18:00:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.