LINGO: Visually Debiasing Natural Language Instructions to Support Task Diversity
- URL: http://arxiv.org/abs/2304.06184v1
- Date: Wed, 12 Apr 2023 22:55:52 GMT
- Title: LINGO: Visually Debiasing Natural Language Instructions to Support Task Diversity
- Authors: Anjana Arunkumar, Shubham Sharma, Rakhi Agrawal, Sriram
Chandrasekaran, Chris Bryan
- Abstract summary: We develop LINGO, a novel visual analytics interface that supports an effective, task-driven workflow.
We conduct a user study with both novice and expert instruction creators, over a dataset of 1,616 linguistic tasks and their natural language instructions.
For both user groups, LINGO promotes the creation of tasks that are more difficult for pre-trained models and that contain higher linguistic diversity and lower instruction bias.
- Score: 11.44413929033824
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Cross-task generalization is a significant outcome that defines mastery in
natural language understanding. Humans show a remarkable aptitude for this, and
can solve many different types of tasks, given definitions in the form of
textual instructions and a small set of examples. Recent work with pre-trained
language models mimics this learning style: users can define and exemplify a
task for the model to attempt as a series of natural language prompts or
instructions. While prompting approaches have led to higher cross-task
generalization compared to traditional supervised learning, analyzing 'bias' in
the task instructions given to the model is a difficult problem, and has thus
been relatively unexplored. For instance, are we truly modeling a task, or are
we modeling a user's instructions? To help investigate this, we develop LINGO,
a novel visual analytics interface that supports an effective, task-driven
workflow to (1) help identify bias in natural language task instructions, (2)
alter (or create) task instructions to reduce bias, and (3) evaluate
pre-trained model performance on debiased task instructions. To robustly
evaluate LINGO, we conduct a user study with both novice and expert instruction
creators, over a dataset of 1,616 linguistic tasks and their natural language
instructions, spanning 55 different languages. For both user groups, LINGO
promotes the creation of tasks that are more difficult for pre-trained models and that
contain higher linguistic diversity and lower instruction bias. We additionally
discuss how the insights learned in developing and evaluating LINGO can aid in
the design of future dashboards that aim to minimize the effort involved in
prompt creation across multiple domains.
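As a rough illustration of the prompting setup the abstract describes, the minimal sketch below assembles an instruction-style prompt from a task definition and a few examples, and compares two instruction variants for the same task with a crude lexical-diversity proxy. The prompt template, the sentiment task, and the type/token heuristic are illustrative assumptions, not LINGO's actual interface or metrics.

```python
# Minimal sketch (assumptions, not LINGO's implementation): build an
# instruction-style prompt and compare two instruction variants for the
# same task with a simple lexical-diversity proxy.

def build_prompt(definition, examples, instance):
    """Assemble a definition + examples + instance prompt for a pre-trained LM."""
    lines = [f"Definition: {definition}", ""]
    for x, y in examples:
        lines += [f"Input: {x}", f"Output: {y}", ""]
    lines += [f"Input: {instance}", "Output:"]
    return "\n".join(lines)

def type_token_ratio(text):
    """Share of unique words: a rough proxy for lexical variety in an instruction."""
    tokens = text.lower().split()
    return len(set(tokens)) / max(len(tokens), 1)

# Two hypothetical instruction variants for the same sentiment task.
variant_a = "Classify the sentiment of the sentence as positive or negative."
variant_b = ("Read the sentence and decide whether the author's feeling is "
             "positive or negative; answer with a single word.")

examples = [("I loved the film.", "positive"), ("The plot was dull.", "negative")]

for name, definition in [("A", variant_a), ("B", variant_b)]:
    prompt = build_prompt(definition, examples, "The acting felt wooden.")
    print(f"Variant {name}: type/token ratio = {type_token_ratio(definition):.2f}")
    print(prompt, end="\n\n")
```

Comparing how a model behaves across such variants is one way to probe whether it is modeling the task or the particular wording of a user's instructions.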
Related papers
- An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models [99.31449616860291]
Modern language models (LMs) can learn to perform new tasks in different ways.
In instruction following, the target task is described explicitly in natural language; in few-shot prompting, the task is specified implicitly.
In instruction inference, LMs are presented with in-context examples and are then prompted to generate a natural language task description.
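To make the three settings concrete, the short sketch below builds one prompt per mode for a hypothetical translation task; the templates are illustrative assumptions rather than the paper's exact formats.

```python
# Illustrative prompt formats for the three settings described above
# (templates are assumptions, not the paper's exact wording).

examples = [("chat", "cat"), ("chien", "dog")]  # hypothetical FR->EN pairs
query = "cheval"

# 1) Instruction following: the task is stated explicitly in natural language.
instruction_prompt = (
    "Translate the French word into English.\n"
    f"Input: {query}\nOutput:"
)

# 2) Few-shot prompting: the task is only implied by input-output examples.
few_shot_prompt = "\n".join(
    [f"Input: {x}\nOutput: {y}" for x, y in examples] + [f"Input: {query}\nOutput:"]
)

# 3) Instruction inference: the model is asked to describe the task itself.
inference_prompt = "\n".join(
    [f"Input: {x}\nOutput: {y}" for x, y in examples]
    + ["Instruction: describe, in one sentence, the task mapping inputs to outputs."]
)

for name, p in [("instruction following", instruction_prompt),
                ("few-shot prompting", few_shot_prompt),
                ("instruction inference", inference_prompt)]:
    print(f"--- {name} ---\n{p}\n")
```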
arXiv Detail & Related papers (2024-04-03T19:31:56Z)
- UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions [64.50935101415776]
We build a single model that jointly performs various spoken language understanding (SLU) tasks.
We demonstrate the efficacy of our single multi-task learning model "UniverSLU" for 12 speech classification and sequence generation task types spanning 17 datasets and 9 languages.
arXiv Detail & Related papers (2023-10-04T17:10:23Z)
- Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learning [74.70157466822612]
We systematically study the role of task definitions in instruction learning.
We find that model performance drops substantially when removing contents describing the task output.
We propose two strategies to help models better leverage task instructions.
arXiv Detail & Related papers (2023-06-01T21:11:24Z)
- Large Language Model Instruction Following: A Survey of Progresses and Challenges [15.94137745420097]
This paper summarizes and provides insights into current research on instruction following.
To our knowledge, this is the first comprehensive survey about instruction following.
arXiv Detail & Related papers (2023-03-18T19:17:47Z)
- Robustness of Learning from Task Instructions [15.462970803323563]
Traditional supervised learning mostly works on individual tasks and requires training on a large set of task-specific examples.
To build a system that can quickly and easily generalize to new tasks, task instructions have been adopted as an emerging trend of supervision.
This work investigates the system robustness when the instructions of new tasks are (i) manipulated, (ii) paraphrased, or (iii) from different levels of conciseness.
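A hedged sketch of the kinds of perturbations such a robustness study might apply to an instruction appears below; the specific transformations (random word dropout, a hand-written paraphrase, truncation for conciseness) are illustrative assumptions, not the paper's protocol.

```python
import random

# Illustrative instruction perturbations (assumed, not the paper's exact protocol):
# (i) manipulation via random word dropout, (ii) a hand-written paraphrase,
# (iii) a more concise variant obtained by truncation.

def drop_words(instruction, p=0.15, seed=0):
    """Randomly remove a fraction of words to simulate a manipulated instruction."""
    rng = random.Random(seed)
    words = instruction.split()
    kept = [w for w in words if rng.random() > p]
    return " ".join(kept) if kept else instruction

def truncate(instruction, n_words=8):
    """Keep only the first n_words words to simulate a terser instruction."""
    return " ".join(instruction.split()[:n_words])

original = ("Given a product review, decide whether the overall sentiment "
            "is positive or negative and answer with one word.")
paraphrase = ("Read the review and say, in a single word, whether it sounds "
              "positive or negative.")

for label, variant in [("original", original),
                       ("manipulated", drop_words(original)),
                       ("paraphrased", paraphrase),
                       ("truncated", truncate(original))]:
    print(f"{label:>12}: {variant}")
```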
arXiv Detail & Related papers (2022-12-07T17:54:59Z)
- Instruction Induction: From Few Examples to Natural Language Task Descriptions [55.139554327372934]
We show that language models can explicitly infer an underlying task from a few demonstrations by prompting them to generate a natural language instruction that fits the examples.
InstructGPT achieves 65.7% of human performance in our execution-based metric, while the original GPT-3 model reaches only 9.8% of human performance.
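As a rough illustration of an execution-based check, the sketch below formats an induction prompt from demonstrations, then applies a stand-in model under a candidate instruction to held-out examples and reports accuracy; the toy model and task are placeholders, not the paper's models or benchmark.

```python
# Sketch of an execution-based check (assumptions throughout): given an
# induced instruction, run a stand-in "model" on held-out examples and
# report how often its outputs match the references.

def induction_prompt(demos):
    """Format demonstrations and ask for the instruction that explains them."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append("The instruction was:")
    return "\n\n".join(lines)

def toy_model(instruction, x):
    """Placeholder for a real LM call; here it simply uppercases the input."""
    return x.upper()

demos = [("cat", "CAT"), ("dog", "DOG")]
held_out = [("fish", "FISH"), ("bird", "BIRD"), ("ox", "OX")]

print(induction_prompt(demos))
induced = "Write the input word in capital letters."  # e.g. what an LM might induce

correct = sum(toy_model(induced, x) == y for x, y in held_out)
print(f"Execution accuracy: {correct / len(held_out):.0%}")
```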
arXiv Detail & Related papers (2022-05-22T09:22:37Z)
- Analyzing the Limits of Self-Supervision in Handling Bias in Language [52.26068057260399]
We evaluate how well language models capture the semantics of four bias-related tasks: diagnosis, identification, extraction, and rephrasing.
Our analyses indicate that language models are capable of performing these tasks to widely varying degrees across different bias dimensions, such as gender and political affiliation.
arXiv Detail & Related papers (2021-12-16T05:36:08Z)
- Ask Your Humans: Using Human Instructions to Improve Generalization in Reinforcement Learning [32.82030512053361]
We propose the use of step-by-step human demonstrations in the form of natural language instructions and action trajectories.
We find that human demonstrations help solve the most complex tasks.
We also find that incorporating natural language allows the model to generalize to unseen tasks in a zero-shot setting.
arXiv Detail & Related papers (2020-11-01T14:39:46Z)
- The Turking Test: Can Language Models Understand Instructions? [45.266428794559495]
We present the Turking Test, which examines a model's ability to follow natural language instructions of varying complexity.
Despite our lenient evaluation methodology, we observe that a large pretrained language model performs poorly across all tasks.
arXiv Detail & Related papers (2020-10-22T18:44:16Z)