An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models
- URL: http://arxiv.org/abs/2404.03028v2
- Date: Wed, 10 Apr 2024 19:03:00 GMT
- Title: An Incomplete Loop: Deductive, Inductive, and Abductive Learning in Large Language Models
- Authors: Emmy Liu, Graham Neubig, Jacob Andreas
- Abstract summary: Modern language models (LMs) can learn to perform new tasks in different ways.
In instruction following, the target task is described explicitly in natural language; in few-shot prompting, the task is specified implicitly with a small number of examples.
In instruction inference, LMs are presented with in-context examples and are then prompted to generate a natural language task description.
- Score: 99.31449616860291
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Modern language models (LMs) can learn to perform new tasks in different ways: in instruction following, the target task is described explicitly in natural language; in few-shot prompting, the task is specified implicitly with a small number of examples; in instruction inference, LMs are presented with in-context examples and are then prompted to generate a natural language task description before making predictions. Each of these procedures may be thought of as invoking a different form of reasoning: instruction following involves deductive reasoning, few-shot prompting involves inductive reasoning, and instruction inference involves abductive reasoning. How do these different capabilities relate? Across four LMs (from the gpt and llama families) and two learning problems (involving arithmetic functions and machine translation) we find a strong dissociation between the different types of reasoning: LMs can sometimes learn effectively from few-shot prompts even when they are unable to explain their own prediction rules; conversely, they sometimes infer useful task descriptions while completely failing to learn from human-generated descriptions of the same task. Our results highlight the non-systematic nature of reasoning even in some of today's largest LMs, and underscore the fact that very different learning mechanisms may be invoked by seemingly similar prompting procedures.
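The three prompting procedures named in the abstract can be sketched as plain prompt builders. This is a minimal illustration only: the function names, the "Input:/Output:" template, and the toy arithmetic examples are assumptions made here, not the prompts or data used in the paper, and no model is called.

```python
from typing import List, Tuple

# An in-context example is an (input, output) pair of strings.
Example = Tuple[str, str]


def format_examples(examples: List[Example]) -> str:
    """Render examples in a simple Input/Output template (hypothetical format)."""
    return "\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples)


def instruction_following_prompt(instruction: str, query: str) -> str:
    """Deductive setting: the task is stated explicitly, with no examples."""
    return f"Instruction: {instruction}\nInput: {query}\nOutput:"


def few_shot_prompt(examples: List[Example], query: str) -> str:
    """Inductive setting: the task is implied only by a few examples."""
    return f"{format_examples(examples)}\nInput: {query}\nOutput:"


def hypothesis_prompt(examples: List[Example]) -> str:
    """Abductive setting, step 1: ask for a natural language task description."""
    return f"{format_examples(examples)}\nDescribe the rule that maps each input to its output:"


def prediction_with_hypothesis_prompt(
    hypothesis: str, examples: List[Example], query: str
) -> str:
    """Abductive setting, step 2: predict using the generated description."""
    return f"Instruction: {hypothesis}\n{format_examples(examples)}\nInput: {query}\nOutput:"


# Toy arithmetic task (illustrative only): f(x) = 2x + 1.
toy_examples: List[Example] = [("1", "3"), ("2", "5"), ("4", "9")]
print(instruction_following_prompt("Multiply the input by 2 and add 1.", "7"))
print(few_shot_prompt(toy_examples, "7"))
print(hypothesis_prompt(toy_examples))
```

In the instruction inference setting, the text a model returns for `hypothesis_prompt` would be passed as `hypothesis` to `prediction_with_hypothesis_prompt`; comparing accuracy across the builders is one way to probe the dissociation the abstract describes.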
Related papers
- Reasoning with Large Language Models, a Survey [2.831296564800826]
This paper reviews the rapidly expanding field of prompt-based reasoning with LLMs.
Our taxonomy identifies different ways to generate, evaluate, and control multi-step reasoning.
We find that self-improvement, self-reflection, and some meta abilities of the reasoning processes are possible through the judicious use of prompts.
arXiv Detail & Related papers (2024-07-16T08:49:35Z)
- Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models [107.07851578154242]
Language models (LMs) have strong multi-step (i.e., procedural) reasoning capabilities.
It is unclear whether LMs perform tasks by cheating with answers memorized from the pretraining corpus or via a multi-step reasoning mechanism.
We show that MechanisticProbe is able to detect the information of the reasoning tree from the model's attentions for most examples.
arXiv Detail & Related papers (2023-10-23T01:47:29Z)
- Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement [92.61557711360652]
Language models (LMs) often fall short on inductive reasoning, despite achieving impressive success on research benchmarks.
We conduct a systematic study of the inductive reasoning capabilities of LMs through iterative hypothesis refinement.
We reveal several discrepancies between the inductive reasoning processes of LMs and humans, shedding light on both the potentials and limitations of using LMs in inductive reasoning tasks.
arXiv Detail & Related papers (2023-10-12T17:51:10Z)
- Large Language Models can Learn Rules [106.40747309894236]
We present Hypotheses-to-Theories (HtT), a framework that learns a rule library for reasoning with large language models (LLMs).
Experiments on relational reasoning, numerical reasoning and concept learning problems show that HtT improves existing prompting methods.
The learned rules are also transferable to different models and to different forms of the same problem.
arXiv Detail & Related papers (2023-10-10T23:07:01Z)
- LINGO: Visually Debiasing Natural Language Instructions to Support Task Diversity [11.44413929033824]
We develop LINGO, a novel visual analytics interface that supports an effective, task-driven workflow.
We conduct a user study with both novice and expert instruction creators, over a dataset of 1,616 linguistic tasks and their natural language instructions.
For both user groups, LINGO promotes the creation of more difficult tasks for pre-trained models that contain higher linguistic diversity and lower instruction bias.
arXiv Detail & Related papers (2023-04-12T22:55:52Z)
- Robustness of Learning from Task Instructions [15.462970803323563]
Traditional supervised learning mostly works on individual tasks and requires training on a large set of task-specific examples.
To build a system that can quickly and easily generalize to new tasks, task instructions have been adopted as an emerging trend of supervision.
This work investigates the system robustness when the instructions of new tasks are (i) manipulated, (ii) paraphrased, or (iii) from different levels of conciseness.
arXiv Detail & Related papers (2022-12-07T17:54:59Z)
- Complementary Explanations for Effective In-Context Learning [77.83124315634386]
Large language models (LLMs) have exhibited remarkable capabilities in learning from explanations in prompts.
This work aims to better understand the mechanisms by which explanations are used for in-context learning.
arXiv Detail & Related papers (2022-11-25T04:40:47Z)
- The Turking Test: Can Language Models Understand Instructions? [45.266428794559495]
We present the Turking Test, which examines a model's ability to follow natural language instructions of varying complexity.
Despite our lenient evaluation methodology, we observe that a large pretrained language model performs poorly across all tasks.
arXiv Detail & Related papers (2020-10-22T18:44:16Z)
- oLMpics -- On what Language Model Pre-training Captures [84.60594612120173]
We propose eight reasoning tasks, which require operations such as comparison, conjunction, and composition.
A fundamental challenge is to understand whether the performance of an LM on a task should be attributed to the pre-trained representations or to the process of fine-tuning on the task data.
arXiv Detail & Related papers (2019-12-31T12:11:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.