In-context Learning and Induction Heads
- URL: http://arxiv.org/abs/2209.11895v1
- Date: Sat, 24 Sep 2022 00:43:19 GMT
- Title: In-context Learning and Induction Heads
- Authors: Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova
DasSarma, Tom Henighan, Ben Mann, Amanda Askell, Yuntao Bai, Anna Chen, Tom
Conerly, Dawn Drain, Deep Ganguli, Zac Hatfield-Dodds, Danny Hernandez, Scott
Johnston, Andy Jones, Jackson Kernion, Liane Lovitt, Kamal Ndousse, Dario
Amodei, Tom Brown, Jack Clark, Jared Kaplan, Sam McCandlish, Chris Olah
- Abstract summary: "Induction heads" are attention heads that implement a simple algorithm to complete token sequences.
We find that induction heads develop at precisely the same point as a sudden sharp increase in in-context learning ability.
- Score: 5.123049926855312
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: "Induction heads" are attention heads that implement a simple algorithm to
complete token sequences like [A][B] ... [A] -> [B]. In this work, we present
preliminary and indirect evidence for a hypothesis that induction heads might
constitute the mechanism for the majority of all "in-context learning" in large
transformer models (i.e. decreasing loss at increasing token indices). We find
that induction heads develop at precisely the same point as a sudden sharp
increase in in-context learning ability, visible as a bump in the training
loss. We present six complementary lines of evidence, arguing that induction
heads may be the mechanistic source of general in-context learning in
transformer models of any size. For small attention-only models, we present
strong, causal evidence; for larger models with MLPs, we present correlational
evidence.
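The completion rule [A][B] ... [A] -> [B] can be illustrated with a toy sketch (plain Python standing in for what the paper says is implemented by attention heads; the function and its abstention behavior are illustrative, not the paper's mechanism):

```python
def induction_complete(tokens):
    """Toy induction-head rule: if the final token [A] appeared earlier in the
    sequence, predict the token [B] that followed its most recent occurrence."""
    last = tokens[-1]
    # scan earlier positions right-to-left, skipping the final token itself
    for i in range(len(tokens) - 2, 0, -1):
        if tokens[i - 1] == last:
            return tokens[i]
    return None  # no earlier occurrence of [A]: the rule abstains

# [A][B] ... [A] -> [B]
print(induction_complete(["the", "cat", "sat", "on", "the"]))  # prints: cat
```

An attention head implementing this rule attends from the current [A] to the token after a previous [A] and copies it forward; the paper's hypothesis is that this match-and-copy primitive, generalized beyond exact repeats, accounts for much of in-context learning.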
Related papers
- From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers [67.02076505996284]
We study how the choice of pretraining data distribution steers a shallow transformer toward one behavior or the other.
Our results shed light on the algorithmic biases of pretrained transformers and offer conceptual guidelines for data-driven control of their learned behaviors.
arXiv Detail & Related papers (2025-12-21T08:10:26Z)
- In-Context Learning Without Copying [31.718993147344353]
We study whether transformers can still acquire in-context learning capabilities when inductive copying is suppressed.
We propose Hapax, a setting where we omit the loss contribution of any token that can be correctly predicted by induction heads.
Mechanistic analysis shows that models trained with Hapax develop fewer and weaker induction heads but still preserve ICL capabilities.
arXiv Detail & Related papers (2025-11-07T22:11:11Z)
- On the Emergence of Induction Heads for In-Context Learning [121.64612469118464]
We study the emergence of induction heads, a previously identified mechanism in two-layer transformers.
We explain the origin of this structure using a minimal ICL task formulation and a modified transformer architecture.
arXiv Detail & Related papers (2025-11-02T18:12:06Z)
- Induction Head Toxicity Mechanistically Explains Repetition Curse in Large Language Models [24.666925550391024]
We identify induction heads as a key driver of the repetition curse.
We propose a technique with attention head regularization that could be employed to reduce the dominance of induction heads during generation.
arXiv Detail & Related papers (2025-05-17T03:09:33Z)
- The Dual-Route Model of Induction [19.752542337008773]
We introduce concept-level induction heads, which copy entire lexical units instead of individual tokens.
We show that concept induction heads are responsible for semantic tasks like word-level translation, whereas token induction heads are vital for tasks that can only be done verbatim.
arXiv Detail & Related papers (2025-04-03T20:40:31Z)
- In-Context Linear Regression Demystified: Training Dynamics and Mechanistic Interpretability of Multi-Head Softmax Attention [52.159541540613915]
We study how multi-head softmax attention models are trained to perform in-context learning on linear data.
Our results reveal that in-context learning ability emerges from the trained transformer as an aggregated effect of its architecture and the underlying data distribution.
arXiv Detail & Related papers (2025-03-17T02:00:49Z)
- Rethinking Associative Memory Mechanism in Induction Head [37.93644115914534]
This paper investigates how a two-layer transformer captures in-context information and balances it with pretrained bigram knowledge in next-token prediction.
We theoretically analyze the representation of weight matrices in attention layers and the resulting logits when a transformer is given prompts generated by a bigram model.
arXiv Detail & Related papers (2024-12-16T05:33:05Z)
- KV Shifting Attention Enhances Language Modeling [10.265219156828907]
Current large language models are mainly based on decoder-only transformers, which have strong in-context learning capabilities.
We propose KV shifting attention to implement the model's induction ability more efficiently.
Our experimental results demonstrate that KV shifting attention is beneficial to learning induction heads and language modeling.
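The blurb names the mechanism without spelling it out; below is one plausible toy reading, hedged heavily: keys and values are mixed with their predecessors by scalar weights (fixed at 0.5 here, whereas the paper learns its mixing), and the shapes and function names are illustrative assumptions rather than the paper's formulation.

```python
import math

def shift_mix(rows, w_cur=0.5, w_prev=0.5):
    """Mix each vector with its predecessor: out[i] = w_cur*rows[i] + w_prev*rows[i-1]."""
    prev = [0.0] * len(rows[0])
    mixed = []
    for row in rows:
        mixed.append([w_cur * a + w_prev * b for a, b in zip(row, prev)])
        prev = row
    return mixed

def kv_shifting_attention(Q, K, V):
    """Causal attention over shifted keys/values: a query that matches token [A]
    can retrieve content tied to the token *after* a previous [A], making the
    induction pattern easier to express."""
    d = len(Q[0])
    Ks, Vs = shift_mix(K), shift_mix(V)
    out = []
    for t, q in enumerate(Q):
        scores = [sum(a * b for a, b in zip(q, Ks[i])) / math.sqrt(d)
                  for i in range(t + 1)]          # causal: positions <= t only
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
        z = sum(exps)
        out.append([sum(e / z * Vs[i][j] for i, e in enumerate(exps))
                    for j in range(d)])
    return out
```

Note that position 0 can attend only to itself, and its shifted value is just half of V[0] under the fixed 0.5/0.5 mixing.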
arXiv Detail & Related papers (2024-11-29T09:42:38Z)
- Interpretable Language Modeling via Induction-head Ngram Models [74.26720927767398]
We propose Induction-head ngram models (Induction-Gram) to bolster modern ngram models with a hand-engineered "induction head".
This induction head uses a custom neural similarity metric to efficiently search the model's input context for potential next-word completions.
Experiments show that this simple method significantly improves next-word prediction over baseline interpretable models.
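The search-and-propose step can be sketched roughly as follows (the actual method uses a learned neural similarity metric; the character-level `SequenceMatcher` ratio and `n=2` below are stand-in assumptions):

```python
from difflib import SequenceMatcher

def induction_gram_next(context, n=2):
    """Propose a next word by finding the past n-gram most similar to the
    current suffix and returning the token that followed it."""
    suffix = " ".join(context[-n:])
    best_token, best_sim = None, 0.0
    for i in range(len(context) - n):  # every n-gram with a known next token
        sim = SequenceMatcher(None, " ".join(context[i:i + n]), suffix).ratio()
        if sim > best_sim:
            best_token, best_sim = context[i + n], sim
    return best_token

tokens = "the cat sat on the mat because the cat".split()
print(induction_gram_next(tokens))  # prints: sat
```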
arXiv Detail & Related papers (2024-10-31T12:33:26Z)
- Toward Understanding In-context vs. In-weight Learning [50.24035812301655]
We identify simplified distributional properties that give rise to the emergence and disappearance of in-context learning.
We then extend the study to a full large language model, showing how fine-tuning on various collections of natural language prompts can elicit similar in-context and in-weight learning behaviour.
arXiv Detail & Related papers (2024-10-30T14:09:00Z)
- On the Inductive Bias of Stacking Towards Improving Reasoning [50.225873619537765]
We propose a variant of gradual stacking called MIDAS that can speed up language model training by up to 40%.
MIDAS is not only training-efficient but surprisingly also has an inductive bias towards improving downstream tasks.
We conjecture the underlying reason for this inductive bias by exploring the connection of stacking to looped models.
arXiv Detail & Related papers (2024-09-27T17:58:21Z)
- Graph Stochastic Neural Process for Inductive Few-shot Knowledge Graph Completion [63.68647582680998]
We focus on a task called inductive few-shot knowledge graph completion (I-FKGC).
Inspired by the idea of inductive reasoning, we cast I-FKGC as an inductive reasoning problem.
We present a neural process-based hypothesis extractor that models the joint distribution of hypotheses, from which we can sample a hypothesis for prediction.
In the second module, based on the hypothesis, we propose a graph attention-based predictor to test if the triple in the query set aligns with the extracted hypothesis.
arXiv Detail & Related papers (2024-08-03T13:37:40Z)
- Linking In-context Learning in Transformers to Human Episodic Memory [1.124958340749622]
We focus on induction heads, which contribute to in-context learning in Transformer-based large language models.
We demonstrate that induction heads are behaviorally, functionally, and mechanistically similar to the contextual maintenance and retrieval model of human episodic memory.
arXiv Detail & Related papers (2024-05-23T18:51:47Z)
- The twin peaks of learning neural networks [3.382017614888546]
Recent works demonstrated the existence of a double-descent phenomenon for the generalization error of neural networks.
We explore a link between this phenomenon and the increase of complexity and sensitivity of the function represented by neural networks.
arXiv Detail & Related papers (2024-01-23T10:09:14Z)
- Towards Few-shot Inductive Link Prediction on Knowledge Graphs: A Relational Anonymous Walk-guided Neural Process Approach [49.00753238429618]
Few-shot inductive link prediction on knowledge graphs aims to predict missing links for unseen entities with few-shot links observed.
Recent inductive methods utilize the sub-graphs around unseen entities to obtain the semantics and predict links inductively.
We propose a novel relational anonymous walk-guided neural process for few-shot inductive link prediction on knowledge graphs, denoted as RawNP.
arXiv Detail & Related papers (2023-06-26T12:02:32Z)
- Beyond Transformers for Function Learning [0.6768558752130311]
The ability to learn and predict simple functions is a key aspect of human intelligence.
Recent works have started to explore this ability using transformer architectures.
We propose to address this gap by augmenting the transformer architecture with two simple inductive learning biases.
arXiv Detail & Related papers (2023-04-19T21:33:06Z)
- Interpretability in the Wild: a Circuit for Indirect Object Identification in GPT-2 small [68.879023473838]
We present an explanation for how GPT-2 small performs a natural language task called indirect object identification (IOI).
To our knowledge, this investigation is the largest end-to-end attempt at reverse-engineering a natural behavior "in the wild" in a language model.
arXiv Detail & Related papers (2022-11-01T17:08:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.