Emergent Analogical Reasoning in Transformers
- URL: http://arxiv.org/abs/2602.01992v3
- Date: Tue, 10 Feb 2026 06:54:46 GMT
- Title: Emergent Analogical Reasoning in Transformers
- Authors: Gouki Minegishi, Jingyuan Feng, Hiroki Furuta, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo
- Abstract summary: Despite its central role in cognition, the mechanisms by which Transformers acquire and implement analogical reasoning remain poorly understood. We formalize analogical reasoning as the inference of correspondences between entities across categories. We show that analogical reasoning in Transformers decomposes into two key components: geometric alignment of relational structure in the embedding space, and the application of a functor within the Transformer.
- Score: 46.10175943435167
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Analogy is a central faculty of human intelligence, enabling abstract patterns discovered in one domain to be applied to another. Despite its central role in cognition, the mechanisms by which Transformers acquire and implement analogical reasoning remain poorly understood. In this work, inspired by the notion of functors in category theory, we formalize analogical reasoning as the inference of correspondences between entities across categories. Based on this formulation, we introduce synthetic tasks that evaluate the emergence of analogical reasoning under controlled settings. We find that the emergence of analogical reasoning is highly sensitive to data characteristics, optimization choices, and model scale. Through mechanistic analysis, we show that analogical reasoning in Transformers decomposes into two key components: (1) geometric alignment of relational structure in the embedding space, and (2) the application of a functor within the Transformer. These mechanisms enable models to transfer relational structure from one category to another, realizing analogy. Finally, we quantify these effects and find that the same trends are observed in pretrained LLMs. In doing so, we move analogy from an abstract cognitive notion to a concrete, mechanistically grounded phenomenon in modern neural networks.
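The abstract's first component, geometric alignment of relational structure, can be illustrated with the classic vector-offset view of analogy: if two categories share a parallel relational structure in embedding space, the relation learned in one category transfers to the other as a simple translation. The sketch below is illustrative only (toy hand-built vectors, not the paper's tasks or models):

```python
import numpy as np

# Toy embeddings with parallel relational structure across two "categories".
# The vectors are hand-picked for illustration, not taken from the paper.
emb = {
    # category A: royalty
    "king":  np.array([1.0, 1.0, 0.0]),
    "queen": np.array([1.0, 0.0, 0.0]),
    # category B: ordinary people (same "gender" offset as category A)
    "man":   np.array([0.0, 1.0, 0.1]),
    "woman": np.array([0.0, 0.0, 0.1]),
}

def solve_analogy(a, b, c, vocab):
    """Return the word d best satisfying a : b :: c : d by applying
    the relational offset b - a at c (geometric alignment)."""
    target = emb[c] + (emb[b] - emb[a])
    candidates = [w for w in vocab if w not in (a, b, c)]

    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

    # pick the candidate closest (in cosine similarity) to the target point
    return max(candidates, key=lambda w: cos(emb[w], target))

print(solve_analogy("king", "queen", "man", emb))  # -> woman
```

Because the king→queen offset and the man→woman offset coincide, the relation transfers across categories by pure translation; the paper's second component, applying a functor inside the Transformer, is what this purely geometric sketch does not capture.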
Related papers
- Feature Resemblance: On the Theoretical Understanding of Analogical Reasoning in Transformers [32.0329343786554]
We isolate analogical reasoning (inferring shared properties between entities based on known similarities) and analyze its emergence in transformers. Joint training on similarity and attribution premises enables analogical reasoning through aligned representations. Experiments with architectures up to 1.5B parameters validate our theory and demonstrate how representational geometry shapes inductive reasoning capabilities.
arXiv Detail & Related papers (2026-03-05T13:12:46Z)
- Generalizing Analogical Inference from Boolean to Continuous Domains [19.380448973444633]
Analogical reasoning is a powerful inductive mechanism, widely used in human cognition and increasingly applied in artificial intelligence. We introduce a unified framework for analogical reasoning in real-valued domains based on parameterized analogies defined via generalized means. Our results offer a general theory of analogical inference across discrete and continuous domains.
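The generalized (power) mean underlying this framework interpolates between familiar means as its parameter varies. As background only (the parameterized-analogy construction itself is not reproduced here), a minimal sketch:

```python
import math

def generalized_mean(xs, p):
    """Power (generalized) mean M_p of positive values xs.
    p = 1 -> arithmetic mean, p -> 0 -> geometric mean (limit case),
    p = -1 -> harmonic mean; p -> +/-inf approach max/min."""
    if p == 0:
        # limit as p -> 0: the geometric mean
        return math.exp(sum(math.log(x) for x in xs) / len(xs))
    return (sum(x ** p for x in xs) / len(xs)) ** (1.0 / p)

print(generalized_mean([1.0, 4.0], 1))   # arithmetic mean -> 2.5
print(generalized_mean([1.0, 4.0], 0))   # geometric mean  -> 2.0
print(generalized_mean([1.0, 4.0], -1))  # harmonic mean   -> 1.6
```

Varying a single parameter thus yields a continuum of aggregation behaviors, which is what makes a parameterized family of analogies expressible in real-valued domains.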
arXiv Detail & Related papers (2025-11-13T15:37:18Z)
- Selective Induction Heads: How Transformers Select Causal Structures In Context [50.09964990342878]
We introduce a novel framework that showcases transformers' ability to handle causal structures. Our framework varies the causal structure through interleaved Markov chains with different lags while keeping the transition probabilities fixed. This setting unveils the formation of Selective Induction Heads, a new circuit that endows transformers with the ability to select the correct causal structure in-context.
arXiv Detail & Related papers (2025-09-09T23:13:41Z)
- How do Transformers Learn Implicit Reasoning? [67.02072851088637]
We study how implicit multi-hop reasoning emerges by training transformers from scratch in a controlled symbolic environment. We find that training with atomic triples is not necessary but accelerates learning, and that second-hop generalization relies on query-level exposure to specific compositional structures.
arXiv Detail & Related papers (2025-05-29T17:02:49Z)
- Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures [49.24097977047392]
We investigate two mainstream architectures for language modeling, namely Transformers and Mambas, to explore the extent of their mechanistic similarity.
We propose to use Sparse Autoencoders (SAEs) to isolate interpretable features from these models and show that most features are similar in these two models.
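The SAE approach mentioned here learns an overcomplete, sparse feature basis for model activations. A minimal forward-pass sketch, with all dimensions and weights illustrative (randomly initialized, untrained; not this paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: d_model activation size, overcomplete feature basis.
d_model, d_feat = 8, 32
W_enc = rng.normal(scale=0.1, size=(d_feat, d_model))
b_enc = np.zeros(d_feat)
W_dec = rng.normal(scale=0.1, size=(d_model, d_feat))
b_dec = np.zeros(d_model)

def sae_forward(x):
    """One SAE pass: sparse features f = ReLU(W_enc x + b_enc),
    reconstruction x_hat = W_dec f + b_dec."""
    f = np.maximum(0.0, W_enc @ x + b_enc)  # ReLU keeps features non-negative
    x_hat = W_dec @ f + b_dec
    return f, x_hat

def sae_loss(x, l1_coeff=1e-3):
    """Reconstruction error plus an L1 penalty encouraging sparse features."""
    f, x_hat = sae_forward(x)
    return float(np.sum((x - x_hat) ** 2) + l1_coeff * np.sum(np.abs(f)))

x = rng.normal(size=d_model)
f, x_hat = sae_forward(x)
print(f.shape, x_hat.shape)  # (32,) (8,)
```

After training with this loss, individual coordinates of f tend to become interpretable features, which is what makes cross-architecture comparisons like the Transformer-vs-Mamba study possible.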
arXiv Detail & Related papers (2024-10-09T08:28:53Z)
- Understanding the Language Model to Solve the Symbolic Multi-Step Reasoning Problem from the Perspective of Buffer Mechanism [68.05754701230039]
We construct a symbolic multi-step reasoning task to investigate the information propagation mechanisms in Transformer models. We propose a random matrix-based algorithm to enhance the model's reasoning ability.
arXiv Detail & Related papers (2024-05-24T07:41:26Z)
- Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z)
- A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task [14.921790126851008]
We present a comprehensive mechanistic analysis of a transformer trained on a synthetic reasoning task.
We identify a set of interpretable mechanisms the model uses to solve the task, and validate our findings using correlational and causal evidence.
arXiv Detail & Related papers (2024-02-19T08:04:25Z)
- Beneath Surface Similarity: Large Language Models Make Reasonable Scientific Analogies after Structure Abduction [46.2032673640788]
The vital role of analogical reasoning in human cognition allows us to grasp novel concepts by linking them with familiar ones through shared relational structures.
This work suggests that Large Language Models (LLMs) often overlook the structures that underpin these analogies.
This paper introduces a task of analogical structure abduction, grounded in cognitive psychology, designed to abduce structures that form an analogy between two systems.
arXiv Detail & Related papers (2023-05-22T03:04:06Z)
- Learning to See Analogies: A Connectionist Exploration [0.0]
This dissertation explores the integration of learning and analogy-making through the development of a computer program, called Analogator.
By "seeing" many different analogy problems, along with possible solutions, Analogator gradually develops an ability to make new analogies.
arXiv Detail & Related papers (2020-01-18T14:06:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.