Feature Resemblance: On the Theoretical Understanding of Analogical Reasoning in Transformers
- URL: http://arxiv.org/abs/2603.05143v1
- Date: Thu, 05 Mar 2026 13:12:46 GMT
- Title: Feature Resemblance: On the Theoretical Understanding of Analogical Reasoning in Transformers
- Authors: Ruichen Xu, Wenjing Yan, Ying-Jun Angela Zhang
- Abstract summary: We isolate analogical reasoning (inferring shared properties between entities based on known similarities) and analyze its emergence in transformers. Joint training on similarity and attribution premises enables analogical reasoning through aligned representations. Experiments with architectures up to 1.5B parameters validate our theory and demonstrate how representational geometry shapes inductive reasoning capabilities.
- Score: 32.0329343786554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding reasoning in large language models is complicated by evaluations that conflate multiple reasoning types. We isolate analogical reasoning (inferring shared properties between entities based on known similarities) and analyze its emergence in transformers. We theoretically prove three key results: (1) Joint training on similarity and attribution premises enables analogical reasoning through aligned representations; (2) Sequential training succeeds only when similarity structure is learned before specific attributes, revealing a necessary curriculum; (3) Two-hop reasoning ($a \to b, b \to c \implies a \to c$) reduces to analogical reasoning with identity bridges ($b = b$), which must appear explicitly in training data. These results reveal a unified mechanism: transformers encode entities with similar properties into similar representations, enabling property transfer through feature alignment. Experiments with architectures up to 1.5B parameters validate our theory and demonstrate how representational geometry shapes inductive reasoning capabilities.
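The abstract's third result states that two-hop reasoning ($a \to b, b \to c \implies a \to c$) reduces to analogical reasoning only when identity bridges ($b = b$) appear explicitly in the training data. A minimal sketch of how such a synthetic premise corpus could be constructed (this is illustrative only, not the authors' code; the premise formats and helper names are hypothetical):

```python
# Illustrative sketch of the three premise types the abstract describes.
# The string formats and function names are assumptions for illustration,
# not the paper's actual data format.

def similarity_premise(a, b):
    # "a is similar to b" -- the kind of premise that, per result (1),
    # drives aligned representations when trained jointly with attributes
    return f"{a} ~ {b}"

def attribution_premise(entity, prop):
    # "entity has property prop"
    return f"{entity} has {prop}"

def identity_bridge(b):
    # An explicit "b = b" premise. Per result (3), two-hop reasoning
    # reduces to analogical reasoning only when these bridges are
    # present explicitly in the training data.
    return similarity_premise(b, b)

# A toy corpus: known similarity, known attribute, and the bridge.
corpus = [
    similarity_premise("a", "b"),   # a ~ b
    attribution_premise("b", "p"),  # b has p
    identity_bridge("b"),           # b ~ b
]

# The analogical query a trained model should then answer:
# since a ~ b and b has p, infer that a has p.
query = "a has ?"
```

The point of the sketch is the data-level distinction: similarity and attribution premises are separate statement types, and the identity bridge is itself just a degenerate similarity premise, which is what makes the reduction of two-hop reasoning to analogical reasoning possible.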
Related papers
- Emergent Analogical Reasoning in Transformers [46.10175943435167]
Despite its central role in cognition, the mechanisms by which Transformers acquire and implement analogical reasoning remain poorly understood. We formalize analogical reasoning as the inference of correspondences between entities across categories. We show that analogical reasoning in Transformers decomposes into two key components: geometric alignment of relational structure in the embedding space, and the application of a functor within the Transformer.
arXiv Detail & Related papers (2026-02-02T11:49:36Z) - Bridging Functional and Representational Similarity via Usable Information [3.9189279162842854]
We present a unified framework for quantifying the similarity between representations through the lens of usable information. First, addressing functional similarity, we establish a formal link between stitching performance and conditional mutual information. Second, concerning representational similarity, we prove that reconstruction-based metrics and standard tools act as estimators of usable information under specific constraints.
arXiv Detail & Related papers (2026-01-29T11:30:55Z) - How do Transformers Learn Implicit Reasoning? [67.02072851088637]
We study how implicit multi-hop reasoning emerges by training transformers from scratch in a controlled symbolic environment. We find that training with atomic triples is not necessary but accelerates learning, and that second-hop generalization relies on query-level exposure to specific compositional structures.
arXiv Detail & Related papers (2025-05-29T17:02:49Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - How Do Transformers Learn Topic Structure: Towards a Mechanistic
Understanding [56.222097640468306]
We provide mechanistic understanding of how transformers learn "semantic structure"
We show, through a combination of mathematical analysis and experiments on Wikipedia data, that the embedding layer and the self-attention layer encode the topical structure.
arXiv Detail & Related papers (2023-03-07T21:42:17Z) - Generalization-based similarity [0.0]
We develop an abstract notion of similarity based on the observation that sets of generalizations encode important properties of elements. We show that similarity defined in this way has appealing mathematical properties. We sketch some potential applications to theoretical computer science and artificial intelligence.
arXiv Detail & Related papers (2023-02-13T14:48:59Z) - MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure [129.8481568648651]
We propose a benchmark to investigate models' logical reasoning capabilities in complex real-life scenarios.
Based on the multi-hop chain of reasoning, the explanation form includes three main components.
We evaluate the current best models' performance on this new explanation form.
arXiv Detail & Related papers (2022-10-22T16:01:13Z) - A Description Logic for Analogical Reasoning [28.259681405091666]
We present a mechanism to infer plausible missing knowledge, which relies on reasoning by analogy.
This is the first paper that studies analogical reasoning within the setting of description logic.
arXiv Detail & Related papers (2021-05-10T19:06:07Z) - Few-shot Visual Reasoning with Meta-analogical Contrastive Learning [141.2562447971]
We propose to solve a few-shot (or low-shot) visual reasoning problem, by resorting to analogical reasoning.
We extract structural relationships between elements in both domains, and enforce them to be as similar as possible with analogical learning.
We validate our method on RAVEN dataset, on which it outperforms state-of-the-art method, with larger gains when the training data is scarce.
arXiv Detail & Related papers (2020-07-23T14:00:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.