Testing Transformer Learnability on the Arithmetic Sequence of Rooted Trees
- URL: http://arxiv.org/abs/2512.01870v1
- Date: Mon, 01 Dec 2025 16:51:38 GMT
- Title: Testing Transformer Learnability on the Arithmetic Sequence of Rooted Trees
- Authors: Alessandro Breccia, Federica Gerace, Marco Lippi, Gabriele Sicuro, Pierluigi Contucci
- Abstract summary: We study whether a Large Language Model can learn the deterministic sequence of trees generated by the iterated prime factorization of the natural numbers. Our results show that the model partially learns the internal grammar of $\mathbb{N}\mathcal{T}$, capturing non-trivial regularities and correlations.
- Score: 41.17969667763904
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study whether a Large Language Model can learn the deterministic sequence of trees generated by the iterated prime factorization of the natural numbers. Each integer is mapped into a rooted planar tree, and the resulting sequence $\mathbb{N}\mathcal{T}$ defines an arithmetic text with measurable statistical structure. A transformer network (the GPT-2 architecture) is trained from scratch on the first $10^{11}$ elements and subsequently tested on next-word and masked-word prediction tasks. Our results show that the model partially learns the internal grammar of $\mathbb{N}\mathcal{T}$, capturing non-trivial regularities and correlations. This suggests that learnability may extend beyond empirical data to the very structure of arithmetic.
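As an illustration of the integer-to-tree mapping the abstract describes, here is a minimal Python sketch. It assumes a Matula-Goebel-style recursion (the paper's exact planar convention may differ): the tree of 1 is a bare node, and for $n > 1$ the root gets one child per prime factor (with multiplicity), each child being the tree of that prime's index. The `sympy` helpers and the bracket serialization are illustrative choices, not taken from the paper.

```python
# Hedged sketch of mapping integers to rooted trees by iterated prime
# factorization (Matula-Goebel-style recursion; the exact planar convention
# used in the paper is an assumption here, not confirmed by the abstract).
from sympy import factorint, primepi  # prime factorization and prime-counting

def tree(n: int) -> tuple:
    """Return the rooted tree of n as nested tuples of children."""
    if n == 1:
        return ()  # a single node with no children
    children = []
    for p, mult in sorted(factorint(n).items()):
        # each occurrence of the prime p contributes the tree of its index pi(p)
        children.extend(tree(int(primepi(p))) for _ in range(mult))
    return tuple(children)

def serialize(t: tuple) -> str:
    """Bracket encoding, i.e. one way to turn a tree into model-readable tokens."""
    return "(" + "".join(serialize(c) for c in t) + ")"

if __name__ == "__main__":
    # first few elements of the tree sequence under this convention
    for n in range(1, 13):
        print(n, serialize(tree(n)))
```

Under this convention, 2 maps to "(())", 4 to "(()())", and 6 to "(()(()))"; concatenating such bracket strings over $n = 1, 2, 3, \dots$ is one way to realize an arithmetic text like $\mathbb{N}\mathcal{T}$.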
Related papers
- Beyond Softmax: A Natural Parameterization for Categorical Random Variables [61.709831225296305]
We introduce the $\textit{catnat}$ function, a function composed of a sequence of hierarchical binary splits. A rich set of experiments shows that the proposed function improves learning efficiency and yields models with consistently higher test performance.
arXiv Detail & Related papers (2025-09-29T12:55:50Z) - Primender Sequence: A Novel Mathematical Construct for Testing Symbolic Inference and AI Reasoning [0.0]
The Primender sequence is a novel integer sequence that combines classical primality with modular digit-based conditions. We propose the sequence as a benchmark for evaluating the symbolic reasoning capabilities of Large Language Models.
arXiv Detail & Related papers (2025-06-12T11:21:58Z) - Prime Convolutional Model: Breaking the Ground for Theoretical Explainability [45.07003937279752]
We propose a new theoretical approach to Explainable AI. We apply the method to a case study created in a controlled environment. We show that the different behaviors of p-Conv can be modeled mathematically in terms of $m$ and $B$.
arXiv Detail & Related papers (2025-03-04T16:42:46Z) - (How) Can Transformers Predict Pseudo-Random Numbers? [7.201095605457193]
We study the ability of Transformers to learn pseudo-random number sequences from linear congruential generators (LCGs). We find that Transformers can perform in-context prediction of LCG sequences with unseen moduli ($m$) and parameters ($a, c$). We also show that Transformers can generalize to unseen moduli up to $m_{\text{test}} = 2^{16}$ (a minimal LCG sketch appears after this list).
arXiv Detail & Related papers (2025-02-14T18:59:40Z) - TAMER: Tree-Aware Transformer for Handwritten Mathematical Expression Recognition [17.855238221599635]
We propose a novel model named TAMER (Tree-Aware Transformer) for handwritten mathematical expression recognition. TAMER combines the advantages of both sequence decoding and tree decoding models by jointly optimizing sequence prediction and tree structure prediction tasks. Experimental results on CROHME datasets demonstrate that TAMER outperforms traditional sequence decoding models.
arXiv Detail & Related papers (2024-08-16T07:24:19Z) - Tree-Based Representation and Generation of Natural and Mathematical Language [77.34726150561087]
Mathematical language in scientific communications and educational scenarios is important yet relatively understudied.
Recent works on mathematical language focus either on representing stand-alone mathematical expressions or on mathematical reasoning in pre-trained natural language models.
We propose a series of modifications to existing language models to jointly represent and generate text and math.
arXiv Detail & Related papers (2023-02-15T22:38:34Z) - Characterizing Intrinsic Compositionality in Transformers with Tree Projections [72.45375959893218]
Neural models like transformers can route information arbitrarily between different parts of their input.
We show that transformers for three different tasks become more treelike over the course of training.
These trees are predictive of model behavior, with more tree-like models generalizing better on tests of compositional generalization.
arXiv Detail & Related papers (2022-11-02T17:10:07Z) - Inducing Transformer's Compositional Generalization Ability via Auxiliary Sequence Prediction Tasks [86.10875837475783]
Systematic compositionality is an essential mechanism in human language, allowing the recombination of known parts to create novel expressions.
Existing neural models have been shown to lack this basic ability in learning symbolic structures.
We propose two auxiliary sequence prediction tasks that track the progress of function and argument semantics.
arXiv Detail & Related papers (2021-09-30T16:41:19Z) - Recursive Top-Down Production for Sentence Generation with Latent Trees [77.56794870399288]
We model the production property of context-free grammars for natural and synthetic languages.
We present a dynamic programming algorithm that marginalises over latent binary tree structures with $N$ leaves.
We also present experimental results on German-English translation on the Multi30k dataset.
arXiv Detail & Related papers (2020-10-09T17:47:16Z)
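The LCG recurrence referenced in the entry on "(How) Can Transformers Predict Pseudo-Random Numbers?" is the standard $x_{n+1} = (a \, x_n + c) \bmod m$. Below is a minimal sketch of generating such sequences for in-context prediction; the specific parameter values are illustrative, not those used in that paper.

```python
# Hedged sketch: generate linear congruential generator (LCG) sequences of the
# kind a Transformer could be asked to continue in-context.
# Parameter choices below are illustrative only.
def lcg_sequence(m: int, a: int, c: int, x0: int, length: int) -> list[int]:
    """Return [x0, x1, ..., x_{length-1}] with x_{n+1} = (a*x_n + c) mod m."""
    xs = [x0 % m]
    for _ in range(length - 1):
        xs.append((a * xs[-1] + c) % m)
    return xs

if __name__ == "__main__":
    # e.g. a test modulus at the 2**16 scale mentioned in that entry
    print(lcg_sequence(m=2**16, a=75, c=74, x0=1, length=8))
```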
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.