Learning with the $p$-adics
- URL: http://arxiv.org/abs/2512.22692v1
- Date: Sat, 27 Dec 2025 19:40:42 GMT
- Title: Learning with the $p$-adics
- Authors: André F. T. Martins
- Abstract summary: We study the suitability of a radically different field as an alternative to $\mathbb{R}$ -- the ultrametric and non-archimedean space of $p$-adic numbers, $\mathbb{Q}_p$. The hierarchical structure of the $p$-adics and their interpretation as infinite strings make them an appealing tool for code theory and hierarchical representation learning.
- Score: 26.431600220740354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing machine learning frameworks operate over the field of real numbers ($\mathbb{R}$) and learn representations in real (Euclidean or Hilbert) vector spaces (e.g., $\mathbb{R}^d$). Their underlying geometric properties align well with intuitive concepts such as linear separability, minimum enclosing balls, and subspace projection; and basic calculus provides a toolbox for learning through gradient-based optimization. But is this the only possible choice? In this paper, we study the suitability of a radically different field as an alternative to $\mathbb{R}$ -- the ultrametric and non-archimedean space of $p$-adic numbers, $\mathbb{Q}_p$. The hierarchical structure of the $p$-adics and their interpretation as infinite strings make them an appealing tool for code theory and hierarchical representation learning. Our exploratory theoretical work establishes the building blocks for classification, regression, and representation learning with the $p$-adics, providing learning models and algorithms. We illustrate how simple Quillian semantic networks can be represented as a compact $p$-adic linear network, a construction which is not possible with the field of reals. We finish by discussing open problems and opportunities for future research enabled by this new framework.
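For readers new to $\mathbb{Q}_p$, the sketch below (an illustration, not code from the paper) implements the $p$-adic valuation and absolute value on the rationals and checks the ultrametric (strong triangle) inequality $|x+y|_p \le \max(|x|_p, |y|_p)$, the property behind the hierarchical, tree-like geometry the abstract refers to.

```python
from fractions import Fraction

def v_p(x: Fraction, p: int) -> float:
    """p-adic valuation: the net power of p dividing x (infinity for 0)."""
    if x == 0:
        return float("inf")
    num, den, v = x.numerator, x.denominator, 0
    while num % p == 0:   # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:   # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

def abs_p(x: Fraction, p: int) -> float:
    """p-adic absolute value |x|_p = p^(-v_p(x))."""
    v = v_p(x, p)
    return 0.0 if v == float("inf") else p ** (-v)

p = 3
x, y = Fraction(9), Fraction(1, 3)                        # |9|_3 = 1/9, |1/3|_3 = 3
assert abs_p(x + y, p) <= max(abs_p(x, p), abs_p(y, p))   # ultrametric inequality
print(abs_p(x, p), abs_p(y, p), abs_p(x + y, p))          # 0.111... 3.0 3.0
```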
Related papers
- Neural Networks Learn Generic Multi-Index Models Near Information-Theoretic Limit [66.20349460098275]
We study the gradient descent learning of a general Gaussian multi-index model $f(\boldsymbol{x}) = g(\boldsymbol{U}\boldsymbol{x})$ with hidden subspace $\boldsymbol{U} \in \mathbb{R}^{r \times d}$.
We prove that under generic non-degenerate assumptions on the link function, a standard two-layer neural network trained via layer-wise gradient descent can agnostically learn the target with $o_d(1)$ test error.
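As a concrete reading of this target class, here is a hedged sketch (the dimensions and link function $g$ below are arbitrary illustrative choices, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, n = 50, 3, 1000
U = np.linalg.qr(rng.standard_normal((d, r)))[0].T   # hidden subspace, U in R^{r x d}

def g(z):
    """An arbitrary smooth link function of the r-dimensional projection."""
    return np.tanh(z[:, 0]) + z[:, 1] * z[:, 2]

X = rng.standard_normal((n, d))   # Gaussian inputs
y = g(X @ U.T)                    # labels f(x) = g(Ux) depend only on the hidden subspace
```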
arXiv Detail & Related papers (2025-11-19T04:46:47Z)
- v-PuNNs: van der Put Neural Networks for Transparent Ultrametric Representation Learning [0.0]
We introduce van der Put Neural Networks (v-PuNNs), the first architecture whose neurons are characteristic functions of $p$-adic balls in $\mathbb{Z}_p$.
Under our Transparent Ultrametric Representation Learning (TURL) principle, every weight is itself a $p$-adic number, giving exact subtree semantics.
v-PuNNs therefore bridge number theory and deep learning, offering exact, interpretable, and efficient models for hierarchical data.
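To make the construction concrete, here is a minimal illustration (not the authors' code): a "neuron" as the characteristic function of a $p$-adic ball, with ordinary integers standing in for elements of $\mathbb{Z}_p$. The ball $\{x : |x - c|_p \le p^{-k}\}$ is exactly a residue class mod $p^k$, i.e., a subtree of the $p$-ary digit tree, which is the "exact subtree semantics" the summary mentions.

```python
def ball_indicator(c: int, k: int, p: int):
    """Characteristic function of the p-adic ball {x : x ≡ c (mod p^k)}."""
    modulus = p ** k
    return lambda x: 1 if (x - c) % modulus == 0 else 0

p = 2
neuron = ball_indicator(c=3, k=2, p=p)    # the ball x ≡ 3 (mod 4) in Z_2
print([neuron(x) for x in range(8)])      # [0, 0, 0, 1, 0, 0, 0, 1]
```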
arXiv Detail & Related papers (2025-08-01T18:23:38Z)
- Alpay Algebra: A Universal Structural Foundation [0.0]
Alpay Algebra is introduced as a universal, category-theoretic framework.
It unifies classical algebraic structures with modern needs in symbolic recursion and explainable AI.
arXiv Detail & Related papers (2025-05-21T10:18:49Z)
- $p$-Adic Polynomial Regression as Alternative to Neural Network for Approximating $p$-Adic Functions of Many Variables [55.2480439325792]
A regression model is constructed that approximates continuous functions to any degree of accuracy.
The proposed model can be considered a simple alternative to possible $p$-adic models based on neural-network architectures.
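The summary gives no model details, but the classical route to polynomial approximation of continuous functions on $\mathbb{Z}_p$ is Mahler's theorem: $f(x) = \sum_n a_n \binom{x}{n}$ with $a_n = (\Delta^n f)(0)$. The sketch below computes truncated Mahler expansions by finite differences; this is standard number theory offered as context, not necessarily the paper's construction.

```python
from math import comb

def mahler_coeffs(f, N: int):
    """First N Mahler coefficients a_n = (Δ^n f)(0) via forward differences."""
    vals = [f(x) for x in range(N)]
    coeffs = []
    for _ in range(N):
        coeffs.append(vals[0])
        vals = [vals[i + 1] - vals[i] for i in range(len(vals) - 1)]
    return coeffs

def mahler_poly(coeffs, x: int) -> int:
    """Evaluate the truncated expansion sum_n a_n * binom(x, n)."""
    return sum(a * comb(x, n) for n, a in enumerate(coeffs))

f = lambda x: x**3 + 2 * x                                 # sample integer-valued function
a = mahler_coeffs(f, N=5)
assert all(mahler_poly(a, x) == f(x) for x in range(20))   # exact: degree 3 < N
```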
arXiv Detail & Related papers (2025-03-30T15:42:08Z)
- Geometry of fibers of the multiplication map of deep linear neural networks [0.0]
We study the geometry of the set of quivers of composable matrices which multiply to a fixed matrix.
Our solution is presented in three forms: a Poincaré series in equivariant cohomology, a quadratic integer program, and an explicit formula.
arXiv Detail & Related papers (2024-11-29T18:36:03Z)
- Learning Hierarchical Polynomials with Three-Layer Neural Networks [56.71223169861528]
We study the problem of learning hierarchical functions over the standard Gaussian distribution with three-layer neural networks.
For a large subclass of degree-$k$ polynomials $p$, a three-layer neural network trained via layerwise gradient descent on the square loss learns the target $h = g \circ p$ up to vanishing test error.
This work demonstrates the ability of three-layer neural networks to learn complex features and as a result, learn a broad class of hierarchical functions.
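A hedged sketch of the function class in question (the particular inner polynomial $p$ and outer polynomial $g$ are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 20, 500
w = rng.standard_normal(d)

p_feat = lambda X: (X @ w) ** 3    # inner degree-3 polynomial feature p(x)
g_out = lambda z: z**2 - z         # outer polynomial g
X = rng.standard_normal((n, d))    # standard Gaussian inputs
y = g_out(p_feat(X))               # hierarchical target h(x) = g(p(x))
```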
arXiv Detail & Related papers (2023-11-23T02:19:32Z)
- An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning [25.25903127886586]
We build universal function approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$.
In particular, we show that the required number of Dirac measures is determined by the structure of $\mathcal{X}$ and $\mathcal{Y}$.
arXiv Detail & Related papers (2023-04-24T16:18:22Z)
- There is no Accuracy-Interpretability Tradeoff in Reinforcement Learning for Mazes [64.05903267230467]
Interpretability is an essential building block for trustworthiness in reinforcement learning systems.
We show that in certain cases, one can achieve policy interpretability while maintaining its optimality.
arXiv Detail & Related papers (2022-06-09T04:23:26Z)
- AlgebraNets [35.311476442807766]
We study alternative algebras as number representations using the enwik8 and WikiText-103 datasets.
We consider $\mathbb{C}$, $\mathbb{H}$, $M_2(\mathbb{R})$, $M_3(\mathbb{R})$ and $M_4(\mathbb{R})$.
Multiplication in these algebras has higher compute density than real multiplication.
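A minimal sketch of the idea (not the AlgebraNets code): a linear layer whose scalars are elements of $M_2(\mathbb{R})$, so each weight is a $2 \times 2$ block and each "scalar" product is a $2 \times 2$ matrix product, i.e., 8 multiply-adds per 4 stored values instead of 1 per 1 for reals.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_out = 4, 3
W = rng.standard_normal((d_out, d_in, 2, 2))   # weights living in M_2(R)
x = rng.standard_normal((d_in, 2, 2))          # inputs living in M_2(R)

# y[o] = sum_i W[o, i] @ x[i]: matrix product replaces scalar product
y = np.einsum("oiab,ibc->oac", W, x)
print(y.shape)                                 # (3, 2, 2): algebra-valued activations
```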
arXiv Detail & Related papers (2020-06-12T17:51:20Z)
- Stochastic Flows and Geometric Optimization on the Orthogonal Group [52.50121190744979]
We present a new class of geometrically-driven optimization algorithms on the orthogonal group $O(d)$.
We show that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, flows and metric learning.
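For intuition, here is a generic Riemannian-SGD step on $O(d)$ (the standard skew-projection plus Cayley-retraction recipe, not necessarily the paper's specific flows):

```python
import numpy as np

def cayley_step(X, G, lr=0.1):
    """One descent step that keeps X exactly orthogonal."""
    A = G @ X.T - X @ G.T                                     # skew-symmetric generator
    I = np.eye(X.shape[0])
    Q = np.linalg.solve(I + 0.5 * lr * A, I - 0.5 * lr * A)   # Cayley transform: orthogonal
    return Q @ X

rng = np.random.default_rng(3)
d = 5
X = np.linalg.qr(rng.standard_normal((d, d)))[0]   # initial point on O(d)
G = rng.standard_normal((d, d))                    # placeholder Euclidean gradient
X = cayley_step(X, G)
print(np.allclose(X @ X.T, np.eye(d)))             # True: still on O(d)
```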
arXiv Detail & Related papers (2020-03-30T15:37:50Z)
- Backward Feature Correction: How Deep Learning Performs Deep (Hierarchical) Learning [66.05472746340142]
This paper analyzes how multi-layer neural networks can perform hierarchical learning _efficiently_ and _automatically_ by SGD on the training objective.
We establish a new principle called "backward feature correction", where the errors in the lower-level features can be automatically corrected when training together with the higher-level layers.
arXiv Detail & Related papers (2020-01-13T17:28:29Z)