v-PuNNs: van der Put Neural Networks for Transparent Ultrametric Representation Learning
- URL: http://arxiv.org/abs/2508.01010v1
- Date: Fri, 01 Aug 2025 18:23:38 GMT
- Title: v-PuNNs: van der Put Neural Networks for Transparent Ultrametric Representation Learning
- Authors: Gnankan Landry Regis N'guessan,
- Abstract summary: We introduce van der Put Neural Networks (v-PuNNs), the first architecture whose neurons are characteristic functions of p-adic balls in $\mathbb{Z}_p$. Under our Transparent Ultrametric Representation Learning (TURL) principle every weight is itself a p-adic number, giving exact subtree semantics. v-PuNNs therefore bridge number theory and deep learning, offering exact, interpretable, and efficient models for hierarchical data.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Conventional deep learning models embed data in Euclidean space $\mathbb{R}^d$, a poor fit for strictly hierarchical objects such as taxa, word senses, or file systems. We introduce van der Put Neural Networks (v-PuNNs), the first architecture whose neurons are characteristic functions of p-adic balls in $\mathbb{Z}_p$. Under our Transparent Ultrametric Representation Learning (TURL) principle every weight is itself a p-adic number, giving exact subtree semantics. A new Finite Hierarchical Approximation Theorem shows that a depth-K v-PuNN with $\sum_{j=0}^{K-1}p^{\,j}$ neurons universally represents any K-level tree. Because gradients vanish in this discrete space, we propose Valuation-Adaptive Perturbation Optimization (VAPO), with a fast deterministic variant (HiPaN-DS) and a moment-based one (HiPaN / Adam-VAPO). On three canonical benchmarks our CPU-only implementation sets new state-of-the-art: WordNet nouns (52,427 leaves) 99.96% leaf accuracy in 16 min; GO molecular-function 96.9% leaf / 100% root in 50 s; NCBI Mammalia Spearman $\rho = -0.96$ with true taxonomic distance. The learned metric is perfectly ultrametric (zero triangle violations), and its fractal and information-theoretic properties are analyzed. Beyond classification we derive structural invariants for quantum systems (HiPaQ) and controllable generative codes for tabular data (Tab-HiPaN). v-PuNNs therefore bridge number theory and deep learning, offering exact, interpretable, and efficient models for hierarchical data.
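To make the abstract's construction concrete, here is a minimal Python sketch assuming only what the abstract states: a neuron is the characteristic (indicator) function of a p-adic ball in $\mathbb{Z}_p$, a ball of radius $p^{-j}$ is a residue class mod $p^j$, and a depth-K network uses $\sum_{j=0}^{K-1}p^{\,j}$ such neurons. The names below (ball_indicator, enumerate_balls) are hypothetical illustrations, not the authors' API, and the VAPO training procedure is not sketched.

```python
# Illustrative sketch only -- not the authors' implementation.
# A "neuron" is the characteristic function of a p-adic ball in Z_p:
# the ball of radius p^(-level) around `center` is the residue class
# { x : x == center (mod p^level) }.

def ball_indicator(x: int, center: int, level: int, p: int) -> int:
    """Return 1 if x lies in the p-adic ball of radius p^(-level) around center."""
    return 1 if (x - center) % p**level == 0 else 0

def enumerate_balls(p: int, K: int):
    """All balls of levels 0..K-1; level j contributes p^j centers (residues mod p^j)."""
    return [(center, level) for level in range(K) for center in range(p**level)]

p, K = 3, 4                      # hypothetical branching factor and tree depth
balls = enumerate_balls(p, K)

# Neuron count from the Finite Hierarchical Approximation Theorem: sum_{j=0}^{K-1} p^j
assert len(balls) == sum(p**j for j in range(K))   # 1 + 3 + 9 + 27 = 40

# A leaf of a K-level p-ary tree can be identified with a residue mod p^K; its
# activation pattern over the ball neurons encodes its root-to-leaf path exactly.
leaf = 17 % p**K
pattern = [ball_indicator(leaf, c, j, p) for (c, j) in balls]
print(len(balls), sum(pattern))  # 40 neurons; exactly K = 4 of them fire (one per level)
```

Because these activations are 0/1 step functions, they carry no usable gradient signal, which is the motivation the abstract gives for the derivative-free VAPO optimizer.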
Related papers
- Biology-inspired joint distribution neurons based on Hierarchical Correlation Reconstruction allowing for multidirectional neural networks
Novel artificial neurons based on HCR (Hierarchical Correlation Reconstruction) are proposed, allowing low-level differences to be removed. Such an HCR network can also propagate probability distributions (including joint distributions) like $\rho(y,z|x)$. It also allows for additional training approaches, such as direct $(a_{\mathbf{j}})$ estimation through tensor decomposition.
arXiv Detail & Related papers (2024-05-08T14:49:27Z) - Multi-layer random features and the approximation power of neural networks
We prove that a reproducing kernel Hilbert space contains only functions that can be approximated by the architecture.
We show that if the eigenvalues of the integral operator of the NNGP decay slower than $k^{-n-\frac{2}{3}}$, where $k$ is the order of an eigenvalue, our theorem guarantees a more succinct neural network approximation than Barron's theorem.
arXiv Detail & Related papers (2024-04-26T14:57:56Z) - A Classification of $G$-invariant Shallow Neural Networks
We prove a theorem that gives a classification of all $G$-invariant single-hidden-layer or "shallow" neural network ($G$-SNN) architectures with ReLU activation.
We enumerate the $G$-SNN architectures for some example groups $G$ and visualize their structure.
arXiv Detail & Related papers (2022-05-18T21:18:16Z) - Exploring the Common Principal Subspace of Deep Features in Neural Networks
We find that different Deep Neural Networks (DNNs) trained with the same dataset share a common principal subspace in latent spaces.
Specifically, we design a new metric $\mathcal{P}$-vector to represent the principal subspace of deep features learned in a DNN.
Small angles (with cosine close to $1.0$) have been found in the comparisons between any two DNNs trained with different algorithms/architectures.
arXiv Detail & Related papers (2021-10-06T15:48:32Z) - The Separation Capacity of Random Neural Networks
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can solve this separation problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - To Boost or not to Boost: On the Limits of Boosted Neural Networks
Boosting is a method for learning an ensemble of classifiers.
While boosting has been shown to be very effective for decision trees, its impact on neural networks has not been extensively studied.
We find that a single neural network usually generalizes better than a boosted ensemble of smaller neural networks with the same total number of parameters.
arXiv Detail & Related papers (2021-07-28T19:10:03Z) - Learning Syllogism with Euler Neural-Networks
The central vector of a ball inherits the representation power of a traditional neural network.
A novel back-propagation algorithm with six Rectified Spatial Units (ReSU) can optimize an Euler diagram representing logical premises.
In contrast to traditional neural networks, ENNs can precisely represent all 24 different structures of syllogism.
arXiv Detail & Related papers (2020-07-14T19:35:35Z) - Learning Reasoning Strategies in End-to-End Differentiable Proving
Conditional Theorem Provers learn an optimal rule-selection strategy via gradient-based optimisation.
We show that Conditional Theorem Provers are scalable and yield state-of-the-art results on the CLUTRR dataset.
arXiv Detail & Related papers (2020-07-13T16:22:14Z) - Deep Polynomial Neural Networks
$\Pi$-Nets are a new class of function approximators based on polynomial expansions.
$\Pi$-Nets produce state-of-the-art results in three challenging tasks, i.e. image generation, face verification, and 3D mesh representation learning.
arXiv Detail & Related papers (2020-06-20T16:23:32Z) - Neural Bayes: A Generic Parameterization Method for Unsupervised Representation Learning
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z) - Backward Feature Correction: How Deep Learning Performs Deep (Hierarchical) Learning
This paper analyzes how multi-layer neural networks can perform hierarchical learning _efficiently_ and _automatically_ by SGD on the training objective.
We establish a new principle called "backward feature correction", where the errors in the lower-level features can be automatically corrected when training together with the higher-level layers.
arXiv Detail & Related papers (2020-01-13T17:28:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.