Training Neural Networks by Optimizing Neuron Positions
- URL: http://arxiv.org/abs/2506.13410v1
- Date: Mon, 16 Jun 2025 12:26:13 GMT
- Title: Training Neural Networks by Optimizing Neuron Positions
- Authors: Laura Erb, Tommaso Boccato, Alexandru Vasilache, Juergen Becker, Nicola Toschi
- Abstract summary: We propose a parameter-efficient neural architecture where neurons are embedded in Euclidean space. During training, their positions are optimized and synaptic weights are determined as the inverse of the spatial distance between connected neurons. These distance-dependent wiring rules replace traditional learnable weight matrices and significantly reduce the number of parameters while introducing a biologically inspired inductive bias.
- Score: 39.682133213072554
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The high computational complexity and increasing parameter counts of deep neural networks pose significant challenges for deployment in resource-constrained environments, such as edge devices or real-time systems. To address this, we propose a parameter-efficient neural architecture where neurons are embedded in Euclidean space. During training, their positions are optimized and synaptic weights are determined as the inverse of the spatial distance between connected neurons. These distance-dependent wiring rules replace traditional learnable weight matrices and significantly reduce the number of parameters while introducing a biologically inspired inductive bias: connection strength decreases with spatial distance, reflecting the brain's embedding in three-dimensional space where connections tend to minimize wiring length. We validate this approach for both multi-layer perceptrons and spiking neural networks. Through a series of experiments, we demonstrate that these spatially embedded neural networks achieve a performance competitive with conventional architectures on the MNIST dataset. Additionally, the models maintain performance even at pruning rates exceeding 80% sparsity, outperforming traditional networks with the same number of parameters under similar conditions. Finally, the spatial embedding framework offers an intuitive visualization of the network structure.
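The wiring rule described in the abstract is simple enough to sketch. The layer below is a hypothetical reconstruction for illustration, not the authors' released code: the only learnable parameters are neuron coordinates, and the effective weight between two connected neurons is the reciprocal of their Euclidean distance (the epsilon, the 3-D embedding, and the dense all-to-all connectivity are assumptions made here).

```python
import torch
import torch.nn as nn

class DistanceLayer(nn.Module):
    """Sketch of a spatially embedded layer: learnable neuron positions,
    weights given by inverse Euclidean distance (assumed parameterization)."""

    def __init__(self, in_features, out_features, dim=3, eps=1e-6):
        super().__init__()
        # Learnable coordinates of input and output neurons in `dim`-dimensional space.
        self.in_pos = nn.Parameter(torch.randn(in_features, dim))
        self.out_pos = nn.Parameter(torch.randn(out_features, dim))
        self.eps = eps

    def forward(self, x):
        # Pairwise Euclidean distances between output and input neurons: (out, in).
        dist = torch.cdist(self.out_pos, self.in_pos)
        weight = 1.0 / (dist + self.eps)   # connection strength decays with distance
        return x @ weight.t()

# Toy usage on MNIST-sized inputs: (784 + 128) * 3 position parameters
# instead of 784 * 128 dense weights.
layer = DistanceLayer(784, 128)
print(layer(torch.randn(32, 784)).shape)   # torch.Size([32, 128])
```

Under this parameterization, one natural pruning strategy (an assumption, not necessarily the paper's exact procedure) is to drop the longest, and therefore weakest, connections, which is consistent with the high-sparsity robustness reported above.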
Related papers
- SWAT-NN: Simultaneous Weights and Architecture Training for Neural Networks in a Latent Space [6.2241272327831485]
We propose a framework that simultaneously optimizes both the architecture and the weights of a neural network. Our framework first trains a universal multi-scale autoencoder that embeds both architectural and parametric information into a continuous latent space. Given a dataset, we then randomly initialize a point in the embedding space and update it via gradient descent to obtain the optimal neural network.
arXiv Detail & Related papers (2025-06-09T22:22:37Z)
- Lorentzian Residual Neural Networks [15.257990326035694]
We introduce LResNet, a novel Lorentzian residual neural network based on the weighted Lorentzian centroid in the Lorentz model of hyperbolic geometry. Our method enables the efficient integration of residual connections in hyperbolic neural networks while preserving their hierarchical representation capabilities. Our findings highlight the potential of LResNet for building more expressive neural networks in hyperbolic embedding space.
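For context on the "weighted Lorentzian centroid" mentioned above, here is a generic toy sketch (standard Lorentz-model formulas, not code from the LResNet paper): a residual-style combination of two hyperboloid points can be formed as their weighted centroid, which stays on the manifold.

```python
import torch

def lorentz_inner(x, y):
    # Lorentzian inner product <x, y>_L = -x0*y0 + sum_{i>=1} xi*yi
    return -x[..., 0] * y[..., 0] + (x[..., 1:] * y[..., 1:]).sum(-1)

def expmap0(v):
    # Lift a Euclidean vector onto the hyperboloid <x, x>_L = -1 via the exponential map at the origin.
    n = v.norm(dim=-1, keepdim=True).clamp_min(1e-9)
    return torch.cat([torch.cosh(n), torch.sinh(n) * v / n], dim=-1)

def lorentz_centroid(points, weights):
    # Weighted centroid: normalize the weighted ambient-space sum back onto the hyperboloid.
    s = (weights.unsqueeze(-1) * points).sum(dim=0)
    return s / torch.sqrt(torch.clamp(-lorentz_inner(s, s), min=1e-9))

# Residual-style combination of an input point x and a branch output f(x),
# with equal weights chosen arbitrarily for this toy example.
x, fx = expmap0(torch.randn(4)), expmap0(torch.randn(4))
y = lorentz_centroid(torch.stack([x, fx]), torch.tensor([1.0, 1.0]))
print(lorentz_inner(y, y))   # approximately -1: the result remains on the hyperboloid
```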
arXiv Detail & Related papers (2024-12-19T09:56:01Z)
- Hybrid deep additive neural networks [0.0]
We introduce novel deep neural networks that incorporate the idea of additive regression. Our neural networks share architectural similarities with Kolmogorov-Arnold networks but are based on simpler yet flexible activation and basis functions. We derive their universal approximation properties and demonstrate their effectiveness through simulation studies and a real-data application.
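The additive-regression idea can be illustrated generically (a sketch of a plain additive neural model, not the hybrid architecture proposed in the paper above): each input feature is processed by its own small subnetwork and the per-feature outputs are summed.

```python
import torch
import torch.nn as nn

class AdditiveNet(nn.Module):
    """f(x) = sum_j g_j(x_j): one small MLP per input feature, outputs summed."""

    def __init__(self, n_features, hidden=16):
        super().__init__()
        self.components = nn.ModuleList([
            nn.Sequential(nn.Linear(1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
            for _ in range(n_features)
        ])

    def forward(self, x):                                     # x: (batch, n_features)
        parts = [g(x[:, j:j + 1]) for j, g in enumerate(self.components)]
        return torch.stack(parts, dim=0).sum(dim=0)           # (batch, 1)

model = AdditiveNet(n_features=5)
print(model(torch.randn(8, 5)).shape)   # torch.Size([8, 1])
```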
arXiv Detail & Related papers (2024-11-14T04:26:47Z)
- Systematic construction of continuous-time neural networks for linear dynamical systems [0.0]
We discuss a systematic approach to constructing neural architectures for modeling a subclass of dynamical systems.
We use a variant of continuous-time neural networks in which the output of each neuron evolves continuously as a solution of a first-order or second-order Ordinary Differential Equation (ODE).
Instead of deriving the network architecture and parameters from data, we propose a gradient-free algorithm to compute a sparse architecture and network parameters directly from the given LTI (linear time-invariant) system.
arXiv Detail & Related papers (2024-03-24T16:16:41Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
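A minimal way to see what "neural networks as computational graphs of parameters" means in practice (a toy sketch under assumptions, not the paper's exact graph encoding): treat every neuron as a node and every weight as an attributed edge, producing a graph that a standard GNN could consume.

```python
import torch
import torch.nn as nn

def mlp_to_graph(mlp):
    """Convert an nn.Sequential of Linear layers into (edge_index, edge_weight),
    one node per neuron; biases are ignored in this toy sketch."""
    linears = [m for m in mlp if isinstance(m, nn.Linear)]
    sizes = [linears[0].in_features] + [m.out_features for m in linears]
    offsets = [0]
    for s in sizes[:-1]:
        offsets.append(offsets[-1] + s)                    # node-index offset of each layer
    edges, weights = [], []
    for k, m in enumerate(linears):
        for i in range(m.out_features):
            for j in range(m.in_features):
                edges.append((offsets[k] + j, offsets[k + 1] + i))
                weights.append(m.weight[i, j].item())
    return torch.tensor(edges).t(), torch.tensor(weights)

mlp = nn.Sequential(nn.Linear(3, 4), nn.ReLU(), nn.Linear(4, 2))
edge_index, edge_weight = mlp_to_graph(mlp)
print(edge_index.shape, edge_weight.shape)   # torch.Size([2, 20]) torch.Size([20])
```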
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
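To make the temporal and neuronal sparsity point concrete, here is a standard leaky integrate-and-fire (LIF) simulation (textbook dynamics, not the regression framework proposed in the paper above): neurons communicate through sparse binary spikes rather than dense activations.

```python
import torch

def lif_simulate(input_current, tau=20.0, v_th=1.0, v_reset=0.0, dt=1.0):
    """Leaky integrate-and-fire: tau * dv/dt = -v + I(t); spike and reset at threshold."""
    v = torch.zeros(input_current.shape[1])        # membrane potential per neuron
    spikes = []
    for i_t in input_current:                      # iterate over time steps
        v = v + (dt / tau) * (-v + i_t)            # Euler step for the membrane potential
        s = (v >= v_th).float()                    # binary spikes where the threshold is crossed
        v = torch.where(s.bool(), torch.full_like(v, v_reset), v)
        spikes.append(s)
    return torch.stack(spikes)                     # (time, neurons) spike train

spike_train = lif_simulate(2.0 * torch.rand(100, 8))
print(spike_train.mean())   # fraction of active (neuron, time-step) pairs -> sparse activity
```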
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation [55.80128181112308]
We show that dimensionality and quasi-orthogonality of neural networks' feature space may jointly serve as discriminants of a network's performance.
Our findings suggest important relationships between the networks' final performance and properties of their randomly initialised feature spaces.
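The quasi-orthogonality quantity mentioned above can be probed with a simple measurement (a generic sketch, with random vectors standing in for a randomly initialised feature space, not the paper's exact estimator): in high dimensions, pairwise cosine similarities concentrate near zero, i.e., the features become quasi-orthogonal.

```python
import torch
import torch.nn.functional as F

def quasi_orthogonality(features):
    """Mean absolute pairwise cosine similarity; values near 0 indicate quasi-orthogonal features."""
    f = F.normalize(features, dim=1)
    cos = f @ f.t()
    n = f.shape[0]
    off_diag = cos - torch.diag(torch.diag(cos))          # zero out self-similarities
    return off_diag.abs().sum() / (n * (n - 1))

# Random features of increasing dimension: the measure shrinks toward zero.
for d in (8, 64, 512):
    print(d, float(quasi_orthogonality(torch.randn(200, d))))
```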
arXiv Detail & Related papers (2022-03-30T21:47:32Z)
- Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)
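The mean-field viewpoint from the last entry is easiest to see in the two-layer case (shown here as a hedged illustration of the classical two-layer mean-field parameterization, not the paper's full multi-layer framework): with a 1/N scaling, the network output depends on its neurons only through their empirical distribution, which admits a probability-measure description as the width grows.

```python
import torch

def mean_field_two_layer(x, a, w):
    """f(x) = (1/N) * sum_i a_i * relu(<w_i, x>): the output is a functional of the
    empirical measure over the neuron parameters (a_i, w_i)."""
    return (a * torch.relu(x @ w.t())).sum(dim=1) / a.shape[0]

x = torch.randn(16, 10)                      # a small batch of inputs
for n in (100, 10_000):                      # widen the hidden layer
    a, w = torch.randn(n), torch.randn(n, 10)
    print(n, mean_field_two_layer(x, a, w).std().item())
# Output fluctuations shrink as N grows, consistent with a continuous (mean-field) limit.
```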
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it presents (including all of the above) and is not responsible for any consequences arising from its use.