Hebbian Learning from First Principles
- URL: http://arxiv.org/abs/2401.07110v2
- Date: Thu, 03 Oct 2024 15:12:40 GMT
- Title: Hebbian Learning from First Principles
- Authors: Linda Albanese, Adriano Barra, Pierluigi Bianco, Fabrizio Durante, Diego Pallara
- Abstract summary: The storage prescription of the Hopfield model has been turned into a Hebbian learning rule by postulating the expression of its Hamiltonian for supervised and unsupervised protocols.
We show how Lagrangian constraints within entropy extremization force the network's outcomes on neural correlations.
Remarks on the exponential Hopfield model (as the limit of dense networks with diverging density) and semi-supervised protocols are also provided.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, the original storage prescription for the Hopfield model of neural networks -- as well as for its dense generalizations -- has been turned into a genuine Hebbian learning rule by postulating the expression of its Hamiltonian for both the supervised and unsupervised protocols. In these notes, first, we obtain these explicit expressions by relying upon maximum entropy extremization à la Jaynes. Beyond providing a formal derivation of these recipes for Hebbian learning, this construction also highlights how Lagrangian constraints within entropy extremization force the network's outcomes on neural correlations: these try to mimic the empirical counterparts hidden in the datasets provided to the network for its training and, the denser the network, the longer the correlations that it is able to capture. Next, we prove that, in the big data limit, whether or not a teacher is present, not only do these Hebbian learning rules converge to the original storage prescription of the Hopfield model, but so do their related free energies (and, thus, the statistical mechanical picture provided by Amit, Gutfreund and Sompolinsky is fully recovered). As a sideline, we show the mathematical equivalence between standard cost functions (Hamiltonians), preferred in statistical mechanical jargon, and quadratic loss functions, preferred in machine learning terminology. Remarks on the exponential Hopfield model (as the limit of dense networks with diverging density) and semi-supervised protocols are also provided.
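The big-data claim in the abstract can be illustrated numerically with a minimal sketch. It assumes binary archetypes, examples obtained by flipping each archetype bit independently (each bit is kept with probability (1 + r) / 2, where r is a dataset-quality parameter), and a simplified 1/r^2 rescaling of the couplings. The function names and this normalization are illustrative assumptions, not the paper's exact expressions.

```python
import numpy as np

def make_examples(archetypes, M, r, rng):
    """Return M noisy copies of each archetype, shape (K, M, N).

    Each bit is kept with probability (1 + r) / 2 (assumed noise model).
    """
    K, N = archetypes.shape
    chi = np.where(rng.random((K, M, N)) < (1 + r) / 2, 1, -1)
    return archetypes[:, None, :] * chi

def hopfield_couplings(archetypes):
    """Hopfield storage prescription: J_ij = (1/N) sum_mu xi_i^mu xi_j^mu."""
    K, N = archetypes.shape
    J = archetypes.T @ archetypes / N
    np.fill_diagonal(J, 0.0)
    return J

def supervised_couplings(examples, r):
    """A teacher groups examples by archetype; Hebb acts on the class means."""
    K, M, N = examples.shape
    means = examples.mean(axis=1)          # shape (K, N)
    J = means.T @ means / (N * r**2)       # 1/r^2 rescaling is an assumption
    np.fill_diagonal(J, 0.0)
    return J

def unsupervised_couplings(examples, r):
    """No teacher: Hebb acts on every example separately, then averages."""
    K, M, N = examples.shape
    flat = examples.reshape(K * M, N)
    J = flat.T @ flat / (N * M * r**2)     # 1/r^2 rescaling is an assumption
    np.fill_diagonal(J, 0.0)
    return J

def max_gap(J, J_ref):
    """Largest entrywise deviation between two coupling matrices."""
    return float(np.abs(J - J_ref).max())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    xi = rng.choice([-1, 1], size=(3, 50))  # K = 3 archetypes, N = 50 neurons
    J_ref = hopfield_couplings(xi)
    for M in (20, 200, 2000):
        eta = make_examples(xi, M, r=0.6, rng=rng)
        print(M,
              max_gap(supervised_couplings(eta, 0.6), J_ref),
              max_gap(unsupervised_couplings(eta, 0.6), J_ref))
```

Under these assumptions both gaps should shrink as the number of examples M grows, which is the coupling-level counterpart of the convergence stated in the abstract; the sketch says nothing about the free energies.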
Related papers
- Revealing Decurve Flows for Generalized Graph Propagation [108.80758541147418]
This study addresses the limitations of the traditional analysis of message-passing, central to graph learning, by defining generalized propagation with directed and weighted graphs.
We include a preliminary exploration of learned propagation patterns in datasets, a first in the field.
arXiv Detail & Related papers (2024-02-13T14:13:17Z)
- Statistical Mechanics of Learning via Reverberation in Bidirectional Associative Memories [0.0]
We study bi-directional associative neural networks that are exposed to noisy examples of random archetypes.
In this setting, learning is heteroassociative -- involving couples of patterns -- and it is achieved by reverberating the information depicted from the examples.
arXiv Detail & Related papers (2023-07-17T10:04:04Z)
- Fundamental limits of overparametrized shallow neural networks for supervised learning [11.136777922498355]
We study a two-layer neural network trained from input-output pairs generated by a teacher network with matching architecture.
Our results come in the form of bounds relating i) the mutual information between training data and network weights, or ii) the Bayes-optimal generalization error.
arXiv Detail & Related papers (2023-07-11T08:30:50Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- Dense Hebbian neural networks: a replica symmetric picture of supervised learning [4.133728123207142]
We consider dense, associative neural-networks trained by a teacher with supervision.
We investigate their computational capabilities analytically, via statistical-mechanics of spin glasses, and numerically, via Monte Carlo simulations.
arXiv Detail & Related papers (2022-11-25T13:37:47Z)
- Neural network enhanced measurement efficiency for molecular groundstates [63.36515347329037]
We adapt common neural network models to learn complex groundstate wavefunctions for several molecular qubit Hamiltonians.
We find that using a neural network model provides a robust improvement over using single-copy measurement outcomes alone to reconstruct observables.
arXiv Detail & Related papers (2022-06-30T17:45:05Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training [39.137793683411424]
We introduce the Layer-Peeled Model, a nonconvex yet analytically tractable optimization program.
We show that the model inherits many characteristics of well-trained networks, thereby offering an effective tool for explaining and predicting common empirical patterns of deep learning training.
In particular, we show that the model reveals a hitherto unknown phenomenon that we term Minority Collapse, which fundamentally limits the performance of deep learning models on the minority classes.
arXiv Detail & Related papers (2021-01-29T17:37:17Z)
- Emergence of a finite-size-scaling function in the supervised learning of the Ising phase transition [0.7658140759553149]
We investigate the connection between the supervised learning of the binary phase classification in the ferromagnetic Ising model and the standard finite-size-scaling theory of the second-order phase transition.
We show that just one free parameter is capable enough to describe the data-driven emergence of the universal finite-size-scaling function in the network output.
arXiv Detail & Related papers (2020-10-01T12:34:12Z)
- Hyperbolic Neural Networks++ [66.16106727715061]
We generalize the fundamental components of neural networks in a single hyperbolic geometry model, namely, the Poincaré ball model.
Experiments show the superior parameter efficiency of our methods compared to conventional hyperbolic components, and stability and outperformance over their Euclidean counterparts.
arXiv Detail & Related papers (2020-06-15T08:23:20Z)
- Parsimonious neural networks learn interpretable physical laws [77.34726150561087]
We propose parsimonious neural networks (PNNs) that combine neural networks with evolutionary optimization to find models that balance accuracy with parsimony.
The power and versatility of the approach is demonstrated by developing models for classical mechanics and to predict the melting temperature of materials from fundamental properties.
arXiv Detail & Related papers (2020-05-08T16:15:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.