Unsupervised and Supervised learning by Dense Associative Memory under
replica symmetry breaking
- URL: http://arxiv.org/abs/2312.09638v1
- Date: Fri, 15 Dec 2023 09:27:46 GMT
- Title: Unsupervised and Supervised learning by Dense Associative Memory under
replica symmetry breaking
- Authors: Linda Albanese, Andrea Alessandrelli, Alessia Annibale, Adriano Barra
- Abstract summary: Hebbian attractor networks with multi-node interactions have been shown to outperform classical pairwise counterparts in a number of tasks.
We derive the one-step broken-replica-symmetry picture of supervised and unsupervised learning protocols for these Associative Memories.
- Score: 0.24999074238880487
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Statistical mechanics of spin glasses is one of the main strands toward a
comprehension of information processing by neural networks and learning
machines. Within this approach, at the fairly standard replica symmetric
level of description, Hebbian attractor networks with multi-node
interactions (often called Dense Associative Memories) have recently been
shown to outperform their classical pairwise counterparts in a number of
tasks, from robustness against adversarial attacks and the capability to work
with prohibitively weak signals to supra-linear storage capacities. Focusing
on mathematical techniques more than computational aspects, in this paper we
relax the replica symmetric assumption and we derive the one-step
broken-replica-symmetry picture of supervised and unsupervised learning
protocols for these Dense Associative Memories: a phase diagram in the space of
the control parameters is derived independently, both via Parisi's
hierarchy within the replica trick and via Guerra's telescope
within the broken-replica interpolation. Further, an explicit analytical
investigation is provided to deepen both the big-data and ground state limits
of these networks as well as a proof that replica symmetry breaking does not
alter the thresholds for learning and slightly increases the maximal storage
capacity. Finally, the de Almeida-Thouless line, which marks the onset of
instability of the replica symmetric description, is also derived
analytically, highlighting how, once this boundary is crossed, the
broken-replica description should be preferred.
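The multi-node Hebbian energy underlying Dense Associative Memories can be sketched in a few lines. The following is a minimal illustration, not the paper's exact model: the network size N, pattern count K, interaction order p, and noise level are arbitrary illustrative choices, and the zero-temperature dynamics uses the full overlap (self-term included) as a large-N approximation.

```python
import numpy as np

rng = np.random.default_rng(0)

N, K, p = 200, 30, 3                   # neurons, stored patterns, interaction order (illustrative)
xi = rng.choice([-1, 1], size=(K, N))  # random binary patterns

def energy(sigma):
    # Dense (p-body) Hebbian energy: E = -sum_mu (xi_mu . sigma)^p / (p N^(p-1));
    # p = 2 recovers the classical pairwise Hopfield model.
    overlaps = xi @ sigma
    return -np.sum(overlaps ** p) / (p * N ** (p - 1))

def retrieve(sigma, sweeps=10):
    # Zero-temperature asynchronous dynamics: align each spin with its local field.
    sigma = sigma.copy()
    for _ in range(sweeps):
        for i in rng.permutation(N):
            overlaps = xi @ sigma
            # Local field from the p-body couplings (self-term kept, fine for large N)
            h_i = np.sum(xi[:, i] * overlaps ** (p - 1)) / N ** (p - 1)
            sigma[i] = 1 if h_i >= 0 else -1
    return sigma

# Corrupt 15% of the first pattern's spins and let the dynamics recover it
noisy = xi[0].copy()
flip = rng.choice(N, size=int(0.15 * N), replace=False)
noisy[flip] *= -1
recovered = retrieve(noisy)
print(np.mean(recovered == xi[0]))  # fraction of correctly recovered spins
```

With p > 2 the energy wells around the stored patterns become much steeper, which is the mechanism behind the supra-linear storage capacity discussed in the abstract.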
Related papers
- Measurement Induced Dynamics and Trace Preserving Replica Cutoffs [0.0]
We present a general methodology for addressing the infinite hierarchy problem that arises in measurement-induced dynamics of replicated quantum systems.
Our approach introduces trace-preserving replica cutoffs using tomographic-like techniques to estimate higher-order replica states from lower ones.
This guarantees that the dynamics of single-replica systems correctly reduce to standard Lindblad evolution.
arXiv Detail & Related papers (2025-04-01T17:20:42Z) - Learning Broken Symmetries with Approximate Invariance [1.0485739694839669]
In many cases, the exact underlying symmetry is present only in an idealized dataset, and is broken in actual data.
Standard approaches, such as data augmentation or equivariant networks, fail to represent the nature of the full, broken symmetry.
We propose a learning model which balances the generality and performance of unconstrained networks with the rapid learning of constrained networks.
arXiv Detail & Related papers (2024-12-25T04:29:04Z) - Fundamental operating regimes, hyper-parameter fine-tuning and glassiness: towards an interpretable replica-theory for trained restricted Boltzmann machines [0.0]
We consider Boltzmann machines with a binary visible layer and a Gaussian hidden layer trained by an unlabelled dataset composed of noisy realizations of a single ground pattern.
We develop a statistical mechanics framework to describe the network generative capabilities, by exploiting the replica trick and assuming self-averaging of the underlying order parameters.
arXiv Detail & Related papers (2024-06-14T11:12:00Z) - Symmetry-protection Zeno phase transition in monitored lattice gauge theories [0.0]
We show the existence of a sharp transition, triggered by the measurement rate, between a protected gauge-theory regime and an irregular regime.
Our results shed light on the dissipative criticality of strongly-interacting, highly-constrained quantum systems.
arXiv Detail & Related papers (2024-05-28T18:18:06Z) - Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with attention mechanism, we can effectively boost performance without huge computational overhead.
We show our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z) - Learning Layer-wise Equivariances Automatically using Gradients [66.81218780702125]
Convolutions encode equivariance symmetries into neural networks leading to better generalisation performance.
However, such symmetries impose fixed hard constraints on the functions a network can represent; they must be specified in advance and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients.
arXiv Detail & Related papers (2023-10-09T20:22:43Z) - Statistical Mechanics of Learning via Reverberation in Bidirectional
Associative Memories [0.0]
We study bi-directional associative neural networks that are exposed to noisy examples of random archetypes.
In this setting, learning is heteroassociative -- involving couples of patterns -- and it is achieved by reverberating the information extracted from the examples.
arXiv Detail & Related papers (2023-07-17T10:04:04Z) - Long Sequence Hopfield Memory [32.28395813801847]
Sequence memory enables agents to encode, store, and retrieve complex sequences of stimuli and actions.
We introduce a nonlinear interaction term, enhancing separation between the patterns.
We extend this model to store sequences with variable timing between states' transitions.
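The idea of a nonlinear interaction term that separates sequence patterns can be sketched as follows. This is a hypothetical minimal illustration, not the cited paper's model: asymmetric Hebbian couplings map each pattern onto its successor, and raising the overlaps to an odd power n (here n = 3, an arbitrary choice) suppresses cross-talk between nearby patterns.

```python
import numpy as np

rng = np.random.default_rng(1)
N, L, n = 100, 5, 3  # neurons, sequence length, separation exponent (illustrative)

seq = rng.choice([-1, 1], size=(L, N))  # sequence of patterns to replay in order

def step(sigma):
    # Asymmetric update: overlap with pattern t drives the state toward pattern t+1.
    # The odd power n sharpens the overlaps, separating correlated patterns.
    overlaps = seq @ sigma / N            # overlap of the state with each stored pattern
    field = seq[1:].T @ overlaps[:-1] ** n  # pattern t contributes a pull toward pattern t+1
    return np.where(field >= 0, 1, -1)

# Start at the first pattern and let the dynamics replay the sequence
sigma = seq[0].copy()
for t in range(1, L):
    sigma = step(sigma)
    print(t, np.mean(sigma == seq[t]))  # agreement with the t-th stored pattern
```

Because the cross-talk overlaps are O(N^{-1/2}), cubing them makes the interference negligible relative to the signal, so each synchronous step advances the state cleanly to the next pattern.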
arXiv Detail & Related papers (2023-06-07T15:41:03Z) - Annihilation of Spurious Minima in Two-Layer ReLU Networks [9.695960412426672]
We study the optimization problem associated with fitting two-layer ReLU neural networks with respect to the squared loss.
We show that adding neurons can turn symmetric spurious minima into saddles.
We also prove the existence of descent directions in certain subspaces arising from the symmetry structure of the loss function.
arXiv Detail & Related papers (2022-10-12T11:04:21Z) - Symmetric Pruning in Quantum Neural Networks [111.438286016951]
Quantum neural networks (QNNs) exert the power of modern quantum machines.
QNNs with handcrafted symmetric ansatzes generally experience better trainability than those with asymmetric ansatzes.
We propose the effective quantum neural tangent kernel (EQNTK) to quantify the convergence of QNNs towards the global optima.
arXiv Detail & Related papers (2022-08-30T08:17:55Z) - Boundary theories of critical matchgate tensor networks [59.433172590351234]
Key aspects of the AdS/CFT correspondence can be captured in terms of tensor network models on hyperbolic lattices.
For tensors fulfilling the matchgate constraint, these have previously been shown to produce disordered boundary states.
We show that these Hamiltonians exhibit multi-scale quasiperiodic symmetries captured by an analytical toy model.
arXiv Detail & Related papers (2021-10-06T18:00:03Z) - Sampling asymmetric open quantum systems for artificial neural networks [77.34726150561087]
We present a hybrid sampling strategy which takes asymmetric properties explicitly into account, achieving fast convergence times and high scalability for asymmetric open systems.
We highlight the universal applicability of artificial neural networks to this class of systems.
arXiv Detail & Related papers (2020-12-20T18:25:29Z) - Asymmetric GANs for Image-to-Image Translation [62.49892218126542]
Existing models for Generative Adversarial Networks (GANs) learn the mapping from the source domain to the target domain using a cycle-consistency loss.
We propose an AsymmetricGAN model with translation and reconstruction generators of unequal sizes and a different parameter-sharing strategy.
Experiments on both supervised and unsupervised generative tasks with 8 datasets show that AsymmetricGAN achieves superior model capacity and better generation performance.
arXiv Detail & Related papers (2019-12-14T21:24:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.