On Sparse Modern Hopfield Model
- URL: http://arxiv.org/abs/2309.12673v2
- Date: Wed, 29 Nov 2023 22:45:39 GMT
- Title: On Sparse Modern Hopfield Model
- Authors: Jerry Yao-Chieh Hu, Donglin Yang, Dennis Wu, Chenwei Xu, Bo-Yu Chen,
Han Liu
- Abstract summary: We introduce the sparse modern Hopfield model as a sparse extension of the modern Hopfield model.
We show that the sparse modern Hopfield model maintains the robust theoretical properties of its dense counterpart.
- Score: 12.288884253562845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the sparse modern Hopfield model as a sparse extension of the
modern Hopfield model. Like its dense counterpart, the sparse modern Hopfield
model equips a memory-retrieval dynamics whose one-step approximation
corresponds to the sparse attention mechanism. Theoretically, our key
contribution is a principled derivation of a closed-form sparse Hopfield energy
using the convex conjugate of the sparse entropic regularizer. Building upon
this, we derive the sparse memory retrieval dynamics from the sparse energy
function and show its one-step approximation is equivalent to the
sparse-structured attention. Importantly, we provide a sparsity-dependent
memory retrieval error bound which is provably tighter than its dense analog.
The conditions for the benefits of sparsity to arise are therefore identified
and discussed. In addition, we show that the sparse modern Hopfield model
maintains the robust theoretical properties of its dense counterpart, including
rapid fixed point convergence and exponential memory capacity. Empirically, we
use both synthetic and real-world datasets to demonstrate that the sparse
Hopfield model outperforms its dense counterpart in many situations.
Related papers
- Nonparametric Modern Hopfield Models [12.160725212848137]
We present a nonparametric construction for deep learning compatible modern Hopfield models.
Key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models.
We introduce textitsparse-structured modern Hopfield models with sub-quadratic complexity.
arXiv Detail & Related papers (2024-04-05T05:46:20Z) - On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis [12.72277128564391]
We investigate the computational limits of the memory retrieval dynamics of modern Hopfield models.
We establish an upper bound criterion for the norm of input query patterns and memory patterns.
We prove its memory retrieval error bound and exponential memory capacity.
arXiv Detail & Related papers (2024-02-07T01:58:21Z) - STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series
Prediction [13.815793371488613]
We present a novel Hopfield-based neural network block, which sparsely learns and stores both temporal and cross-series representations.
In essence, STanHop sequentially learn temporal representation and cross-series representation using two tandem sparse Hopfield layers.
We show that our framework endows a tighter memory retrieval error compared to the dense counterpart without sacrificing memory capacity.
arXiv Detail & Related papers (2023-12-28T20:26:23Z) - Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimension modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z) - ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model-class namely "Denoising Diffusion Probabilistic Models" or DDPMs for chirographic data.
Our model named "ChiroDiff", being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rate.
arXiv Detail & Related papers (2023-04-07T15:17:48Z) - Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
convergence rate analysis of the mean field Langevin dynamics is presented.
$p_q$ associated with the dynamics allows us to develop a convergence theory parallel to classical results in convex optimization.
arXiv Detail & Related papers (2022-01-25T17:13:56Z) - Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z) - Boundary theories of critical matchgate tensor networks [59.433172590351234]
Key aspects of the AdS/CFT correspondence can be captured in terms of tensor network models on hyperbolic lattices.
For tensors fulfilling the matchgate constraint, these have previously been shown to produce disordered boundary states.
We show that these Hamiltonians exhibit multi-scale quasiperiodic symmetries captured by an analytical toy model.
arXiv Detail & Related papers (2021-10-06T18:00:03Z) - Real-Time Evolution in the Hubbard Model with Infinite Repulsion [0.0]
We consider the real-time evolution of the Hubbard model in the limit of infinite coupling.
We show that the quench dynamics from product states in the occupation basis can be determined exactly in terms of correlations in the tight-binding model.
arXiv Detail & Related papers (2021-09-30T17:51:01Z) - Fast approximations in the homogeneous Ising model for use in scene
analysis [61.0951285821105]
We provide accurate approximations that make it possible to numerically calculate quantities needed in inference.
We show that our approximation formulae are scalable and unfazed by the size of the Markov Random Field.
The practical import of our approximation formulae is illustrated in performing Bayesian inference in a functional Magnetic Resonance Imaging activation detection experiment, and also in likelihood ratio testing for anisotropy in the spatial patterns of yearly increases in pistachio tree yields.
arXiv Detail & Related papers (2017-12-06T14:24:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.