On Sparse Modern Hopfield Model
- URL: http://arxiv.org/abs/2309.12673v2
- Date: Wed, 29 Nov 2023 22:45:39 GMT
- Title: On Sparse Modern Hopfield Model
- Authors: Jerry Yao-Chieh Hu, Donglin Yang, Dennis Wu, Chenwei Xu, Bo-Yu Chen,
Han Liu
- Abstract summary: We introduce the sparse modern Hopfield model as a sparse extension of the modern Hopfield model.
We show that the sparse modern Hopfield model maintains the robust theoretical properties of its dense counterpart.
- Score: 12.288884253562845
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce the sparse modern Hopfield model as a sparse extension of the
modern Hopfield model. Like its dense counterpart, the sparse modern Hopfield
model equips a memory-retrieval dynamics whose one-step approximation
corresponds to the sparse attention mechanism. Theoretically, our key
contribution is a principled derivation of a closed-form sparse Hopfield energy
using the convex conjugate of the sparse entropic regularizer. Building upon
this, we derive the sparse memory retrieval dynamics from the sparse energy
function and show its one-step approximation is equivalent to the
sparse-structured attention. Importantly, we provide a sparsity-dependent
memory retrieval error bound which is provably tighter than its dense analog.
The conditions for the benefits of sparsity to arise are therefore identified
and discussed. In addition, we show that the sparse modern Hopfield model
maintains the robust theoretical properties of its dense counterpart, including
rapid fixed point convergence and exponential memory capacity. Empirically, we
use both synthetic and real-world datasets to demonstrate that the sparse
Hopfield model outperforms its dense counterpart in many situations.
Related papers
- Provably Optimal Memory Capacity for Modern Hopfield Models: Transformer-Compatible Dense Associative Memories as Spherical Codes [6.477597248683852]
We study the optimal capacity memorization of modern Hopfield models and Kernelized Hopfield Models (KHMs)
We show that the optimal capacity of KHMs occurs when the feature space allows memories to form an optimal spherical code.
arXiv Detail & Related papers (2024-10-30T15:35:51Z) - Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Nonparametric Modern Hopfield Models [12.160725212848137]
We present a nonparametric construction for deep learning compatible modern Hopfield models.
Key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models.
We introduce textitsparse-structured modern Hopfield models with sub-quadratic complexity.
arXiv Detail & Related papers (2024-04-05T05:46:20Z) - On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis [12.72277128564391]
We investigate the computational limits of the memory retrieval dynamics of modern Hopfield models.
We establish an upper bound criterion for the norm of input query patterns and memory patterns.
We prove its memory retrieval error bound and exponential memory capacity.
arXiv Detail & Related papers (2024-02-07T01:58:21Z) - Geometric Neural Diffusion Processes [55.891428654434634]
We extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimension modelling.
We show that with these conditions, the generative functional model admits the same symmetry.
arXiv Detail & Related papers (2023-07-11T16:51:38Z) - ChiroDiff: Modelling chirographic data with Diffusion Models [132.5223191478268]
We introduce a powerful model-class namely "Denoising Diffusion Probabilistic Models" or DDPMs for chirographic data.
Our model named "ChiroDiff", being non-autoregressive, learns to capture holistic concepts and therefore remains resilient to higher temporal sampling rate.
arXiv Detail & Related papers (2023-04-07T15:17:48Z) - Convex Analysis of the Mean Field Langevin Dynamics [49.66486092259375]
convergence rate analysis of the mean field Langevin dynamics is presented.
$p_q$ associated with the dynamics allows us to develop a convergence theory parallel to classical results in convex optimization.
arXiv Detail & Related papers (2022-01-25T17:13:56Z) - Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z) - Real-Time Evolution in the Hubbard Model with Infinite Repulsion [0.0]
We consider the real-time evolution of the Hubbard model in the limit of infinite coupling.
We show that the quench dynamics from product states in the occupation basis can be determined exactly in terms of correlations in the tight-binding model.
arXiv Detail & Related papers (2021-09-30T17:51:01Z) - Fast approximations in the homogeneous Ising model for use in scene
analysis [61.0951285821105]
We provide accurate approximations that make it possible to numerically calculate quantities needed in inference.
We show that our approximation formulae are scalable and unfazed by the size of the Markov Random Field.
The practical import of our approximation formulae is illustrated in performing Bayesian inference in a functional Magnetic Resonance Imaging activation detection experiment, and also in likelihood ratio testing for anisotropy in the spatial patterns of yearly increases in pistachio tree yields.
arXiv Detail & Related papers (2017-12-06T14:24:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.