On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis
- URL: http://arxiv.org/abs/2402.04520v5
- Date: Sat, 1 Jun 2024 00:49:17 GMT
- Title: On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis
- Authors: Jerry Yao-Chieh Hu, Thomas Lin, Zhao Song, Han Liu,
- Abstract summary: We investigate the computational limits of the memory retrieval dynamics of modern Hopfield models.
We establish an upper bound criterion for the norm of input query patterns and memory patterns.
We prove its memory retrieval error bound and exponential memory capacity.
- Score: 12.72277128564391
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We investigate the computational limits of the memory retrieval dynamics of modern Hopfield models from the fine-grained complexity analysis. Our key contribution is the characterization of a phase transition behavior in the efficiency of all possible modern Hopfield models based on the norm of patterns. Specifically, we establish an upper bound criterion for the norm of input query patterns and memory patterns. Only below this criterion, sub-quadratic (efficient) variants of the modern Hopfield model exist, assuming the Strong Exponential Time Hypothesis (SETH). To showcase our theory, we provide a formal example of efficient constructions of modern Hopfield models using low-rank approximation when the efficient criterion holds. This includes a derivation of a lower bound on the computational time, scaling linearly with $\max\{$# of stored memory patterns, length of input query sequence$\}$. In addition, we prove its memory retrieval error bound and exponential memory capacity.
Related papers
- Provably Optimal Memory Capacity for Modern Hopfield Models: Transformer-Compatible Dense Associative Memories as Spherical Codes [6.477597248683852]
We study the optimal capacity memorization of modern Hopfield models and Kernelized Hopfield Models (KHMs)
We show that the optimal capacity of KHMs occurs when the feature space allows memories to form an optimal spherical code.
arXiv Detail & Related papers (2024-10-30T15:35:51Z) - Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Nonparametric Modern Hopfield Models [12.160725212848137]
We present a nonparametric construction for deep learning compatible modern Hopfield models.
Key contribution stems from interpreting the memory storage and retrieval processes in modern Hopfield models.
We introduce textitsparse-structured modern Hopfield models with sub-quadratic complexity.
arXiv Detail & Related papers (2024-04-05T05:46:20Z) - Fast Shapley Value Estimation: A Unified Approach [71.92014859992263]
We propose a straightforward and efficient Shapley estimator, SimSHAP, by eliminating redundant techniques.
In our analysis of existing approaches, we observe that estimators can be unified as a linear transformation of randomly summed values from feature subsets.
Our experiments validate the effectiveness of our SimSHAP, which significantly accelerates the computation of accurate Shapley values.
arXiv Detail & Related papers (2023-11-02T06:09:24Z) - On Sparse Modern Hopfield Model [12.288884253562845]
We introduce the sparse modern Hopfield model as a sparse extension of the modern Hopfield model.
We show that the sparse modern Hopfield model maintains the robust theoretical properties of its dense counterpart.
arXiv Detail & Related papers (2023-09-22T07:32:45Z) - Predicting Ordinary Differential Equations with Transformers [65.07437364102931]
We develop a transformer-based sequence-to-sequence model that recovers scalar ordinary differential equations (ODEs) in symbolic form from irregularly sampled and noisy observations of a single solution trajectory.
Our method is efficiently scalable: after one-time pretraining on a large set of ODEs, we can infer the governing law of a new observed solution in a few forward passes of the model.
arXiv Detail & Related papers (2023-07-24T08:46:12Z) - Structured Optimal Variational Inference for Dynamic Latent Space Models [16.531262817315696]
We consider a latent space model for dynamic networks, where our objective is to estimate the pairwise inner products plus the intercept of the latent positions.
To balance posterior inference and computational scalability, we consider a structured mean-field variational inference framework.
arXiv Detail & Related papers (2022-09-29T22:10:42Z) - Adaptive LASSO estimation for functional hidden dynamic geostatistical
model [69.10717733870575]
We propose a novel model selection algorithm based on a penalized maximum likelihood estimator (PMLE) for functional hiddenstatistical models (f-HD)
The algorithm is based on iterative optimisation and uses an adaptive least absolute shrinkage and selector operator (GMSOLAS) penalty function, wherein the weights are obtained by the unpenalised f-HD maximum-likelihood estimators.
arXiv Detail & Related papers (2022-08-10T19:17:45Z) - Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
arXiv Detail & Related papers (2022-01-08T00:47:50Z) - Goal-directed Generation of Discrete Structures with Conditional
Generative Models [85.51463588099556]
We introduce a novel approach to directly optimize a reinforcement learning objective, maximizing an expected reward.
We test our methodology on two tasks: generating molecules with user-defined properties and identifying short python expressions which evaluate to a given target value.
arXiv Detail & Related papers (2020-10-05T20:03:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.