Hopfield-Fenchel-Young Networks: A Unified Framework for Associative Memory Retrieval
- URL: http://arxiv.org/abs/2411.08590v1
- Date: Wed, 13 Nov 2024 13:13:07 GMT
- Title: Hopfield-Fenchel-Young Networks: A Unified Framework for Associative Memory Retrieval
- Authors: Saul Santos, Vlad Niculae, Daniel McNamee, André F. T. Martins
- Abstract summary: Associative memory models, such as Hopfield networks, have garnered renewed interest due to advancements in memory capacity and connections with self-attention in transformers.
In this work, we introduce a unified framework, Hopfield-Fenchel-Young networks, which generalizes these models to a broader family of energy functions.
- Score: 25.841394444834933
- Abstract: Associative memory models, such as Hopfield networks and their modern variants, have garnered renewed interest due to advancements in memory capacity and connections with self-attention in transformers. In this work, we introduce a unified framework, Hopfield-Fenchel-Young networks, which generalizes these models to a broader family of energy functions. Our energies are formulated as the difference between two Fenchel-Young losses: one, parameterized by a generalized entropy, defines the Hopfield scoring mechanism, while the other applies a post-transformation to the Hopfield output. By utilizing Tsallis and norm entropies, we derive end-to-end differentiable update rules that enable sparse transformations, uncovering new connections between loss margins, sparsity, and exact retrieval of single memory patterns. We further extend this framework to structured Hopfield networks using the SparseMAP transformation, allowing the retrieval of pattern associations rather than a single pattern. Our framework unifies and extends traditional and modern Hopfield networks and provides an energy minimization perspective for widely used post-transformations like $\ell_2$-normalization and layer normalization, all through suitable choices of Fenchel-Young losses and by using convex analysis as a building block. Finally, we validate our Hopfield-Fenchel-Young networks on diverse memory recall tasks, including free and sequential recall. Experiments on simulated data, image retrieval, multiple instance learning, and text rationalization demonstrate the effectiveness of our approach.
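For intuition, here is a minimal NumPy sketch (ours, not the authors' code) of the retrieval rule this framework yields when the Fenchel-Young loss is instantiated with the Tsallis $\alpha=2$ entropy, i.e. sparsemax attention over the stored patterns; the general framework also covers softmax ($\alpha=1$), norm entropies, post-transformations, and SparseMAP, none of which are shown here. The names `sparsemax` and `hfy_retrieve` are ours.

```python
import numpy as np

def sparsemax(z):
    # Euclidean projection of z onto the probability simplex:
    # the regularized argmax induced by the Tsallis alpha=2 entropy.
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, z.size + 1)
    cumsum = np.cumsum(z_sorted)
    support = 1.0 + k * z_sorted > cumsum
    k_star = k[support][-1]
    tau = (cumsum[k_star - 1] - 1.0) / k_star
    return np.maximum(z - tau, 0.0)

def hfy_retrieve(X, q, beta=1.0, steps=3):
    # Sparse Hopfield update: q <- X^T sparsemax(beta * X q),
    # a sparse convex combination of the memory rows of X.
    for _ in range(steps):
        q = X.T @ sparsemax(beta * X @ q)
    return q

# Toy usage: recover a stored pattern from a corrupted query.
rng = np.random.default_rng(0)
X = rng.choice([-1.0, 1.0], size=(8, 32))   # 8 stored patterns (rows)
q = X[3] + 0.4 * rng.standard_normal(32)    # noisy version of pattern 3
print(np.argmax(X @ hfy_retrieve(X, q)))    # -> 3
```

When the query is close enough to one memory, sparsemax puts all of its mass on that single pattern and retrieval is exact, which is the connection between loss margins, sparsity, and exact retrieval that the abstract mentions.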
Related papers
- Dense Associative Memory Through the Lens of Random Features [48.17520168244209]
Dense Associative Memories are high-storage-capacity variants of Hopfield networks.
We show that this novel network closely approximates the energy function and dynamics of conventional Dense Associative Memories.
arXiv Detail & Related papers (2024-10-31T17:10:57Z)
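As a toy illustration of the random-feature idea above (our reading of the abstract, not the paper's exact construction; all names here are ours), the log-sum-exp energy of a Dense Associative Memory with exponential separation can be approximated by compressing all memories into one fixed-size feature vector:

```python
import numpy as np

def positive_features(U, W):
    # Performer-style positive random features:
    # E[phi(x) @ phi(y)] = exp(x @ y) when the rows of W are N(0, I).
    sq = 0.5 * np.sum(U ** 2, axis=-1, keepdims=True)
    return np.exp(U @ W.T - sq) / np.sqrt(W.shape[0])

rng = np.random.default_rng(0)
d, n, m, beta = 16, 10, 4096, 0.5
X = rng.choice([-1.0, 1.0], size=(n, d))     # stored memories (rows)
q = X[0] + 0.3 * rng.standard_normal(d)      # query near memory 0
W = rng.standard_normal((m, d))

# Every memory folded into a single m-dimensional vector.
memory_vec = positive_features(np.sqrt(beta) * X, W).sum(axis=0)

E_exact = -np.log(np.sum(np.exp(beta * X @ q)))
E_approx = -np.log(positive_features(np.sqrt(beta) * q[None], W)[0] @ memory_vec)
print(E_exact, E_approx)   # close in expectation; the estimate is noisy
```

The variance of such an estimate grows with beta and with the inner products, so this only illustrates the mechanism, not the paper's accuracy guarantees.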
- Nonparametric Modern Hopfield Models [12.160725212848137]
We present a nonparametric construction for deep learning-compatible modern Hopfield models.
The key contribution stems from reinterpreting the memory storage and retrieval processes in modern Hopfield models.
We introduce sparse-structured modern Hopfield models with sub-quadratic complexity.
arXiv Detail & Related papers (2024-04-05T05:46:20Z)
- Sparse and Structured Hopfield Networks [14.381907888022612]
We provide a unified framework for sparse Hopfield networks by establishing a link with Fenchel-Young losses.
We reveal a connection between loss margins, sparsity, and exact memory retrieval.
Experiments on multiple instance learning and text rationalization demonstrate the usefulness of our approach.
arXiv Detail & Related papers (2024-02-21T11:35:45Z)
- STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction [13.815793371488613]
We present a novel Hopfield-based neural network block, which sparsely learns and stores both temporal and cross-series representations.
In essence, STanHop sequentially learns temporal and cross-series representations using two tandem sparse Hopfield layers.
We show that our framework achieves a tighter memory retrieval error than its dense counterpart without sacrificing memory capacity.
arXiv Detail & Related papers (2023-12-28T20:26:23Z)
- Simplicial Hopfield networks [0.0]
We extend Hopfield networks by adding setwise connections and embedding these connections in a simplicial complex.
We show that our simplicial Hopfield networks increase memory storage capacity.
We also test analogous modern continuous Hopfield networks, offering a potentially promising avenue for improving the attention mechanism in Transformer models.
arXiv Detail & Related papers (2023-05-09T05:23:04Z)
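To make "setwise connections" concrete, here is a tiny sketch (our own toy construction, not the paper's exact model) of a binary Hopfield energy extended with Hebbian-weighted triangle terms living on a simplicial complex:

```python
import numpy as np
from itertools import combinations

def simplicial_energy(s, patterns, triangles):
    # Pairwise Hopfield couplings plus setwise (2-simplex) couplings;
    # every weight is a Hebbian sum over the stored patterns.
    E = 0.0
    for i, j in combinations(range(s.size), 2):
        E -= np.sum(patterns[:, i] * patterns[:, j]) * s[i] * s[j]
    for i, j, k in triangles:   # the chosen 2-simplices
        E -= np.sum(patterns[:, i] * patterns[:, j] * patterns[:, k]) * s[i] * s[j] * s[k]
    return E

rng = np.random.default_rng(0)
P = rng.choice([-1, 1], size=(3, 12))        # 3 stored patterns, 12 neurons
tris = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]     # a small simplicial complex
s_random = rng.choice([-1, 1], size=12)      # a random state for comparison
print(simplicial_energy(P[0], P, tris), simplicial_energy(s_random, P, tris))
# stored patterns typically sit at lower energy than random states
```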
- PnP-DETR: Towards Efficient Visual Analysis with Transformers [146.55679348493587]
Recently, DETR pioneered the solution of vision tasks with transformers: it directly translates the image feature map into the object detection result.
The approach also generalizes to the recent transformer-based image recognition model ViT and shows a consistent efficiency gain.
arXiv Detail & Related papers (2021-09-15T01:10:30Z)
- Semantic Correspondence with Transformers [68.37049687360705]
We propose Cost Aggregation with Transformers (CATs) to find dense correspondences between semantically similar images.
We include appearance affinity modelling to disambiguate the initial correlation maps, together with multi-level aggregation.
We conduct experiments to demonstrate the effectiveness of the proposed model over the latest methods and provide extensive ablation studies.
arXiv Detail & Related papers (2021-06-04T14:39:03Z)
- Mean Field Game GAN [55.445402222849474]
We propose a novel mean field games (MFGs) based GAN (generative adversarial network) framework.
We utilize the Hopf formula in density space to rewrite MFGs as a primal-dual problem so that we are able to train the model via neural networks and samples.
arXiv Detail & Related papers (2021-03-14T06:34:38Z)
- Hopfield Networks is All You Need [8.508381229662907]
We introduce a modern Hopfield network with continuous states and a corresponding update rule.
The new Hopfield network can store exponentially many patterns (in the dimension of the associative space), retrieves a pattern with one update, and has exponentially small retrieval errors.
We demonstrate the broad applicability of the Hopfield layers across various domains.
arXiv Detail & Related papers (2020-07-16T17:52:37Z)
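The update rule stated in that abstract is a single attention step over the memories. Here is a minimal NumPy sketch (ours, not the paper's code), with patterns stored as rows of X:

```python
import numpy as np

def modern_hopfield_step(X, q, beta=8.0):
    # One continuous modern Hopfield update: q <- X^T softmax(beta * X q).
    # Per the paper, one step typically retrieves a pattern with
    # exponentially small error.
    z = beta * X @ q
    p = np.exp(z - z.max())           # numerically stable softmax
    return X.T @ (p / p.sum())

rng = np.random.default_rng(1)
X = rng.choice([-1.0, 1.0], size=(6, 64))          # 6 stored patterns
q = X[2] + 0.5 * rng.standard_normal(64)           # corrupted query
print(np.argmax(X @ modern_hopfield_step(X, q)))   # -> 2
```

Replacing the softmax here with a sparse transformation such as sparsemax recovers the sparse variants discussed in the main paper above.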
- Feature Transformation Ensemble Model with Batch Spectral Regularization for Cross-Domain Few-Shot Classification [66.91839845347604]
We propose an ensemble prediction model by performing diverse feature transformations after a feature extraction network.
We use a batch spectral regularization term to suppress the singular values of the feature matrix during pre-training to improve the generalization ability of the model.
The proposed model can then be fine-tuned in the target domain to address few-shot classification.
arXiv Detail & Related papers (2020-05-18T05:31:04Z)
- Targeted free energy estimation via learned mappings [66.20146549150475]
Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences.
FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions.
One strategy to mitigate this problem, called Targeted Free Energy Perturbation, uses a high-dimensional mapping in configuration space to increase overlap.
arXiv Detail & Related papers (2020-02-12T11:10:00Z)
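For reference, Zwanzig's identity, and (as we read the targeted variant; notation ours) its generalization through an invertible configuration-space map $M$ with Jacobian $J_M$:

$$\Delta F = -\beta^{-1} \ln \left\langle e^{-\beta\,\Delta U(x)} \right\rangle_A, \qquad \Delta F = -\beta^{-1} \ln \left\langle e^{-\beta\,\Phi(x)} \right\rangle_A,$$

$$\Phi(x) = U_B(M(x)) - U_A(x) - \beta^{-1} \ln \left|\det J_M(x)\right|.$$

The first estimator converges poorly without overlap between the two distributions; the map $M$ is chosen (or, in this paper, learned) so that mapped samples from $A$ overlap the configurations typical of $B$.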
This list is automatically generated from the titles and abstracts of the papers in this site.