Hopfield Networks is All You Need
- URL: http://arxiv.org/abs/2008.02217v3
- Date: Wed, 28 Apr 2021 07:24:49 GMT
- Title: Hopfield Networks is All You Need
- Authors: Hubert Ramsauer, Bernhard Schäfl, Johannes Lehner, Philipp Seidl,
Michael Widrich, Thomas Adler, Lukas Gruber, Markus Holzleitner, Milena
Pavlović, Geir Kjetil Sandve, Victor Greiff, David Kreil, Michael Kopp,
Günter Klambauer, Johannes Brandstetter, Sepp Hochreiter
- Abstract summary: We introduce a modern Hopfield network with continuous states and a corresponding update rule.
The new Hopfield network can store exponentially many patterns (exponential in the dimension of the associative space), retrieve a pattern with one update, and achieve exponentially small retrieval errors.
We demonstrate the broad applicability of the Hopfield layers across various domains.
- Score: 8.508381229662907
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a modern Hopfield network with continuous states and a
corresponding update rule. The new Hopfield network can store exponentially
many patterns (exponential in the dimension of the associative space), retrieve
a pattern with one update, and achieve exponentially small retrieval errors. It has
three types of energy minima (fixed points of the update): (1) global fixed
point averaging over all patterns, (2) metastable states averaging over a
subset of patterns, and (3) fixed points which store a single pattern. The new
update rule is equivalent to the attention mechanism used in transformers. This
equivalence enables a characterization of the heads of transformer models:
heads in the first layers preferably perform global averaging, while heads in
higher layers perform partial averaging via metastable states. The new modern Hopfield
network can be integrated into deep learning architectures as layers to allow
the storage of and access to raw input data, intermediate results, or learned
prototypes. These Hopfield layers enable new ways of deep learning, beyond
fully-connected, convolutional, or recurrent networks, and provide pooling,
memory, association, and attention mechanisms. We demonstrate the broad
applicability of the Hopfield layers across various domains. Hopfield layers
improved state-of-the-art on three out of four considered multiple instance
learning problems as well as on immune repertoire classification with several
hundreds of thousands of instances. On the UCI benchmark collections of small
classification tasks, where deep learning methods typically struggle, Hopfield
layers yielded a new state-of-the-art when compared to different machine
learning methods. Finally, Hopfield layers achieved state-of-the-art on two
drug design datasets. The implementation is available at:
https://github.com/ml-jku/hopfield-layers
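For intuition, the new update rule can be written out directly. Below is a minimal numpy sketch, not the reference implementation linked above; the pattern matrix X, state xi, and inverse temperature beta are illustrative names. Batched over queries and with beta = 1/sqrt(d_k), the same expression is the transformer attention mechanism mentioned in the abstract.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                    # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def hopfield_update(X, xi, beta):
    """One retrieval step of the modern Hopfield network:
    xi_new = X softmax(beta * X^T xi).
    X: (d, N) matrix of N stored patterns; xi: (d,) state/query."""
    return X @ softmax(beta * (X.T @ xi))

# Batched over a query matrix Q with keys K and values V, the same rule reads
#   softmax(beta * Q @ K.T) @ V,
# i.e. transformer attention with beta = 1/sqrt(d_k).

# Toy retrieval: store random patterns, query with a noisy copy of one of them.
rng = np.random.default_rng(0)
d, N = 64, 16
X = rng.standard_normal((d, N))
xi = X[:, 3] + 0.1 * rng.standard_normal(d)   # noisy version of pattern 3
xi_new = hopfield_update(X, xi, beta=8.0)
print(np.linalg.norm(xi_new - X[:, 3]))       # one update, small retrieval error
```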
Related papers
- Hopfield-Fenchel-Young Networks: A Unified Framework for Associative Memory Retrieval [25.841394444834933]
Associative memory models, such as Hopfield networks, have garnered renewed interest due to advancements in memory capacity and connections with self-attention in transformers.
In this work, we introduce a unified framework, Hopfield-Fenchel-Young networks, which generalizes these models to a broader family of energy functions.
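As a hedged illustration of what a broader family of energy functions can buy, the sketch below swaps the softmax of the modern Hopfield update for sparsemax, one member of the regularized argmax family that Fenchel-Young losses cover; sparse weights mean retrieval attends to only a subset of patterns. The function names are ours, not the authors' API.

```python
import numpy as np

def sparsemax(z):
    """Euclidean projection of z onto the probability simplex
    (Martins & Astudillo, 2016); yields sparse retrieval weights."""
    z_sorted = np.sort(z)[::-1]
    k = np.arange(1, z.size + 1)
    cssv = np.cumsum(z_sorted) - 1.0
    support = z_sorted - cssv / k > 0
    rho = k[support][-1]
    tau = cssv[support][-1] / rho
    return np.maximum(z - tau, 0.0)

def fy_hopfield_update(X, xi, beta, transform):
    """Generalized retrieval step: xi_new = X transform(beta * X^T xi).
    transform = softmax recovers the modern Hopfield network;
    transform = sparsemax retrieves from a sparse subset of patterns."""
    return X @ transform(beta * (X.T @ xi))
```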
arXiv Detail & Related papers (2024-11-13T13:13:07Z)
- Boosting Cross-Domain Point Classification via Distilling Relational Priors from 2D Transformers [59.0181939916084]
Traditional 3D networks mainly focus on local geometric details and ignore the topological structure between local geometries.
We propose a novel Relational Priors Distillation (RPD) method to extract relational priors from transformers well-trained on massive 2D images.
Experiments on the PointDA-10 and the Sim-to-Real datasets verify that the proposed method consistently achieves state-of-the-art unsupervised domain adaptation (UDA) performance for point cloud classification.
arXiv Detail & Related papers (2024-07-26T06:29:09Z)
- SegVG: Transferring Object Bounding Box to Segmentation for Visual Grounding [56.079013202051094]
We present SegVG, a novel method that transfers box-level annotations into signals that provide additional pixel-level supervision for Visual Grounding.
This approach allows us to iteratively exploit the annotation as signals for both box-level regression and pixel-level segmentation.
arXiv Detail & Related papers (2024-07-03T15:30:45Z)
- BiSHop: Bi-Directional Cellular Learning for Tabular Data with Generalized Sparse Modern Hopfield Model [6.888608574535993]
BiSHop handles the two major challenges of deep tabular learning: non-rotationally invariant data structure and feature sparsity in data.
BiSHop uses a dual-component approach, sequentially processing data both column-wise and row-wise.
We show that BiSHop surpasses current SOTA methods with significantly fewer hyperparameter optimization (HPO) runs.
arXiv Detail & Related papers (2024-04-04T23:13:32Z)
- Uniform Memory Retrieval with Larger Capacity for Modern Hopfield Models [5.929540708452128]
We propose a two-stage memory retrieval dynamics for modern Hopfield models.
The key contribution is a learnable feature map $\Phi$ that transforms the Hopfield energy function into kernel space.
It utilizes the stored memory patterns as learning data to enhance memory capacity across all modern Hopfield models.
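The summary names only the learnable feature map $\Phi$; as a rough sketch of the idea, retrieval can be run with similarities measured in the space that $\Phi$ maps into. Here $\Phi$ is stubbed as a fixed random linear map purely for illustration; the learned parameterization and the two-stage training procedure are in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
d, m, N = 64, 128, 16
X = rng.standard_normal((d, N))                  # stored memory patterns
W = rng.standard_normal((m, d)) / np.sqrt(d)     # stand-in for the *learned* map

def phi(Z):
    """Feature map Phi; a fixed linear map here, learnable in the paper."""
    return W @ Z

def kernel_hopfield_update(X, xi, beta):
    """Retrieval step with scores computed in Phi's feature space:
    xi_new = X softmax(beta * Phi(X)^T Phi(xi))."""
    scores = beta * (phi(X).T @ phi(xi))
    w = np.exp(scores - scores.max())
    return X @ (w / w.sum())

xi = X[:, 2] + 0.1 * rng.standard_normal(d)      # noisy query
print(np.linalg.norm(kernel_hopfield_update(X, xi, beta=1.0) - X[:, 2]))
```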
arXiv Detail & Related papers (2024-04-04T23:05:30Z)
- STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction [13.815793371488613]
We present a novel Hopfield-based neural network block, which sparsely learns and stores both temporal and cross-series representations.
In essence, STanHop sequentially learns temporal and cross-series representations using two tandem sparse Hopfield layers.
We show that our framework endows a tighter memory retrieval error compared to the dense counterpart without sacrificing memory capacity.
arXiv Detail & Related papers (2023-12-28T20:26:23Z)
- Improved Convergence Guarantees for Shallow Neural Networks [91.3755431537592]
We prove convergence of depth-2 neural networks, trained via gradient descent, to a global minimum.
Our model has the following features: regression with a quadratic loss function, a fully connected feedforward architecture, ReLU activations, Gaussian data instances, and adversarial labels.
Our results strongly suggest that, at least in our model, the convergence phenomenon extends well beyond the NTK regime.
arXiv Detail & Related papers (2022-12-05T14:47:52Z)
- Hierarchical Variational Memory for Few-shot Learning Across Domains [120.87679627651153]
We introduce a hierarchical prototype model, where each level of the prototype fetches corresponding information from the hierarchical memory.
The model is endowed with the ability to flexibly rely on features at different semantic levels if the domain shift circumstances so demand.
We conduct thorough ablation studies to demonstrate the effectiveness of each component in our model.
arXiv Detail & Related papers (2021-12-15T15:01:29Z)
- Modern Hopfield Networks and Attention for Immune Repertoire Classification [8.488102471604908]
We show that the attention mechanism of transformer architectures is actually the update rule of modern Hopfield networks.
We exploit this high storage capacity to solve a challenging multiple instance learning (MIL) problem in computational biology.
We present our novel method DeepRC that integrates transformer-like attention, or equivalently modern Hopfield networks, into deep learning architectures.
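The core ingredient named here, transformer-like attention used to pool a very large bag of instances into one fixed-size representation, can be sketched in a few lines; this is a hedged illustration with assumed names and dimensions, not the DeepRC code.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_pool(H, q, beta):
    """Pool a bag of N instance embeddings H (N, d) into a single vector.
    The fixed query q plays the role of the Hopfield state; the instances
    are the stored patterns (cf. the update rule of the main paper)."""
    w = softmax(beta * (H @ q))       # one attention weight per instance
    return w @ H                      # weighted average = bag representation

# A repertoire with hundreds of thousands of instances pools into one vector.
rng = np.random.default_rng(2)
H = rng.standard_normal((300_000, 32))
q = rng.standard_normal(32)
print(attention_pool(H, q, beta=1.0).shape)   # (32,)
```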
arXiv Detail & Related papers (2020-07-16T20:35:46Z)
- Fine-Grained Visual Classification with Efficient End-to-end Localization [49.9887676289364]
We present an efficient localization module that can be fused with a classification network in an end-to-end setup.
We evaluate the new model on three benchmark datasets: CUB-200-2011, Stanford Cars, and FGVC-Aircraft.
arXiv Detail & Related papers (2020-05-11T14:07:06Z)
- PointHop++: A Lightweight Learning Model on Point Sets for 3D Classification [55.887502438160304]
The PointHop method was recently proposed by Zhang et al. for 3D point cloud classification with unsupervised feature extraction.
We further improve the PointHop method in two respects: 1) reducing its model complexity in terms of the number of model parameters, and 2) ordering discriminant features automatically based on the cross-entropy criterion.
With experiments conducted on the ModelNet40 benchmark dataset, we show that the PointHop++ method performs on par with deep neural network (DNN) solutions and surpasses other unsupervised feature extraction methods.
arXiv Detail & Related papers (2020-02-09T04:49:32Z)