Improved Robustness and Hyperparameter Selection in Modern Hopfield Networks
- URL: http://arxiv.org/abs/2407.08742v1
- Date: Wed, 29 May 2024 01:23:19 GMT
- Title: Improved Robustness and Hyperparameter Selection in Modern Hopfield Networks
- Authors: Hayden McAlister, Anthony Robins, Lech Szymanski
- Abstract summary: The modern Hopfield network generalizes the classical Hopfield network by allowing for sharper interaction functions.
The implementation of the network relies on applying large exponents to the dot product of memory vectors and probe vectors.
We describe this problem in detail, modify the original network description to mitigate the problem, and show the modification does not alter the network's dynamics during update or training.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The modern Hopfield network generalizes the classical Hopfield network by allowing for sharper interaction functions. This increases the capacity of the network as an autoassociative memory, as nearby learned attractors will not interfere with one another. However, the implementation of the network relies on applying large exponents to the dot product of memory vectors and probe vectors. If the dimension of the data is large, these dot products can be very large, and raising them to large exponents causes problems with floating-point numbers in a practical implementation. We describe this problem in detail, modify the original network description to mitigate the problem, and show the modification does not alter the network's dynamics during update or training. We also show our modification greatly improves hyperparameter selection for the modern Hopfield network, removing the dependence on the interaction vertex and resulting in an optimal region of hyperparameters that does not change significantly with the interaction vertex, as it does in the original network.
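The overflow the abstract describes is easy to reproduce, and the spirit of the fix (rescaling similarities before exponentiation so that a common factor cancels in the normalized update, leaving the dynamics unchanged) can be sketched in NumPy. The rescaling below is an illustrative stand-in for the paper's exact modification; the function names and the weighted-average update rule are assumptions, not the authors' code:

```python
import numpy as np

def update_naive(memories, probe, n=100):
    # Polynomial interaction function x**n applied to raw dot products.
    # For high-dimensional data the dot products themselves are large
    # (here ~4096 for the matching memory), so the n-th power overflows
    # float64 and the normalized update degenerates to NaN.
    sims = memories @ probe
    weights = sims ** n
    return (weights[:, None] * memories).sum(axis=0) / weights.sum()

def update_stable(memories, probe, n=100):
    # Illustrative mitigation: rescale the similarities by their largest
    # magnitude before exponentiation, so the biggest power is exactly 1.
    # The common factor max**n cancels between numerator and denominator,
    # so the fixed points and update dynamics are unchanged.
    sims = memories @ probe
    weights = (sims / np.abs(sims).max()) ** n
    return (weights[:, None] * memories).sum(axis=0) / weights.sum()

rng = np.random.default_rng(0)
d = 4096
memories = rng.standard_normal((8, d))
probe = memories[0] + 0.1 * rng.standard_normal(d)

naive = update_naive(memories, probe)    # contains NaN: weights overflowed
stable = update_stable(memories, probe)  # finite, recalls memories[0]
```

Because the rescaling factor is shared by every weight, the two formulations are mathematically identical; only the floating-point behavior differs.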
Related papers
- Non-Separable Multi-Dimensional Network Flows for Visual Computing [62.50191141358778]
We propose a novel formalism for non-separable multi-dimensional network flows.
Since the flow is defined on a per-dimension basis, the maximizing flow automatically chooses the best matching feature dimensions.
As a proof of concept, we apply our formalism to the multi-object tracking problem and demonstrate that our approach outperforms scalar formulations on the MOT16 benchmark in terms of robustness to noise.
arXiv Detail & Related papers (2023-05-15T13:21:44Z) - Simplicial Hopfield networks [0.0]
We extend Hopfield networks by adding setwise connections and embedding these connections in a simplicial complex.
We show that our simplicial Hopfield networks increase memory storage capacity.
We also test analogous modern continuous Hopfield networks, offering a potentially promising avenue for improving the attention mechanism in Transformer models.
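The link between modern continuous Hopfield networks and the Transformer attention mechanism comes from the update rule of Ramsauer et al. ("Hopfield Networks is All You Need"), which has exactly the form of softmax attention over the stored patterns. A minimal sketch, where β, the pattern count, and the dimensions are illustrative choices:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax: subtracting the max changes nothing
    # mathematically but keeps exp() in range.
    e = np.exp(z - z.max())
    return e / e.sum()

def hopfield_update(X, xi, beta=16.0):
    # One step of the modern continuous Hopfield update
    # (Ramsauer et al., 2020): xi_new = X^T softmax(beta * X @ xi).
    # Reading the rows of X as keys/values and xi as a query, this is
    # softmax attention, which is why these networks relate to Transformers.
    return X.T @ softmax(beta * X @ xi)

rng = np.random.default_rng(1)
X = rng.standard_normal((16, 64))              # 16 stored patterns
X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm patterns
xi = X[3] + 0.05 * rng.standard_normal(64)     # noisy probe near pattern 3
retrieved = hopfield_update(X, xi)             # close to X[3] after one step
```

A single update typically suffices to retrieve the nearest stored pattern when β is large enough, which is the one-step retrieval property these papers build on.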
arXiv Detail & Related papers (2023-05-09T05:23:04Z) - Magnitude Invariant Parametrizations Improve Hypernetwork Learning [0.0]
Hypernetworks are powerful neural networks that predict the parameters of another neural network.
Training typically converges far more slowly than for non-hypernetwork models.
We identify a fundamental and previously unidentified problem that contributes to the challenge of training hypernetworks.
We present a simple solution to this problem using a revised hypernetwork formulation that we call Magnitude Invariant Parametrizations (MIP).
arXiv Detail & Related papers (2023-04-15T22:18:29Z) - Enhancing ResNet Image Classification Performance by using Parameterized Hypercomplex Multiplication [1.370633147306388]
This paper studies ResNet architectures and incorporates parameterized hypercomplex multiplication into the backend of residual, quaternion, and vectormap convolutional neural networks to assess the effect.
We show that PHM does improve classification accuracy performance on several image datasets, including small, low-resolution CIFAR 10/100 and large high-resolution ImageNet and ASL, and can achieve state-of-the-art accuracy for hypercomplex networks.
arXiv Detail & Related papers (2023-01-11T18:24:07Z) - Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z) - MicroNet: Improving Image Recognition with Extremely Low FLOPs [82.54764264255505]
We find that two factors, sparse connectivity and a dynamic activation function, are effective in improving accuracy.
We present a new dynamic activation function, named Dynamic Shift Max, to improve the non-linearity.
We arrive at a family of networks, named MicroNet, that achieves significant performance gains over the state of the art in the low FLOP regime.
arXiv Detail & Related papers (2021-08-12T17:59:41Z) - Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z) - Rethinking Skip Connection with Layer Normalization in Transformers and ResNets [49.87919454950763]
Skip connection is a widely-used technique to improve the performance of deep neural networks.
In this work, we investigate how scale factors affect the effectiveness of the skip connection.
arXiv Detail & Related papers (2021-05-15T11:44:49Z) - Dynamic Graph: Learning Instance-aware Connectivity for Neural Networks [78.65792427542672]
Dynamic Graph Network (DG-Net) is a complete directed acyclic graph, where the nodes represent convolutional blocks and the edges represent connection paths.
Instead of using the same fixed path through the network, DG-Net aggregates features dynamically at each node, which gives the network greater representational ability.
arXiv Detail & Related papers (2020-10-02T16:50:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.