Memory Capacity of Recurrent Neural Networks with Matrix Representation
- URL: http://arxiv.org/abs/2104.07454v3
- Date: Thu, 5 Oct 2023 03:47:41 GMT
- Title: Memory Capacity of Recurrent Neural Networks with Matrix Representation
- Authors: Animesh Renanse, Alok Sharma, Rohitash Chandra
- Abstract summary: We study a probabilistic notion of memory capacity based on Fisher information for matrix-based neural networks.
We show and analyze the increase in memory capacity that arises when such networks are equipped with an external state memory.
We find an improvement in the performance of Matrix NTMs by the addition of external memory.
- Score: 1.0978496459260902
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is well known that canonical recurrent neural networks (RNNs) face
limitations in learning long-term dependencies which have been addressed by
memory structures in long short-term memory (LSTM) networks. Neural Turing
machines (NTMs) are novel RNNs that implement the notion of programmable
computers with neural network controllers that can learn simple algorithmic
tasks. Matrix neural networks feature a matrix representation, which inherently
preserves the spatial structure of data, in contrast to canonical neural
networks that use vector-based representations. One may then argue that neural
networks with matrix representations have the potential to provide better
memory capacity. In this paper, we define and study a probabilistic notion of
memory capacity based on Fisher information for matrix-based RNNs. We find
bounds on memory capacity for such networks under various hypotheses and
compare them with their vector counterparts. In particular, we show that the
memory capacity of such networks is bounded by $N^2$ for an $N\times N$ state
matrix, which generalizes the corresponding bound known for vector networks. We
also show and analyze the increase in memory capacity that arises when such
networks are equipped with an external state memory, as in NTMs. Consequently,
we construct NTMs with RNN controllers and a matrix-based representation of the
external memory, leading us to introduce Matrix NTMs. We demonstrate the
performance of this class of memory networks on algorithmic learning tasks such
as copying and recall and compare it with Matrix RNNs. We find that the
addition of external memory improves the performance of Matrix NTMs in
comparison to Matrix RNNs.
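To make the matrix-representation setting above concrete, here is a minimal sketch of a matrix RNN cell, assuming the bilinear update $H_t = \tanh(U H_{t-1} V + A X_t B + C)$ used in the matrix neural network literature; the class and parameter names below are illustrative and not taken from the paper.
```python
# Minimal sketch of a matrix-representation RNN cell. It assumes the
# bilinear update H_t = tanh(U H_{t-1} V + A X_t B + C) common in the
# matrix neural network literature; names and shapes are illustrative,
# not taken from the paper.
import numpy as np

class MatrixRNNCell:
    def __init__(self, state_dim, in_rows, in_cols, seed=0):
        rng = np.random.default_rng(seed)
        s = 0.1
        self.U = rng.normal(0.0, s, (state_dim, state_dim))  # left state map
        self.V = rng.normal(0.0, s, (state_dim, state_dim))  # right state map
        self.A = rng.normal(0.0, s, (state_dim, in_rows))    # left input map
        self.B = rng.normal(0.0, s, (in_cols, state_dim))    # right input map
        self.C = rng.normal(0.0, s, (state_dim, state_dim))  # bias matrix

    def step(self, H_prev, X_t):
        # The hidden state stays an N x N matrix, so it has N^2 entries,
        # matching the N^2 scaling of the capacity bound discussed above.
        return np.tanh(self.U @ H_prev @ self.V + self.A @ X_t @ self.B + self.C)

# Usage: roll the cell over a short sequence of 3 x 4 input matrices.
cell = MatrixRNNCell(state_dim=8, in_rows=3, in_cols=4)
H = np.zeros((8, 8))
for X in np.random.default_rng(1).normal(size=(5, 3, 4)):
    H = cell.step(H, X)
print(H.shape)  # -> (8, 8)
```
Since the state is an $N\times N$ matrix rather than a length-$N$ vector, it carries $N^2$ entries, which is the quantity the capacity bound above scales with.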
Related papers
- Memory-Efficient Reversible Spiking Neural Networks [8.05761813203348]
Spiking neural networks (SNNs) are potential competitors to artificial neural networks (ANNs).
However, SNNs require much more memory than ANNs, which impedes the training of deeper SNN models.
We propose the reversible spiking neural network to reduce the memory cost of intermediate activations and membrane potentials during training.
arXiv Detail & Related papers (2023-12-13T06:39:49Z) - Heterogenous Memory Augmented Neural Networks [84.29338268789684]
We introduce a novel heterogeneous memory augmentation approach for neural networks.
By introducing learnable memory tokens with an attention mechanism, we can effectively boost performance without significant computational overhead.
We evaluate our approach on various image and graph-based tasks under both in-distribution (ID) and out-of-distribution (OOD) conditions.
arXiv Detail & Related papers (2023-10-17T01:05:28Z) - NeuralMatrix: Compute the Entire Neural Networks with Linear Matrix Operations for Efficient Inference [20.404864470321897]
We introduce NeuralMatrix, which elastically transforms the computations of entire deep neural network (DNN) models into linear matrix operations.
Experiments with both CNN and transformer-based models demonstrate the potential of NeuralMatrix to accurately and efficiently execute a wide range of DNN models.
This level of efficiency is usually only attainable with an accelerator designed for a specific neural network.
arXiv Detail & Related papers (2023-05-23T12:03:51Z) - From Tensor Network Quantum States to Tensorial Recurrent Neural Networks [0.0]
We show that any matrix product state (MPS) can be exactly represented by a recurrent neural network (RNN) with a linear memory update; a worked form of this correspondence is sketched after this list.
We generalize this RNN architecture to 2D lattices using a multilinear memory update.
arXiv Detail & Related papers (2022-06-24T16:25:36Z) - Variable Bitrate Neural Fields [75.24672452527795]
We present a dictionary method for compressing feature grids, reducing their memory consumption by up to 100x.
We formulate the dictionary optimization as a vector-quantized auto-decoder problem which lets us learn end-to-end discrete neural representations in a space where no direct supervision is available.
arXiv Detail & Related papers (2022-06-15T17:58:34Z) - CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems [0.0]
A Convolutional Neural Network (CNN) is a class of Deep Neural Network (DNN) widely used in the analysis of visual images captured by an image sensor.
In this paper, we propose a neoteric variant of deep convolutional neural network architecture to ameliorate the performance of existing CNN architectures for real-time inference on embedded systems.
arXiv Detail & Related papers (2021-12-01T18:20:52Z) - Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z) - Connecting Weighted Automata, Tensor Networks and Recurrent Neural Networks through Spectral Learning [58.14930566993063]
We present connections between three models used in different research fields: weighted finite automata (WFA) from formal languages and linguistics, recurrent neural networks used in machine learning, and tensor networks.
We introduce the first provable learning algorithm for linear 2-RNNs defined over sequences of continuous input vectors.
arXiv Detail & Related papers (2020-10-19T15:28:00Z) - Text Classification based on Multi-granularity Attention Hybrid Neural Network [4.718408602093766]
We propose a hybrid architecture based on a novel hierarchical multi-granularity attention mechanism, named Multi-granularity Attention-based Hybrid Neural Network (MahNN).
The attention mechanism assigns different weights to different parts of the input sequence to increase the computational efficiency and performance of neural models.
arXiv Detail & Related papers (2020-08-12T13:02:48Z) - Tensor train decompositions on recurrent networks [60.334946204107446]
Matrix product state (MPS) tensor trains have more attractive features than matrix product operators (MPOs) in terms of storage reduction and computing time at inference.
We show that MPS tensor trains should be at the forefront of LSTM network compression through a theoretical analysis and practical experiments on NLP tasks.
arXiv Detail & Related papers (2020-06-09T18:25:39Z) - Binarized Graph Neural Network [65.20589262811677]
We develop a binarized graph neural network to learn the binary representations of the nodes with binary network parameters.
Our proposed method can be seamlessly integrated into the existing GNN-based embedding approaches.
Experiments indicate that the proposed binarized graph neural network, namely BGN, is orders of magnitude more efficient in terms of both time and space.
arXiv Detail & Related papers (2020-04-19T09:43:14Z)
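As a worked note on the entry "From Tensor Network Quantum States to Tensorial Recurrent Neural Networks" above: an MPS assigns an amplitude to a sequence via a product of input-dependent matrices, and evaluating that product left to right is precisely a linear (activation-free) RNN memory update. In generic notation (the boundary vectors $v_L, v_R$ and site matrices $A^{(s)}$ are assumed symbols, not the paper's): $\psi(s_1,\dots,s_n) = v_L^{\top} A^{(s_1)} A^{(s_2)} \cdots A^{(s_n)} v_R$, which an RNN computes with the linear recurrence $h_0 = v_L$, $h_t^{\top} = h_{t-1}^{\top} A^{(s_t)}$, followed by the readout $\psi = h_n^{\top} v_R$.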