Memory Capacity of Nonlinear Recurrent Networks: Is it Informative?
- URL: http://arxiv.org/abs/2502.04832v1
- Date: Fri, 07 Feb 2025 11:06:30 GMT
- Title: Memory Capacity of Nonlinear Recurrent Networks: Is it Informative?
- Authors: Giovanni Ballarin, Lyudmila Grigoryeva, Juan-Pablo Ortega
- Abstract summary: The total memory capacity (MC) of linear recurrent neural networks (RNNs) has been proven to be equal to the rank of the corresponding Kalman controllability matrix.
This fact calls into question the usefulness of this metric for distinguishing the performance of linear RNNs in the processing of signals.
- Score: 5.03863830033243
- License:
- Abstract: The total memory capacity (MC) of linear recurrent neural networks (RNNs) has been proven to be equal to the rank of the corresponding Kalman controllability matrix, and it is almost surely maximal for connectivity and input weight matrices drawn from regular distributions. This fact calls into question the usefulness of this metric for distinguishing the performance of linear RNNs in the processing of stochastic signals. This note shows that the MC of random nonlinear RNNs yields arbitrary values within established upper and lower bounds depending just on the input process scale. This confirms that the existing definition of MC in linear and nonlinear cases has no practical value.
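The claim above is easy to probe numerically. Below is a minimal sketch (not the authors' code; the network size, the tanh nonlinearity, and the uniform input distribution are illustrative assumptions) that estimates the total MC of an echo state network by summing the R^2 of linear reconstructions of lagged inputs, compares the linear case against the rank of the Kalman controllability matrix, and checks how the nonlinear estimate moves with the input scale.

```python
# Minimal sketch: MC of x_t = phi(A x_{t-1} + c z_t), z_t ~ iid Uniform(-s, s),
# estimated by summing R^2 of linear reconstructions of lagged inputs z_{t-k}.
import numpy as np

rng = np.random.default_rng(0)
N, T, burn, max_lag = 20, 20000, 200, 40

A = rng.normal(size=(N, N))
A *= 0.9 / max(abs(np.linalg.eigvals(A)))      # rescale to spectral radius 0.9
c = rng.normal(size=N)                         # input weights

def run_states(z, phi):
    """Drive x_t = phi(A x_{t-1} + c z_t) with the scalar input sequence z."""
    x, X = np.zeros(N), np.empty((len(z), N))
    for t, zt in enumerate(z):
        x = phi(A @ x + c * zt)
        X[t] = x
    return X

def memory_capacity(phi, scale):
    """Sum of R^2 when reconstructing z_{t-k} linearly from x_t, k = 1..max_lag."""
    z = rng.uniform(-scale, scale, size=T + burn)
    X = run_states(z, phi)[burn:]
    z = z[burn:]
    mc = 0.0
    for k in range(1, max_lag + 1):
        Xk, zk = X[k:], z[:-k]
        w, *_ = np.linalg.lstsq(Xk, zk, rcond=None)
        mc += np.corrcoef(Xk @ w, zk)[0, 1] ** 2
    return mc

# Kalman controllability matrix [c, Ac, ..., A^{N-1} c]: its rank gives the linear MC.
ctrl = np.column_stack([np.linalg.matrix_power(A, k) @ c for k in range(N)])
print("rank of controllability matrix:", np.linalg.matrix_rank(ctrl))
print("linear MC estimate            :", round(memory_capacity(lambda v: v, 1.0), 2))
print("tanh MC, input scale 1e-3     :", round(memory_capacity(np.tanh, 1e-3), 2))
print("tanh MC, input scale 10       :", round(memory_capacity(np.tanh, 10.0), 2))
```

With very small inputs the tanh reservoir operates in its near-linear regime, so the estimate stays close to the linear value, while large inputs drive it into saturation and the estimate falls, which is the kind of scale dependence the note describes.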
Related papers
- Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations [54.17275171325324]
We present a counterexample to the Linear Representation Hypothesis (LRH).
When trained to repeat an input token sequence, neural networks learn to represent the token at each position with a particular order of magnitude, rather than a direction.
These findings strongly indicate that interpretability research should not be confined to the LRH.
arXiv Detail & Related papers (2024-08-20T15:04:37Z)
- Metric-Entropy Limits on Nonlinear Dynamical System Learning [4.069144210024563]
We show that recurrent neural networks (RNNs) are capable of learning nonlinear systems that satisfy a Lipschitz property and forget past inputs fast enough in a metric-entropy optimal manner.
As the sets of sequence-to-sequence maps we consider are significantly more massive than function classes generally considered in deep neural network approximation theory, a refined metric-entropy characterization is needed.
arXiv Detail & Related papers (2024-07-01T12:57:03Z)
- Error Correction Capabilities of Non-Linear Cryptographic Hash Functions [56.368766255147555]
Linear hashes are known to possess error-correcting capabilities.
In most applications, non-linear hashes with pseudorandom outputs are utilized instead.
We show that non-linear hashes might also exhibit good error-correcting capabilities.
arXiv Detail & Related papers (2024-05-02T17:26:56Z)
- Matrix Completion via Nonsmooth Regularization of Fully Connected Neural Networks [7.349727826230864]
It has been shown that enhanced performance could be attained by using nonlinear estimators such as deep neural networks.
In this paper, we control over-fitting by regularizing the fully connected neural network (FCNN) model in terms of the norm of its intermediate representations.
Our simulations indicate the superiority of the proposed algorithm in comparison with existing linear and nonlinear algorithms.
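As a rough illustration of that regularization idea, here is a minimal sketch (not the paper's algorithm; the one-hot input encoding, the single hidden block, the l1 penalty, and the weight `lam` are assumptions made for the example) that completes a synthetic low-rank matrix with a fully connected network while penalizing the norm of an intermediate representation.

```python
# Minimal sketch: matrix completion with an FCNN plus a norm penalty on an
# intermediate representation to limit over-fitting to the observed entries.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, r = 30, 3
M = torch.randn(n, r) @ torch.randn(r, n)            # ground-truth low-rank matrix
mask = torch.rand(n, n) < 0.3                        # observed entries

rows, cols = mask.nonzero(as_tuple=True)
inputs = torch.cat([nn.functional.one_hot(rows, n),
                    nn.functional.one_hot(cols, n)], dim=1).float()
targets = M[rows, cols]

hidden = nn.Sequential(nn.Linear(2 * n, 64), nn.ReLU())   # intermediate block
head = nn.Linear(64, 1)
opt = torch.optim.Adam(list(hidden.parameters()) + list(head.parameters()), lr=1e-2)
lam = 1e-3                                                 # hypothetical penalty weight

for step in range(2000):
    h = hidden(inputs)                                     # intermediate representation
    pred = head(h).squeeze(-1)
    loss = ((pred - targets) ** 2).mean() + lam * h.norm(p=1) / len(targets)
    opt.zero_grad(); loss.backward(); opt.step()

# Evaluate reconstruction error on the unobserved entries.
with torch.no_grad():
    all_r, all_c = torch.meshgrid(torch.arange(n), torch.arange(n), indexing="ij")
    flat = torch.cat([nn.functional.one_hot(all_r.reshape(-1), n),
                      nn.functional.one_hot(all_c.reshape(-1), n)], dim=1).float()
    M_hat = head(hidden(flat)).reshape(n, n)
    print("RMSE on missing entries:",
          ((M_hat - M)[~mask] ** 2).mean().sqrt().item())
```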
arXiv Detail & Related papers (2024-03-15T12:00:37Z)
- Inverse Approximation Theory for Nonlinear Recurrent Neural Networks [28.840757822712195]
We prove an inverse approximation theorem for the approximation of nonlinear sequence-to-sequence relationships using recurrent neural networks (RNNs).
We show that nonlinear sequence relationships that can be stably approximated by nonlinear RNNs must have an exponentially decaying memory structure.
This extends the previously identified curse of memory in linear RNNs into the general nonlinear setting.
arXiv Detail & Related papers (2023-05-30T16:34:28Z)
- Memory of recurrent networks: Do we compute it right? [5.03863830033243]
We study the case of linear echo state networks, for which the total memory capacity has been proven to be equal to the rank of the corresponding Kalman controllability matrix.
We show that the discrepancies between numerically computed and theoretical memory capacities, often overlooked in the recent literature, are of an exclusively numerical nature.
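As a small numerical illustration of why such discrepancies arise (the low-rank linear echo state network and the naive summed-R^2 estimator below are assumptions made for the example, not the paper's experiments): the theoretical total MC equals the rank of the Kalman controllability matrix, while a finite-sample estimate that sums regression R^2 over many lags picks up a small positive bias at every lag and drifts above the theoretical value.

```python
# Minimal sketch: theoretical MC (controllability rank) vs a naive
# finite-sample estimate for a linear echo state network.
import numpy as np

rng = np.random.default_rng(1)
N, T, max_lag = 10, 1000, 400

A = np.diag(0.8 * rng.uniform(0.5, 1.0, size=N))   # stable diagonal reservoir
c = np.zeros(N); c[:4] = 1.0                       # input reaches only 4 modes

# Theoretical total MC = rank of the Kalman controllability matrix (= 4 here).
ctrl = np.column_stack([np.linalg.matrix_power(A, k) @ c for k in range(N)])
print("theoretical MC (controllability rank):", np.linalg.matrix_rank(ctrl))

# Simulate the linear reservoir and sum regression R^2 over many lags.
z = rng.standard_normal(T)
X, x = np.zeros((T, N)), np.zeros(N)
for t in range(T):
    x = A @ x + c * z[t]
    X[t] = x

mc = 0.0
for k in range(1, max_lag + 1):
    Xk, zk = X[k:], z[:-k]
    w, *_ = np.linalg.lstsq(Xk, zk, rcond=None)
    mc += np.corrcoef(Xk @ w, zk)[0, 1] ** 2       # each term carries a small positive bias
print("naive summed-R^2 estimate:", round(mc, 2))
```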
arXiv Detail & Related papers (2023-05-02T14:37:52Z)
- Lipschitz Continuity Retained Binary Neural Network [52.17734681659175]
We introduce Lipschitz continuity as a rigorous criterion for defining the robustness of binary neural networks (BNNs).
We then propose to retain Lipschitz continuity via a regularization term to improve model robustness.
Our experiments show that our BNN-specific regularization method can effectively strengthen the robustness of the BNN.
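The regularization idea can be sketched generically as penalizing an upper bound on the network's Lipschitz constant; the sketch below uses the product of per-layer spectral norms on an ordinary (non-binary) network, so it illustrates the principle rather than the paper's BNN-specific method, and the penalty weight `lam` is an assumption.

```python
# Minimal sketch: add a Lipschitz-bound penalty to the training loss.
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))

def lipschitz_bound(model):
    """Product of spectral norms of the linear layers; with 1-Lipschitz
    activations this upper-bounds the network's Lipschitz constant."""
    bound = torch.ones(())
    for m in model:
        if isinstance(m, nn.Linear):
            bound = bound * torch.linalg.matrix_norm(m.weight, ord=2)
    return bound

x = torch.randn(128, 16)
y = torch.randint(0, 2, (128,))
opt = torch.optim.SGD(net.parameters(), lr=1e-2)
lam = 1e-2                                   # hypothetical penalty weight

for step in range(200):
    loss = nn.functional.cross_entropy(net(x), y) + lam * lipschitz_bound(net)
    opt.zero_grad(); loss.backward(); opt.step()

print("Lipschitz upper bound after training:", lipschitz_bound(net).item())
```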
arXiv Detail & Related papers (2022-07-13T22:55:04Z)
- Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness [172.61581010141978]
Certifiable robustness is a desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios.
We propose a novel solution that strategically manipulates neurons by "grafting" appropriate levels of linearity.
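The grafting operation itself is easy to picture: for a chosen subset of neurons the nonlinear activation is replaced by the identity, so those neurons become linear. The sketch below only illustrates this mechanism with a hand-picked mask; how to choose which neurons to graft (the core of the paper) is not reproduced.

```python
# Minimal sketch: a ReLU layer in which selected neurons are "grafted" to be linear.
import torch
import torch.nn as nn

class GraftedReLU(nn.Module):
    """ReLU in which neurons flagged by `linear_mask` are left linear (identity)."""
    def __init__(self, linear_mask: torch.Tensor):
        super().__init__()
        self.register_buffer("linear_mask", linear_mask.bool())

    def forward(self, x):
        return torch.where(self.linear_mask, x, torch.relu(x))

torch.manual_seed(0)
mask = torch.zeros(8); mask[:3] = 1           # graft linearity onto 3 of 8 neurons
net = nn.Sequential(nn.Linear(4, 8), GraftedReLU(mask), nn.Linear(8, 2))
print(net(torch.randn(5, 4)).shape)           # torch.Size([5, 2])
```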
arXiv Detail & Related papers (2022-06-15T22:42:29Z)
- Implicit Bias of Linear RNNs [27.41989861342218]
Linear recurrent neural networks (RNNs) do not perform well on tasks requiring long-term memory.
This paper provides a rigorous explanation of this property in the special case of linear RNNs.
Using a recently developed kernel regime analysis, our main result shows that linear RNNs are functionally equivalent to a certain weighted 1D-convolutional network.
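The functional equivalence has a concrete, easily checked form: unrolling the linear recurrence h_t = A h_{t-1} + b x_t with readout y_t = c^T h_t gives y_t = sum_j (c^T A^j b) x_{t-j}, i.e. a causal 1D convolution with kernel k_j = c^T A^j b. The sketch below verifies this numerically (the random stable A, b, c are illustrative; the kernel-regime weighting studied in the paper is not reproduced).

```python
# Minimal sketch: a linear RNN equals a causal 1D convolution with kernel k_j = c^T A^j b.
import numpy as np

rng = np.random.default_rng(0)
n, T = 6, 50
A = 0.7 * rng.normal(size=(n, n)) / np.sqrt(n)   # stable recurrent matrix
b, c = rng.normal(size=n), rng.normal(size=n)
x = rng.normal(size=T)

# Run the recurrence h_t = A h_{t-1} + b x_t, y_t = c^T h_t.
h, y_rnn = np.zeros(n), np.zeros(T)
for t in range(T):
    h = A @ h + b * x[t]
    y_rnn[t] = c @ h

# Equivalent convolution with kernel k_j = c^T A^j b.
kernel = np.array([c @ np.linalg.matrix_power(A, j) @ b for j in range(T)])
y_conv = np.array([kernel[:t + 1] @ x[t::-1] for t in range(T)])

print("max |RNN - conv| :", np.max(np.abs(y_rnn - y_conv)))   # ~1e-15
```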
arXiv Detail & Related papers (2021-01-19T19:39:28Z)
- Nonlinear State-Space Generalizations of Graph Convolutional Neural Networks [172.18295279061607]
Graph convolutional neural networks (GCNNs) learn compositional representations from network data by nesting linear graph convolutions into nonlinearities.
In this work, we approach GCNNs from a state-space perspective revealing that the graph convolutional module is a minimalistic linear state-space model.
We show that this state update may be problematic because it is nonparametric, and depending on the graph spectrum it may explode or vanish.
We propose a novel family of nodal aggregation rules that aggregate node features within a layer in a nonlinear state-space parametric fashion allowing for a better trade-off.
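To make the state-space reading concrete: a graph convolution y = sum_k h_k S^k x can be computed by repeatedly applying the linear state update state_{k+1} = S state_k, and whether the shifted signals S^k x blow up or die out is set by the spectrum of the shift operator S. The sketch below (random graph and random filter taps, both illustrative) contrasts an unnormalized and a degree-normalized shift; the paper's parametric nodal aggregation rules are not reproduced.

```python
# Minimal sketch: graph convolution as a linear state-space recursion, and its
# dependence on the spectrum of the graph shift operator.
import numpy as np

rng = np.random.default_rng(0)
n = 20
A = np.triu((rng.random((n, n)) < 0.2).astype(float), 1)
A = A + A.T                                                   # random undirected adjacency
deg = A.sum(1)
S_raw = A                                                     # unnormalized shift
S_norm = A / np.sqrt(np.outer(deg, deg).clip(min=1.0))        # degree-normalized shift

x = rng.normal(size=n)                        # graph signal (one feature per node)
h = rng.normal(size=10)                       # filter taps h_0 .. h_9

def graph_conv(S, x, h):
    """y = sum_k h_k S^k x, computed via the state update state_{k+1} = S state_k."""
    y, state = np.zeros(n), x.copy()
    for hk in h:
        y += hk * state
        state = S @ state
    return y

for name, S in [("unnormalized shift", S_raw), ("normalized shift", S_norm)]:
    radius = max(abs(np.linalg.eigvals(S)))
    print(f"{name}: spectral radius {radius:5.2f}, "
          f"||filter output|| = {np.linalg.norm(graph_conv(S, x, h)):.2e}")
```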
arXiv Detail & Related papers (2020-10-27T19:48:56Z)
- Matrix Smoothing: A Regularization for DNN with Transition Matrix under Noisy Labels [54.585681272543056]
Training deep neural networks (DNNs) in the presence of noisy labels is an important and challenging task.
Recent probabilistic methods directly apply the transition matrix to the DNN, neglecting the DNN's susceptibility to overfitting.
We propose a novel method in which a smoothed transition matrix is used for updating the DNN, in order to restrict overfitting.
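As a rough sketch of the mechanism (the convex mixture with the uniform matrix used here is an illustrative stand-in, not necessarily the paper's smoothing rule, and `beta`, the toy data, and the linear model are assumptions): forward loss correction trains the network through the transition matrix, and a smoothed matrix replaces the raw estimate during the update.

```python
# Minimal sketch: forward loss correction with a smoothed label-noise transition matrix.
import torch
import torch.nn as nn

torch.manual_seed(0)
C = 4                                                  # number of classes
T = 0.7 * torch.eye(C) + 0.1 * (1 - torch.eye(C))      # row-stochastic noise matrix
beta = 0.3                                             # hypothetical smoothing strength
T_smooth = (1 - beta) * T + beta * torch.full((C, C), 1.0 / C)

net = nn.Linear(8, C)                                  # stand-in for the DNN
x = torch.randn(64, 8)
y_noisy = torch.randint(0, C, (64,))

def forward_corrected_loss(logits, y, T):
    """Forward correction: model P(noisy label | x) as T^T applied to the clean posterior."""
    p_clean = torch.softmax(logits, dim=1)
    p_noisy = p_clean @ T                              # (p_noisy)_j = sum_i p_i T_ij
    return nn.functional.nll_loss(torch.log(p_noisy + 1e-12), y)

opt = torch.optim.SGD(net.parameters(), lr=0.1)
for step in range(100):
    loss = forward_corrected_loss(net(x), y_noisy, T_smooth)
    opt.zero_grad(); loss.backward(); opt.step()
print("final corrected loss:", loss.item())
```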
arXiv Detail & Related papers (2020-03-26T13:49:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.