Implicit recurrent networks: A novel approach to stationary input
processing with recurrent neural networks in deep learning
- URL: http://arxiv.org/abs/2010.10564v1
- Date: Tue, 20 Oct 2020 18:55:32 GMT
- Authors: Sebastian Sanokowski
- Abstract summary: In this work, we introduce and test a novel implementation of recurrent neural networks into deep learning.
We provide an algorithm which implements the backpropagation algorithm on an implicit implementation of recurrent networks.
A single-layer implicit recurrent network is able to solve the XOR problem, while a feed-forward network with monotonically increasing activation function fails at this task.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The brain cortex, which processes visual, auditory and sensory data in the
brain, is known to have many recurrent connections within its layers and from
higher to lower layers. But, in the case of machine learning with neural
networks, it is generally assumed that strict feed-forward architectures are
suitable for static input data, such as images, whereas recurrent networks are
required mainly for the processing of sequential input, such as language.
However, it is not clear whether the processing of static input data also benefits
from recurrent connectivity. In this work, we introduce and test a novel
implementation of recurrent neural networks with lateral and feed-back
connections into deep learning. This departure from the strict feed-forward
structure prevents the use of the standard error backpropagation algorithm for
training the networks. Therefore, we provide an algorithm which implements the
backpropagation algorithm on an implicit implementation of recurrent networks,
which is different from state-of-the-art implementations of recurrent neural
networks. Our method, in contrast to current recurrent neural networks,
eliminates the use of long chains of derivatives due to many iterative update
steps, which makes learning computationally less costly. It turns out that the
presence of recurrent intra-layer connections within a one-layer implicit
recurrent network enhances the performance of neural networks considerably: A
single-layer implicit recurrent network is able to solve the XOR problem, while
a feed-forward network with monotonically increasing activation function fails
at this task. Finally, we demonstrate that a two-layer implicit recurrent
architecture leads to better performance in a regression task that estimates
physical parameters from the measured trajectory of a damped pendulum.
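The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to backpropagate through an implicitly defined recurrent layer without unrolling the update steps. It assumes the layer state is the fixed point h = tanh(W h + U x + b), a squared-error loss, and a weight matrix W small enough for plain fixed-point iteration to converge. The names `solve_fixed_point` and `loss_and_grads`, the tanh activation, and the loss are illustrative assumptions, not taken from the paper. The gradient comes from a single linear solve (implicit differentiation) rather than from a chain of derivatives over many iterations.

```python
import numpy as np


def solve_fixed_point(W, U, b, x, tol=1e-12, max_iter=1000):
    """Iterate h <- tanh(W h + U x + b) until the state stops changing.

    Assumption: W is small enough that this iteration is a contraction.
    """
    h = np.zeros(W.shape[0])
    for _ in range(max_iter):
        h_next = np.tanh(W @ h + U @ x + b)
        if np.max(np.abs(h_next - h)) < tol:
            return h_next
        h = h_next
    return h


def loss_and_grads(W, U, b, x, target):
    """Squared-error loss at the fixed point and its gradients.

    Differentiating h = tanh(W h + U x + b) implicitly gives
    (I - D W) dh = D (dW h + dU x + db), with D = diag(1 - h**2).
    One adjoint solve therefore replaces backpropagation through
    every iteration of the forward solver.
    """
    h = solve_fixed_point(W, U, b, x)
    loss = 0.5 * np.sum((h - target) ** 2)

    d = 1.0 - h ** 2                       # tanh' evaluated at the fixed point
    J = np.eye(len(h)) - d[:, None] * W    # I - D W
    a = np.linalg.solve(J.T, h - target)   # adjoint variable
    delta = d * a                          # D a

    grad_W = np.outer(delta, h)
    grad_U = np.outer(delta, x)
    grad_b = delta
    return loss, grad_W, grad_U, grad_b


# Hypothetical toy problem: 3 recurrent units, 2 static inputs.
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((3, 3))      # lateral (intra-layer) weights
U = rng.standard_normal((3, 2))            # input weights
b = rng.standard_normal(3)
x = np.array([1.0, 0.0])
target = np.array([0.5, -0.5, 0.0])

loss, grad_W, grad_U, grad_b = loss_and_grads(W, U, b, x, target)
```

The cost of the backward pass in this sketch is one linear solve of the size of the layer, independent of how many forward iterations were needed to reach the fixed point.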
Related papers
- Gradient-free training of recurrent neural networks [3.272216546040443]
We introduce a computational approach to construct all weights and biases of a recurrent neural network without using gradient-based methods.
The approach is based on a combination of random feature networks and Koopman operator theory for dynamical systems.
In computational experiments on time series, forecasting for chaotic dynamical systems, and control problems, we observe that the training time and forecasting accuracy of the recurrent neural networks we construct are improved.
arXiv Detail & Related papers (2024-10-30T21:24:34Z)
- Opening the Black Box: predicting the trainability of deep neural networks with reconstruction entropy [0.0]
We present a method for predicting the trainable regime in parameter space for deep feedforward neural networks.
For both the MNIST and CIFAR10 datasets, we show that a single epoch of training is sufficient to predict the trainability of the deep feedforward network.
arXiv Detail & Related papers (2024-06-13T18:00:05Z)
- Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Global quantitative robustness of regression feed-forward neural networks [0.0]
We adapt the notion of the regression breakdown point to regression neural networks.
We compare the performance, measured by the out-of-sample loss, by a proxy of the breakdown rate.
The results indeed motivate the use of robust loss functions for neural network training.
arXiv Detail & Related papers (2022-11-18T09:57:53Z)
- Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z)
- Improving the Trainability of Deep Neural Networks through Layerwise Batch-Entropy Regularization [1.3999481573773072]
We introduce and evaluate the batch-entropy which quantifies the flow of information through each layer of a neural network.
We show that we can train a "vanilla" fully connected network and convolutional neural network with 500 layers by simply adding the batch-entropy regularization term to the loss function.
arXiv Detail & Related papers (2022-08-01T20:31:58Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Verifying Recurrent Neural Networks using Invariant Inference [0.0]
We propose a novel approach for verifying properties of a widespread variant of neural networks, called recurrent neural networks.
Our approach is based on the inference of invariants, which allow us to reduce the complex problem of verifying recurrent networks into simpler, non-recurrent problems.
arXiv Detail & Related papers (2020-04-06T08:08:24Z)
- ResiliNet: Failure-Resilient Inference in Distributed Neural Networks [56.255913459850674]
We introduce ResiliNet, a scheme for making inference in distributed neural networks resilient to physical node failures.
Failout simulates physical node failure conditions during training using dropout, and is specifically designed to improve the resiliency of distributed neural networks.
arXiv Detail & Related papers (2020-02-18T05:58:24Z)