Neural Network based on Automatic Differentiation Transformation of
Numeric Iterate-to-Fixedpoint
- URL: http://arxiv.org/abs/2111.00326v1
- Date: Sat, 30 Oct 2021 20:34:21 GMT
- Title: Neural Network based on Automatic Differentiation Transformation of
Numeric Iterate-to-Fixedpoint
- Authors: Mansura Habiba, Barak A. Pearlmutter
- Abstract summary: This work proposes a Neural Network model that can control its depth using an iterate-to-fixed-point operator.
In contrast to the existing skip-connection concept, this proposed technique enables information to flow up and down in the network.
We evaluate models that use this novel mechanism on different long-term dependency tasks.
- Score: 1.1897857181479061
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work proposes a Neural Network model that can control its depth using an
iterate-to-fixed-point operator. The architecture starts with a standard
layered Network but with added connections from later layers back to earlier
layers, along with a gate that keeps them inactive under most circumstances. These
``temporal wormhole'' connections create a shortcut that allows the Neural
Network to use the information available at deeper layers and re-do earlier
computations with modulated inputs. End-to-end training is accomplished by
using appropriate calculations for a numeric iterate-to-fixed-point operator.
In a typical case, where the ``wormhole'' connections are inactive, this is
inexpensive; but when they are active, the network takes a longer time to
settle down, and the gradient calculation is also more laborious, with an
effect similar to making the network deeper. In contrast to the existing
skip-connection concept, this proposed technique enables information to flow up
and down in the network. Furthermore, the flow of information follows a pattern
analogous to the afferent and efferent flow of information through
layers of processing in the brain. We evaluate models that use this novel
mechanism on different long-term dependency tasks. The results are competitive
with other studies, showing that the proposed model contributes significantly
to overcoming traditional deep learning models' vanishing gradient problem. At
the same time, the training time is significantly reduced, as the
``easy'' input cases are processed more quickly than ``difficult'' ones.
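A minimal sketch of the mechanism, assuming a JAX implementation (this is not the authors' code): a stack of layers with a sigmoid-gated feedback term from the deepest layer back to every layer's input, iterated to a fixed point, with gradients obtained through the settled state via the implicit function theorem rather than by unrolling the iterations. All names (layer_sweep, solve_fixed_point, gate_logit) and shapes are illustrative assumptions.

```python
# Minimal, illustrative sketch (not the authors' implementation).
from functools import partial

import jax
import jax.numpy as jnp


def fixed_point_iteration(f, z0, tol=1e-4, max_iter=100):
    """Iterate z <- f(z) until the update is smaller than tol (or max_iter is hit)."""
    def cond(state):
        z, z_prev, i = state
        return jnp.logical_and(jnp.linalg.norm(z - z_prev) > tol, i < max_iter)

    def body(state):
        z, _, i = state
        return f(z), z, i + 1

    z_star, _, _ = jax.lax.while_loop(cond, body, (f(z0), z0, 0))
    return z_star


@partial(jax.custom_vjp, nondiff_argnums=(0,))
def solve_fixed_point(f, params, x, z0):
    """Return z* satisfying z* = f(params, x, z*)."""
    return fixed_point_iteration(lambda z: f(params, x, z), z0)


def _solve_fwd(f, params, x, z0):
    z_star = fixed_point_iteration(lambda z: f(params, x, z), z0)
    return z_star, (params, x, z_star)


def _solve_bwd(f, residuals, g):
    params, x, z_star = residuals
    # Implicit function theorem: solve u = g + u * (df/dz) at z*, itself a linear
    # fixed point, then pull u back through the params and input arguments of f.
    _, vjp_z = jax.vjp(lambda z: f(params, x, z), z_star)
    u = fixed_point_iteration(lambda v: g + vjp_z(v)[0], g)
    _, vjp_px = jax.vjp(lambda p, xi: f(p, xi, z_star), params, x)
    grad_params, grad_x = vjp_px(u)
    return grad_params, grad_x, jnp.zeros_like(z_star)  # fixed point does not depend on z0


solve_fixed_point.defvjp(_solve_fwd, _solve_bwd)


def layer_sweep(params, x, h):
    """One pass through the layers. The sigmoid-gated feedback term (the
    'temporal wormhole') lets the deepest layer's activations modulate every
    layer's input; with the gate near zero this is an ordinary feed-forward pass."""
    inp, new_h = x, []
    for W, b, U, gate_logit in params:
        gate = jax.nn.sigmoid(gate_logit)
        inp = jnp.tanh(inp @ W + b + gate * (h[-1] @ U))
        new_h.append(inp)
    return jnp.stack(new_h)


# Usage sketch: with params a list of (W, b, U, gate_logit) tuples and x a vector,
#   h_star = solve_fixed_point(layer_sweep, params, x, jnp.zeros((n_layers, d)))
# and jax.grad of any loss built on h_star differentiates through the fixed point.
```

When the gate sits near zero, the feedback term vanishes, the iteration settles after essentially one sweep, and both passes are cheap; an active gate forces more sweeps and a longer adjoint solve, mirroring the ``deeper network for difficult inputs'' behaviour described above.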
Related papers
- Stitching for Neuroevolution: Recombining Deep Neural Networks without Breaking Them [0.0]
Traditional approaches to neuroevolution often start from scratch.
Recombining trained networks is non-trivial because architectures and feature representations typically differ.
We employ stitching, which merges the networks by introducing new layers at crossover points.
arXiv Detail & Related papers (2024-03-21T08:30:44Z)
- Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs [75.40636935415601]
Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.
We take an incremental computing approach, looking to reuse calculations as the inputs change.
We apply this approach to the transformers architecture, creating an efficient incremental inference algorithm with complexity proportional to the fraction of modified inputs.
arXiv Detail & Related papers (2023-07-27T16:30:27Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Globally Optimal Training of Neural Networks with Threshold Activation Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z)
- Rewiring Networks for Graph Neural Network Training Using Discrete Geometry [0.0]
Information over-squashing is a problem that significantly impacts the training of graph neural networks (GNNs).
In this paper, we investigate the use of discrete analogues of classical geometric notions of curvature to model information flow on networks and rewire them.
We show that these classical notions achieve state-of-the-art performance in GNN training accuracy on a variety of real-world network datasets.
arXiv Detail & Related papers (2022-07-16T21:50:39Z)
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- Implicit recurrent networks: A novel approach to stationary input processing with recurrent neural networks in deep learning [0.0]
In this work, we introduce and test a novel implementation of recurrent neural networks into deep learning.
We provide an algorithm which implements the backpropagation algorithm on an implicit implementation of recurrent networks.
A single-layer implicit recurrent network is able to solve the XOR problem, while a feed-forward network with monotonically increasing activation function fails at this task.
arXiv Detail & Related papers (2020-10-20T18:55:32Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Prior knowledge distillation based on financial time series [0.8756822885568589]
We propose to use neural networks to represent indicators and train a large network constructed of smaller networks as feature layers.
In numerical experiments, we find that our algorithm is faster and more accurate than traditional methods on real financial datasets.
arXiv Detail & Related papers (2020-06-16T15:26:06Z)
- Depth Enables Long-Term Memory for Recurrent Neural Networks [0.0]
We introduce a measure of the network's ability to support information flow across time, referred to as the Start-End separation rank.
We prove that deep recurrent networks support Start-End separation ranks which are higher than those supported by their shallow counterparts.
arXiv Detail & Related papers (2020-03-23T10:29:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.