Learning in Feedforward Neural Networks Accelerated by Transfer Entropy
- URL: http://arxiv.org/abs/2104.14616v1
- Date: Thu, 29 Apr 2021 19:07:07 GMT
- Title: Learning in Feedforward Neural Networks Accelerated by Transfer Entropy
- Authors: Adrian Moldovan and Angel Cațaron and Răzvan Andonie
- Abstract summary: The transfer entropy (TE) was initially introduced as an information transfer measure used to quantify the statistical coherence between events (time series).
Our contribution is an information-theoretical method for analyzing information transfer between the nodes of feedforward neural networks.
We introduce a backpropagation type training algorithm that uses TE feedback connections to improve its performance.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current neural network architectures are increasingly hard to train because
of the growing size and complexity of the datasets they use. Our objective is to
design more efficient training algorithms that exploit causal relationships
inferred from neural networks. Transfer entropy (TE) was initially
introduced as an information transfer measure used to quantify the statistical
coherence between events (time series). It was later related to causality,
even though the two are not the same. Only a few papers report applications
of causality or TE in neural networks. Our contribution is an
information-theoretical method for analyzing information transfer between the
nodes of feedforward neural networks. The information transfer is measured by
the TE of feedback neural connections. Intuitively, TE measures the relevance
of a connection in the network, and the feedback amplifies this connection. We
introduce a backpropagation-type training algorithm that uses TE feedback
connections to improve its performance.
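To make this concrete, here is a minimal sketch (our own illustration, not the authors' released code) that estimates TE between the discretized output time series of a source neuron and a target neuron, TE(S -> T) = sum p(t_{k+1}, t_k, s_k) * log2[ p(t_{k+1} | t_k, s_k) / p(t_{k+1} | t_k) ], and then shows one hypothetical way such a value could act as a feedback signal by amplifying the gradient of the corresponding connection. The equal-width binning, the `alpha` scaling factor, and all function names are our assumptions.

```python
import numpy as np

def transfer_entropy(source, target, bins=2):
    """Estimate TE(source -> target) from two activation time series.

    TE = sum over (t1, t0, s0) of p(t1, t0, s0) * log2[ p(t1 | t0, s0) / p(t1 | t0) ],
    where t1 is the target at step k+1, t0 the target at step k, s0 the source at step k.
    """
    # Discretize continuous activations into `bins` states (equal-width binning).
    s = np.digitize(source, np.histogram_bin_edges(source, bins)[1:-1])
    t = np.digitize(target, np.histogram_bin_edges(target, bins)[1:-1])

    t1, t0, s0 = t[1:], t[:-1], s[:-1]
    te = 0.0
    for a in np.unique(t1):
        for b in np.unique(t0):
            for c in np.unique(s0):
                p_abc = np.mean((t1 == a) & (t0 == b) & (s0 == c))
                if p_abc == 0.0:
                    continue
                p_bc = np.mean((t0 == b) & (s0 == c))   # p(t0, s0)
                p_ab = np.mean((t1 == a) & (t0 == b))   # p(t1, t0)
                p_b = np.mean(t0 == b)                  # p(t0)
                # p(t1 | t0, s0) = p_abc / p_bc ;  p(t1 | t0) = p_ab / p_b
                te += p_abc * np.log2((p_abc / p_bc) / (p_ab / p_b))
    return te

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 5000
    src = rng.standard_normal(n)    # "source" neuron outputs over training steps
    tgt = np.zeros(n)
    tgt[1:] = 0.8 * src[:-1] + 0.2 * rng.standard_normal(n - 1)  # target lags the source by one step
    te = transfer_entropy(src, tgt)
    # Hypothetical feedback (illustrative only, not the paper's exact rule):
    # amplify the gradient of the source -> target connection in proportion to TE.
    alpha = 0.1
    grad_scale = 1.0 + alpha * te
    print(f"TE(source -> target) = {te:.4f}, gradient scale = {grad_scale:.4f}")
```

Any TE estimator could stand in for the simple binning used here; the point is only that a per-connection TE value can modulate the corresponding weight update.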
Related papers
- Opening the Black Box: predicting the trainability of deep neural networks with reconstruction entropy [0.0]
We present a method for predicting the trainable regime in parameter space for deep feedforward neural networks.
For both the MNIST and CIFAR10 datasets, we show that a single epoch of training is sufficient to predict the trainability of the deep feedforward network.
arXiv Detail & Related papers (2024-06-13T18:00:05Z)
- Learning in Convolutional Neural Networks Accelerated by Transfer Entropy [0.0]
In a feedforward network, the Transfer Entropy (TE) can be used to quantify the relationships between neuron output pairs located in different layers.
We introduce a novel training mechanism for CNN architectures which integrates the TE feedback connections.
arXiv Detail & Related papers (2024-04-03T13:31:49Z)
- Assessing Neural Network Representations During Training Using Noise-Resilient Diffusion Spectral Entropy [55.014926694758195]
Entropy and mutual information in neural networks provide rich information on the learning process.
We leverage data geometry to access the underlying manifold and reliably compute these information-theoretic measures.
We show that they form noise-resistant measures of intrinsic dimensionality and relationship strength in high-dimensional simulated data.
arXiv Detail & Related papers (2023-12-04T01:32:42Z)
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels [6.144858413112823]
In deep learning, neural networks serve as noisy channels between input data and its representation.
We study a frequently overlooked possibility that neural networks can be intrinsic toward optimal channels.
arXiv Detail & Related papers (2022-12-04T05:13:01Z)
- Neural networks trained with SGD learn distributions of increasing complexity [78.30235086565388]
We show that neural networks trained using gradient descent initially classify their inputs using lower-order input statistics.
Only later during training do they exploit higher-order statistics.
We discuss the relation of DSB to other simplicity biases and consider its implications for the principle of universality in learning.
arXiv Detail & Related papers (2022-11-21T15:27:22Z)
- Rewiring Networks for Graph Neural Network Training Using Discrete Geometry [0.0]
Information over-squashing is a problem that significantly impacts the training of graph neural networks (GNNs).
In this paper, we investigate the use of discrete analogues of classical geometric notions of curvature to model information flow on networks and rewire them.
We show that these classical notions achieve state-of-the-art performance in GNN training accuracy on a variety of real-world network datasets.
arXiv Detail & Related papers (2022-07-16T21:50:39Z)
- Decomposing neural networks as mappings of correlation functions [57.52754806616669]
We study the mapping between probability distributions implemented by a deep feed-forward network.
We identify essential statistics in the data, as well as different information representations that can be used by neural networks.
arXiv Detail & Related papers (2022-02-10T09:30:31Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Implicit recurrent networks: A novel approach to stationary input processing with recurrent neural networks in deep learning [0.0]
In this work, we introduce and test a novel implementation of recurrent neural networks into deep learning.
We provide an algorithm which implements the backpropagation algorithm on an implicit implementation of recurrent networks.
A single-layer implicit recurrent network is able to solve the XOR problem, while a feed-forward network with a monotonically increasing activation function fails at this task.
arXiv Detail & Related papers (2020-10-20T18:55:32Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)