Decomposing neural networks as mappings of correlation functions
- URL: http://arxiv.org/abs/2202.04925v1
- Date: Thu, 10 Feb 2022 09:30:31 GMT
- Title: Decomposing neural networks as mappings of correlation functions
- Authors: Kirsten Fischer, Alexandre René, Christian Keup, Moritz Layer, David Dahmen, Moritz Helias
- Abstract summary: We study the mapping between probability distributions implemented by a deep feed-forward network.
We identify essential statistics in the data, as well as different information representations that can be used by neural networks.
- Score: 57.52754806616669
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding the functional principles of information processing in deep
neural networks continues to be a challenge, in particular for networks with
trained and thus non-random weights. To address this issue, we study the
mapping between probability distributions implemented by a deep feed-forward
network. We characterize this mapping as an iterated transformation of
distributions, where the non-linearity in each layer transfers information
between different orders of correlation functions. This allows us to identify
essential statistics in the data, as well as different information
representations that can be used by neural networks. Applied to an XOR task and
to MNIST, we show that correlations up to second order predominantly capture
the information processing in the internal layers, while the input layer also
extracts higher-order correlations from the data. This analysis provides a
quantitative and explainable perspective on classification.
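The XOR result in the abstract can be made concrete with a small numerical experiment. For XOR the two class means of the input coincide, so first-order input statistics carry no label information; the label sits entirely in the second-order input correlation x1*x2, and a trained hidden layer converts that correlation into a first-order (mean) difference that a linear readout can use. The following is a minimal numpy sketch of this probe, not the authors' code or their correlation-function formalism; the architecture, learning rate, and reported statistics are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR inputs in {-1, +1}^2. Both class means sit at the origin, so the
# label is invisible to first-order input statistics; it is carried by
# the second-order correlation x1*x2.
X = np.array([[-1., -1.], [-1., 1.], [1., -1.], [1., 1.]])
y = np.array([0., 1., 1., 0.])
for c in (0, 1):
    print(f"class {c}: E[x1*x2] = {np.mean(X[y == c, 0] * X[y == c, 1]):+.1f}")

# Tiny 2-16-1 tanh network trained by plain gradient descent.
W1 = rng.normal(0., 1., (2, 16)); b1 = np.zeros(16)
W2 = rng.normal(0., 1., (16, 1)); b2 = np.zeros(1)
for _ in range(3000):
    h = np.tanh(X @ W1 + b1)                      # hidden activations
    p = 1. / (1. + np.exp(-(h @ W2 + b2)[:, 0]))  # sigmoid readout
    g_out = ((p - y) / len(X))[:, None]           # grad of cross-entropy wrt logit
    g_h = g_out @ W2.T * (1. - h**2)
    W2 -= 0.5 * h.T @ g_out; b2 -= 0.5 * g_out.sum(0)
    W1 -= 0.5 * X.T @ g_h;   b1 -= 0.5 * g_h.sum(0)

# After training, the class means of the hidden activations separate:
# the nonlinearity has moved information from a second-order input
# correlation into a first-order statistic of the next layer.
for name, A in [("input ", X), ("hidden", np.tanh(X @ W1 + b1))]:
    mu0, mu1 = A[y == 0].mean(0), A[y == 1].mean(0)
    print(f"{name}: ||mean(class 1) - mean(class 0)|| = {np.linalg.norm(mu1 - mu0):.3f}")
```

The printed input-layer mean difference is exactly zero while the hidden-layer difference is not, matching the abstract's picture: the input layer must extract a higher-order correlation from the data before the internal layers can operate on statistics up to second order.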
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
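The dependence on the nonlinearity reported above can be probed numerically. Below is a hedged sketch, not the paper's protocol: the same one-hidden-layer network is trained on the same toy task with ReLU and with tanh, and the learned hidden representations are compared via linear centered kernel alignment (CKA); the task, width, similarity measure, and random seeds are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_hidden(phi, dphi, X, y, width=256, lr=0.1, steps=2000):
    """Train a one-hidden-layer net with nonlinearity phi; return hidden reps."""
    d = X.shape[1]
    W1 = rng.normal(0., 1. / np.sqrt(d), (d, width))
    W2 = rng.normal(0., 1. / np.sqrt(width), (width, 1))
    for _ in range(steps):
        h = phi(X @ W1)
        p = 1. / (1. + np.exp(-(h @ W2)[:, 0]))  # sigmoid readout
        g = ((p - y) / len(X))[:, None]          # grad of cross-entropy wrt logit
        gh = g @ W2.T * dphi(X @ W1)
        W2 -= lr * h.T @ g
        W1 -= lr * X.T @ gh
    return phi(X @ W1)

def cka(A, B):
    """Linear CKA between two representation matrices (samples x features)."""
    A = A - A.mean(0); B = B - B.mean(0)
    return (np.linalg.norm(A.T @ B, 'fro')**2
            / (np.linalg.norm(A.T @ A, 'fro') * np.linalg.norm(B.T @ B, 'fro')))

# Toy binary task: two Gaussian blobs in 10 dimensions.
X = np.vstack([rng.normal(+1., 1., (100, 10)), rng.normal(-1., 1., (100, 10))])
y = np.array([1.] * 100 + [0.] * 100)

H_relu = train_hidden(lambda z: np.maximum(z, 0.), lambda z: (z > 0).astype(float), X, y)
H_tanh = train_hidden(np.tanh, lambda z: 1. - np.tanh(z)**2, X, y)
# Note: the two runs also differ in their random initialization; a careful
# comparison would average over seeds.
print(f"CKA(ReLU hidden, tanh hidden) = {cka(H_relu, H_tanh):.3f}")
```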
- Opening the Black Box: predicting the trainability of deep neural networks with reconstruction entropy [0.0]
We present a method for predicting the trainable regime in parameter space for deep feedforward neural networks.
For both the MNIST and CIFAR10 datasets, we show that a single epoch of training is sufficient to predict the trainability of a deep feedforward network.
arXiv Detail & Related papers (2024-06-13T18:00:05Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Inference Graphs for CNN Interpretation [12.765543440576144]
Convolutional neural networks (CNNs) have achieved superior accuracy in many vision-related tasks.
We propose to model the activity of the network's hidden layers using probabilistic models.
We show that such graphs are useful for understanding the general inference process of a class, as well as explaining decisions the network makes regarding specific images.
arXiv Detail & Related papers (2021-10-20T13:56:09Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to addressing the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Mutual Information Scaling for Tensor Network Machine Learning [0.0]
We show how a related correlation analysis can be applied to tensor network machine learning.
We explore whether classical data possess correlation scaling patterns similar to those found in quantum states.
We characterize the scaling patterns in the MNIST and Tiny Images datasets, and find clear evidence of boundary-law scaling in the latter.
arXiv Detail & Related papers (2021-02-27T02:17:51Z)
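The flavor of this correlation-scaling analysis can be sketched in miniature. The paper measures mutual information between a region and its complement; the toy probe below instead estimates MI between pairs of binarized pixels as a function of their separation, on synthetic locally correlated data rather than MNIST or Tiny Images, so that the sketch stays self-contained. It is a proxy, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(2)

def pair_mi(a, b):
    """Plug-in mutual information estimate (in nats) between two binary arrays."""
    joint = np.array([[np.mean((a == i) & (b == j)) for j in (0, 1)] for i in (0, 1)])
    pa, pb = joint.sum(1), joint.sum(0)
    mask = joint > 0
    return np.sum(joint[mask] * np.log(joint[mask] / np.outer(pa, pb)[mask]))

# Synthetic locally correlated "image rows": smooth white noise with a
# length-4 moving average, then binarize at zero.
n_samples, length, window = 20000, 32, 4
noise = rng.normal(size=(n_samples, length + window))
rows = np.stack([noise[:, k:k + length] for k in range(window)]).mean(0)
rows = (rows > 0).astype(int)

# MI decays with separation and is ~0 once the averaging windows no
# longer overlap (separation >= window).
for sep in (1, 2, 3, 4, 8):
    print(f"separation {sep}: MI = {pair_mi(rows[:, 0], rows[:, sep]):.4f} nats")
```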
- Interpretable Neural Networks based classifiers for categorical inputs [0.0]
We introduce a simple way to interpret the output function of a neural network classifier that takes categorical variables as input.
We show that in these cases each layer of the network, and the logits layer in particular, can be expanded as a sum of terms that account for the contribution of each input pattern to the classification.
We present the analysis of each pattern's contribution, after an appropriate gauge transformation, in two cases that demonstrate the effectiveness of the method.
arXiv Detail & Related papers (2021-02-05T14:38:50Z)
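For the logits layer with one-hot categorical inputs, the additive expansion described above is exact and easy to see: each input pattern selects one row of the weight matrix, so the logit is a bias plus one contribution per feature. Below is a minimal sketch of that single-layer case with made-up dimensions and random weights; the paper extends this style of expansion, together with a gauge transformation, to deeper layers.

```python
import numpy as np

rng = np.random.default_rng(3)

n_features, n_categories, n_classes = 3, 4, 2
# Random weights standing in for a trained logit layer over one-hot inputs.
W = rng.normal(size=(n_features * n_categories, n_classes))
b = rng.normal(size=n_classes)

x_cat = np.array([2, 0, 3])  # one categorical value per feature

# One-hot encode and compute logits the usual way.
x = np.zeros(n_features * n_categories)
for f, c in enumerate(x_cat):
    x[f * n_categories + c] = 1.0
logits = x @ W + b

# Same logits as a sum of per-feature terms: each pattern selects one
# row of W, so its contribution to the classification is read off directly.
contrib = np.array([W[f * n_categories + c] for f, c in enumerate(x_cat)])
print("per-feature contributions to each class logit:\n", contrib)
print("sum of contributions + bias equals logits:", np.allclose(contrib.sum(0) + b, logits))
```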
- Neural networks adapting to datasets: learning network size and topology [77.34726150561087]
We introduce a flexible setup that allows a neural network to learn both its size and topology during gradient-based training.
The resulting network has the structure of a graph tailored to the particular learning task and dataset.
arXiv Detail & Related papers (2020-06-22T12:46:44Z)
- Forgetting Outside the Box: Scrubbing Deep Networks of Information Accessible from Input-Output Observations [143.3053365553897]
We describe a procedure for removing dependency on a cohort of training data from a trained deep network.
We introduce a new bound on how much information can be extracted per query about the forgotten cohort.
We exploit connections between the activation and weight dynamics of a DNN, inspired by Neural Tangent Kernels, to compute the information in the activations.
arXiv Detail & Related papers (2020-03-05T23:17:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.