Credit Assignment for Trained Neural Networks Based on Koopman Operator
Theory
- URL: http://arxiv.org/abs/2212.00998v1
- Date: Fri, 2 Dec 2022 06:34:27 GMT
- Title: Credit Assignment for Trained Neural Networks Based on Koopman Operator
Theory
- Authors: Zhen Liang, Changyuan Zhao, Wanwei Liu, Bai Xue, Wenjing Yang and
Zhengbin Pang
- Abstract summary: The credit assignment problem of neural networks refers to evaluating the contribution of each network component to the final outputs.
This paper presents an alternative, linear-dynamics perspective on the credit assignment problem for trained neural networks.
Experiments conducted on typical neural networks demonstrate the effectiveness of the proposed method.
- Score: 3.130109807128472
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The credit assignment problem for neural networks refers to evaluating the contribution, or credit, of each network component to the final outputs. For untrained neural networks, approaches to this problem have contributed greatly to parameter updates and model evolution during the training phase. For trained neural networks, the problem has received little attention, even though it plays an increasingly important role in network patching, specification, and verification. Based on Koopman operator theory, this paper presents an alternative, linear-dynamics perspective on the credit assignment problem for trained neural networks. Regarding a neural network as a composition of sub-dynamics, we use step-delay embedding to capture snapshots of each component, characterizing the established mapping as exactly as possible. To circumvent the dimension-difference problem encountered during the embedding, a composition and decomposition of an auxiliary linear layer, termed minimal linear dimension alignment, is carefully designed with a rigorous formal guarantee. Afterwards, each component is approximated by a Koopman operator, from which we derive the Jacobian matrix and its determinant, analogously to backward propagation. This yields a metric with algebraic interpretability for the credit assignment of each network component. Experiments conducted on typical neural networks demonstrate the effectiveness of the proposed method.
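The abstract describes the pipeline only at a high level. The following is a minimal illustrative sketch of the general idea, not the paper's implementation: it assumes each component is approximated by a least-squares (EDMD-style) linear operator fitted on input/output snapshots of a toy randomly initialized MLP, uses zero-padding as a crude stand-in for the paper's minimal linear dimension alignment, and scores each component with a log pseudo-determinant in place of the exact Jacobian-determinant metric. None of these specifics come from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# A toy "trained" 3-layer MLP; in the paper the weights come from a trained network.
layers = [
    (rng.standard_normal((8, 4)), rng.standard_normal(8)),   # 4 -> 8
    (rng.standard_normal((8, 8)), rng.standard_normal(8)),   # 8 -> 8
    (rng.standard_normal((2, 8)), rng.standard_normal(2)),   # 8 -> 2
]

def layer_forward(W, b, x):
    return relu(x @ W.T + b)

# Snapshot matrices: feed probe inputs through the network and record the
# input/output pair of every component (layer).
X = rng.standard_normal((512, 4))
snapshots = []
h = X
for W, b in layers:
    y = layer_forward(W, b, h)
    snapshots.append((h, y))
    h = y

def koopman_credit(inp, out, tol=1e-8):
    """Fit a linear (Koopman-style) operator K with out ~= inp @ K by least
    squares, after zero-padding the smaller side so dimensions match (a crude
    stand-in for minimal linear dimension alignment). The component is scored
    by the log pseudo-determinant of K (sum of log singular values above tol),
    a simplification of the Jacobian-determinant metric in the abstract."""
    d = max(inp.shape[1], out.shape[1])
    inp_p = np.pad(inp, ((0, 0), (0, d - inp.shape[1])))
    out_p = np.pad(out, ((0, 0), (0, d - out.shape[1])))
    K, *_ = np.linalg.lstsq(inp_p, out_p, rcond=None)
    s = np.linalg.svd(K, compute_uv=False)
    return float(np.sum(np.log(s[s > tol])))

credits = [koopman_credit(inp, out) for inp, out in snapshots]
for i, c in enumerate(credits):
    print(f"layer {i}: log pseudo-det of fitted operator = {c:.3f}")
```

Running the sketch prints one score per layer; larger values correspond to components whose fitted linear dynamics expand volume more, which is the intuition behind a determinant-based credit.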
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z)
- Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of a neural network measures information flowing across layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains unclear.
arXiv Detail & Related papers (2022-06-13T12:03:32Z)
- Clustering-Based Interpretation of Deep ReLU Network [17.234442722611803]
We recognize that the non-linear behavior of the ReLU function gives rise to a natural clustering.
We propose a method to increase the level of interpretability of a fully connected feedforward ReLU neural network.
arXiv Detail & Related papers (2021-10-13T09:24:11Z)
- Deep Learning Based Resource Assignment for Wireless Networks [25.138235752143586]
This paper presents a deep learning approach for binary assignment problems in wireless networks, which identifies binary variables for permutation matrices.
Numerical results demonstrate the effectiveness of the proposed method in various scenarios.
arXiv Detail & Related papers (2021-09-27T11:51:24Z)
- Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z)
- Fast Adaptation with Linearized Neural Networks [35.43406281230279]
We study the inductive biases of linearizations of neural networks, which we show to be surprisingly good summaries of the full network functions.
Inspired by this finding, we propose a technique for embedding these inductive biases into Gaussian processes through a kernel designed from the Jacobian of the network.
In this setting, domain adaptation takes the form of interpretable posterior inference, with accompanying uncertainty estimation.
arXiv Detail & Related papers (2021-03-02T03:23:03Z)
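The "Fast Adaptation with Linearized Neural Networks" entry above builds a Gaussian-process kernel from the network Jacobian. The paper's exact construction is not given here; the sketch below only illustrates the generic linearized-network kernel k(x, x') = J(x)·J(x')ᵀ, where J(x) is the Jacobian of the scalar output with respect to the parameters, computed by finite differences on a toy two-layer network. The network, its sizes, and the finite-difference Jacobian are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy two-layer network f(x; theta) with a scalar output.
W1 = rng.standard_normal((16, 3)) * 0.5
b1 = np.zeros(16)
w2 = rng.standard_normal(16) * 0.5

def pack(W1, b1, w2):
    return np.concatenate([W1.ravel(), b1, w2])

def forward(theta, x):
    # Unpack the flat parameter vector: 16*3 weights, 16 biases, 16 output weights.
    W1 = theta[:48].reshape(16, 3)
    b1 = theta[48:64]
    w2 = theta[64:]
    return float(np.tanh(W1 @ x + b1) @ w2)

def jacobian(theta, x, eps=1e-5):
    """Finite-difference Jacobian of the scalar output w.r.t. the parameters."""
    g = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        g[i] = (forward(theta + d, x) - forward(theta - d, x)) / (2 * eps)
    return g

theta = pack(W1, b1, w2)

def linearized_kernel(xs):
    """Empirical 'Jacobian kernel': k(x, x') = J(x) . J(x')."""
    J = np.stack([jacobian(theta, x) for x in xs])
    return J @ J.T

X = rng.standard_normal((5, 3))
print(np.round(linearized_kernel(X), 3))
```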
- Compressive Sensing and Neural Networks from a Statistical Learning Perspective [4.561032960211816]
We present a generalization error analysis for a class of neural networks suitable for sparse reconstruction from few linear measurements.
Under realistic conditions, the generalization error scales only logarithmically in the number of layers and at most linearly in the number of measurements.
arXiv Detail & Related papers (2020-10-29T15:05:43Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges, which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
- Investigating the Compositional Structure Of Deep Neural Networks [1.8899300124593645]
We introduce a novel theoretical framework based on the compositional structure of piecewise linear activation functions.
This makes it possible to characterize input instances with respect to both the predicted label and the specific (linear) transformation used to produce the prediction.
Preliminary tests on the MNIST dataset show that our method can group input instances with regard to their similarity in the internal representation of the neural network.
arXiv Detail & Related papers (2020-02-17T14:16:17Z)
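Both the "Clustering-Based Interpretation of Deep ReLU Network" and "Investigating the Compositional Structure Of Deep Neural Networks" entries above rest on the same observation: a ReLU network is piecewise linear, and the on/off pattern of its units identifies the affine transformation applied to a given input. The sketch below merely groups inputs by that activation pattern for a toy randomly initialized network; the grouping heuristic and network sizes are illustrative assumptions, not the methods of either paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy two-layer ReLU network; in the cited works the weights come from a trained model.
W1, b1 = rng.standard_normal((6, 2)), rng.standard_normal(6)
W2, b2 = rng.standard_normal((4, 6)), rng.standard_normal(4)

def activation_pattern(x):
    """Return the ReLU on/off pattern of every hidden unit for input x.
    Inputs sharing a pattern lie in the same linear region, i.e. the network
    applies the same affine transformation to all of them."""
    h1 = W1 @ x + b1
    h2 = W2 @ np.maximum(h1, 0.0) + b2
    return tuple((np.concatenate([h1, h2]) > 0).astype(int))

# Group a batch of inputs by their activation pattern.
X = rng.standard_normal((200, 2))
groups = {}
for i, x in enumerate(X):
    groups.setdefault(activation_pattern(x), []).append(i)

print(f"{len(groups)} distinct linear regions among {len(X)} inputs")
```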