Layer-Wise Interpretation of Deep Neural Networks Using Identity
Initialization
- URL: http://arxiv.org/abs/2102.13333v1
- Date: Fri, 26 Feb 2021 07:15:41 GMT
- Title: Layer-Wise Interpretation of Deep Neural Networks Using Identity
Initialization
- Authors: Shohei Kubota, Hideaki Hayashi, Tomohiro Hayase, Seiichi Uchida
- Abstract summary: In this paper, we propose an interpretation method for a deep multilayer perceptron.
The proposed method allows us to analyze the contribution of each neuron to classification and class likelihood in each hidden layer.
- Score: 3.708656266586146
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The interpretability of neural networks (NNs) is a challenging but essential
topic for transparency in the decision-making process using machine learning.
One of the reasons for the lack of interpretability is random weight
initialization, where the input is randomly embedded into a different feature
space in each layer. In this paper, we propose an interpretation method for a
deep multilayer perceptron, which is the most general architecture of NNs,
based on identity initialization (namely, initialization using identity
matrices). The proposed method allows us to analyze the contribution of each
neuron to classification and class likelihood in each hidden layer. As a
property of the identity-initialized perceptron, the weight matrices remain
near the identity matrices even after learning. This property enables us to
treat the change of features from the input to each hidden layer as the
contribution to classification. Furthermore, by adding extra dimensions to each
layer according to the number of classes, we can separate the output of each
hidden layer into a contribution map, which depicts the contribution to
classification, and a class likelihood. This allows the recognition accuracy to
be calculated in each layer, revealing the roles of individual layers, such as
feature extraction and classification.
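As a concrete illustration of the abstract, here is a minimal PyTorch sketch of an identity-initialized perceptron with per-layer class readouts. The class name IdentityMLP, the ReLU activation, and the zero-padding of the extra class dimensions are assumptions for illustration, not the authors' exact formulation.

```python
import torch
import torch.nn as nn

class IdentityMLP(nn.Module):
    """Sketch of an identity-initialized MLP: every linear map starts as the
    identity matrix, and the trailing num_classes dimensions serve as a
    per-layer class readout (assumed layout)."""

    def __init__(self, input_dim: int, num_classes: int, depth: int):
        super().__init__()
        self.num_classes = num_classes
        width = input_dim + num_classes  # extra dimensions, one per class
        self.layers = nn.ModuleList([nn.Linear(width, width) for _ in range(depth)])
        for layer in self.layers:
            nn.init.eye_(layer.weight)   # W = I: the linear map starts as the identity
            nn.init.zeros_(layer.bias)   # b = 0

    def forward(self, x: torch.Tensor) -> list[torch.Tensor]:
        # Pad the input with zeros in the class slots (an assumption here).
        pad = torch.zeros(x.size(0), self.num_classes, device=x.device)
        h = torch.cat([x, pad], dim=1)
        per_layer_scores = []
        for layer in self.layers:
            h = torch.relu(layer(h))
            # Read the trailing dimensions as this layer's class likelihood.
            per_layer_scores.append(h[:, -self.num_classes:])
        return per_layer_scores
```

Comparing per_layer_scores[l].argmax(dim=1) against the labels would give the layer-wise recognition accuracy described above, and (layer.weight - torch.eye(layer.weight.size(0))).norm() measures how far each weight matrix drifts from the identity after training.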
Related papers
- Understanding Deep Representation Learning via Layerwise Feature
Compression and Discrimination [33.273226655730326]
We show that each layer of a deep linear network progressively compresses within-class features at a geometric rate and discriminates between-class features at a linear rate.
This is the first quantitative characterization of feature evolution in hierarchical representations of deep linear networks; a toy sketch of the layer-wise measurement follows this entry.
arXiv Detail & Related papers (2023-11-06T09:00:38Z)
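The NumPy sketch below is a toy illustration under assumed data, not the paper's analysis (which concerns trained networks): it tracks the ratio of within-class scatter to between-class separation across the layers of a deep linear network.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy setup: two Gaussian classes in d dimensions.
d, n_per_class = 20, 200
class_means = rng.normal(size=(2, d))
X = np.vstack([m + 0.1 * rng.normal(size=(n_per_class, d)) for m in class_means])
y = np.repeat([0, 1], n_per_class)

h = X
for depth in range(1, 7):
    W = rng.normal(size=(d, d)) / np.sqrt(d)  # one random linear layer
    h = h @ W
    centers = np.stack([h[y == c].mean(axis=0) for c in (0, 1)])
    within = np.mean([np.var(h[y == c], axis=0).sum() for c in (0, 1)])
    between = np.sum((centers[0] - centers[1]) ** 2)
    # In a trained network, the paper predicts this ratio shrinks at a
    # geometric rate with depth (progressive within-class compression).
    print(f"depth {depth}: within/between = {within / between:.4f}")
```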
- Equivariant Architectures for Learning in Deep Weight Spaces [54.61765488960555]
We present a novel network architecture for learning in deep weight spaces.
It takes as input a concatenation of weights and biases of a pre-trained MLP.
We show how these equivariant layers can be implemented using three basic operations.
arXiv Detail & Related papers (2023-01-30T10:50:33Z)
- Interpreting intermediate convolutional layers in unsupervised acoustic word classification [0.0]
This paper proposes a technique to visualize and interpret intermediate layers of unsupervised deep convolutional neural networks.
A GAN-based architecture (ciwGAN, arXiv:2006.02951) was trained on unlabeled sliced lexical items from TIMIT.
arXiv Detail & Related papers (2021-10-05T21:53:32Z)
- Auto-Parsing Network for Image Captioning and Visual Question Answering [101.77688388554097]
We propose an Auto-Parsing Network (APN) to discover and exploit the input data's hidden tree structures.
Specifically, we impose a Probabilistic Graphical Model (PGM) parameterized by the attention operations on each self-attention layer to incorporate a sparsity assumption.
arXiv Detail & Related papers (2021-08-24T08:14:35Z)
- EigenGAN: Layer-Wise Eigen-Learning for GANs [84.33920839885619]
EigenGAN can mine interpretable and controllable dimensions from different generator layers without supervision.
By traversing the coefficient of a specific eigen-dimension, the generator can produce samples with continuous changes corresponding to a specific semantic attribute; a minimal sketch of such a layer-wise subspace follows this entry.
arXiv Detail & Related papers (2021-04-26T11:14:37Z)
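A rough sketch of the mechanism, in an assumed form (EigenSubspaceLayer and its parameterization are illustrative, not the paper's exact architecture): each generator layer carries a learnable orthonormal basis, and each latent coefficient traverses one direction.

```python
import torch
import torch.nn as nn

class EigenSubspaceLayer(nn.Module):
    """Learnable per-layer subspace in the spirit of EigenGAN (assumed form)."""

    def __init__(self, feat_dim: int, num_dims: int):
        super().__init__()
        self.basis = nn.Parameter(torch.randn(feat_dim, num_dims))  # directions
        self.importance = nn.Parameter(torch.ones(num_dims))        # per-direction scale
        self.offset = nn.Parameter(torch.zeros(feat_dim))

    def forward(self, features: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # Orthonormalize so each column is a distinct "eigen-direction".
        q, _ = torch.linalg.qr(self.basis)
        # Each coefficient z[:, i] moves the features along one
        # importance-weighted direction.
        return features + z @ (q * self.importance).T + self.offset
```

Sweeping z[:, i] over a range while holding the other coefficients fixed is the traversal described above.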
- An evidential classifier based on Dempster-Shafer theory and deep learning [6.230751621285322]
We propose a new classification system based on Dempster-Shafer (DS) theory and a convolutional neural network (CNN) architecture for set-valued classification.
Experiments on image recognition, signal processing, and semantic-relationship classification tasks show that combining a deep CNN, a DS layer, and an expected utility layer improves classification accuracy; a minimal sketch of the underlying combination rule follows this entry.
arXiv Detail & Related papers (2021-03-25T01:29:05Z)
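The DS layer builds on Dempster-Shafer evidence combination. Below is a minimal, self-contained sketch of Dempster's rule of combination; the helper name dempster_combine and the cat/dog example are illustrative only, not the paper's CNN pipeline.

```python
from itertools import product

def dempster_combine(m1: dict, m2: dict) -> dict:
    """Dempster's rule for two mass functions whose focal elements are frozensets."""
    combined, conflict = {}, 0.0
    for (a, p), (b, q) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + p * q
        else:
            conflict += p * q  # mass falling on the empty set
    # Renormalize by the non-conflicting mass (assumes conflict < 1).
    return {s: v / (1.0 - conflict) for s, v in combined.items()}

# Set-valued beliefs over the frame {cat, dog}, including "don't know" mass.
frame = frozenset({"cat", "dog"})
m1 = {frozenset({"cat"}): 0.6, frame: 0.4}
m2 = {frozenset({"cat"}): 0.3, frozenset({"dog"}): 0.3, frame: 0.4}
print(dempster_combine(m1, m2))  # combined evidence favors "cat"
```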
- Deep Learning with a Classifier System: Initial Results [0.0]
This article presents the first results from using a learning classifier system capable of performing adaptive computation with deep neural networks.
The system automatically reduces the number of weights and units while maintaining performance once a maximum prediction error has been reached.
arXiv Detail & Related papers (2021-03-01T16:40:12Z)
- Provably End-to-end Label-Noise Learning without Anchor Points [118.97592870124937]
We propose an end-to-end framework for solving label-noise learning without anchor points.
Our proposed framework can identify the transition matrix if the clean class-posterior probabilities are sufficiently scattered; a sketch of how a known transition matrix is used during training follows this entry.
arXiv Detail & Related papers (2021-02-04T03:59:37Z)
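For context on what a transition matrix does once identified, here is a hedged sketch of the generic forward loss correction (a standard technique, not this paper's identification procedure; the function name and the matrix convention are assumptions).

```python
import torch
import torch.nn.functional as F

def forward_corrected_loss(logits: torch.Tensor,
                           noisy_labels: torch.Tensor,
                           T: torch.Tensor) -> torch.Tensor:
    """Cross-entropy against noisy labels via a known transition matrix.

    Convention assumed here: T[i, j] = p(noisy label j | clean label i).
    """
    clean_posterior = F.softmax(logits, dim=1)  # model's clean-class estimate
    noisy_posterior = clean_posterior @ T       # push through the noise process
    return F.nll_loss(torch.log(noisy_posterior + 1e-12), noisy_labels)

# Example: 3 classes with symmetric label noise (hypothetical numbers).
T = torch.full((3, 3), 0.1) + 0.7 * torch.eye(3)  # rows sum to 1
logits = torch.randn(8, 3, requires_grad=True)
noisy_labels = torch.randint(0, 3, (8,))
loss = forward_corrected_loss(logits, noisy_labels, T)
loss.backward()
```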
- Dual-constrained Deep Semi-Supervised Coupled Factorization Network with Enriched Prior [80.5637175255349]
We propose a new enriched-prior-based Dual-constrained Deep Semi-Supervised Coupled Factorization Network, called DS2CF-Net.
To extract hidden deep features, DS2CF-Net is modeled as a deep-structure and geometrical-structure constrained neural network.
Our network can obtain state-of-the-art performance for representation learning and clustering.
arXiv Detail & Related papers (2020-09-08T13:10:21Z)
- Hierarchical nucleation in deep neural networks [67.85373725288136]
We study the evolution of the probability density of the ImageNet dataset across the hidden layers in some state-of-the-art DCNs.
We find that the initial layers generate a unimodal probability density, discarding structure that is irrelevant for classification.
In subsequent layers, density peaks arise in a hierarchical fashion that mirrors the semantic hierarchy of the concepts.
arXiv Detail & Related papers (2020-07-07T14:42:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.