Layerwise Knowledge Extraction from Deep Convolutional Networks
- URL: http://arxiv.org/abs/2003.09000v1
- Date: Thu, 19 Mar 2020 19:46:45 GMT
- Title: Layerwise Knowledge Extraction from Deep Convolutional Networks
- Authors: Simon Odense and Artur d'Avila Garcez
- Abstract summary: We propose a novel layerwise knowledge extraction method using M-of-N rules.
We show that this approach produces rules close to an optimal complexity-error tradeoff.
We also find that the softmax layer in Convolutional Neural Networks and Autoencoders is highly explainable by rule extraction.
- Score: 0.9137554315375922
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Knowledge extraction is used to convert neural networks into symbolic
descriptions with the objective of producing more comprehensible learning
models. The central challenge is to find an explanation which is more
comprehensible than the original model while still representing that model
faithfully. The distributed nature of deep networks has led many to believe
that the hidden features of a neural network cannot be explained by logical
descriptions simple enough to be comprehensible. In this paper, we propose a
novel layerwise knowledge extraction method using M-of-N rules which seeks to
obtain the best trade-off between the complexity and accuracy of rules
describing the hidden features of a deep network. We show empirically that this
approach produces rules close to an optimal complexity-error tradeoff. We apply
this method to a variety of deep networks and find that in the internal layers
we often cannot find rules with a satisfactory complexity and accuracy,
suggesting that rule extraction as a general purpose method for explaining the
internal logic of a neural network may be impossible. However, we also find
that the softmax layer in Convolutional Neural Networks and Autoencoders using
either tanh or relu activation functions is highly explainable by rule
extraction, with compact rules consisting of as few as 3 units out of 128
often reaching over 99% accuracy. This shows that rule extraction can be a
useful component for explaining parts (or modules) of a deep neural network.
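An M-of-N rule fires when at least M of a chosen set of N binary conditions hold (e.g. "at least 2 of units {5, 17, 42} are active"). To make the complexity-accuracy tradeoff concrete, here is a minimal, hedged sketch of that kind of search in numpy; it is an illustrative greedy heuristic over binarised activations, not the authors' layerwise procedure, and the function name and `alpha` weight are placeholders.
```python
import numpy as np

def extract_m_of_n(X, y, alpha=0.01):
    """Greedy search for an M-of-N rule approximating a binary target unit.

    X     : (samples, units) 0/1 activations of candidate input units (the literals).
    y     : (samples,) 0/1 activations of the unit being explained.
    alpha : complexity weight trading rule size N against rule error.
    Returns (literal_indices, M, error).
    """
    chosen, best = [], ([], 1, np.inf)            # best = (literals, M, penalised score)
    remaining = list(range(X.shape[1]))

    while remaining:
        candidates = []
        for j in remaining:                       # try adding each unused literal
            trial = chosen + [j]
            counts = X[:, trial].sum(axis=1)
            # pick the threshold M with the lowest error for this literal set
            err, m = min((np.mean((counts >= m) != y), m)
                         for m in range(1, len(trial) + 1))
            candidates.append((err + alpha * len(trial), m, j))
        score, m, j = min(candidates)
        if score >= best[2]:                      # a larger rule no longer pays off
            break
        chosen.append(j)
        remaining.remove(j)
        best = (list(chosen), m, score)

    literals, m, _ = best
    error = np.mean((X[:, literals].sum(axis=1) >= m) != y)
    return literals, m, error

# Hypothetical usage: explain output unit k of a trained network on a batch.
# X = (hidden_activations > 0).astype(int)        # binarise e.g. ReLU features
# y = (outputs.argmax(axis=1) == k).astype(int)
# rule, M, err = extract_m_of_n(X, y)
```
Applied to a well-behaved layer such as the softmax outputs discussed above, a search of this kind can stop at very small N (the abstract reports rules using as few as 3 of 128 units); the paper's negative finding is that for many internal layers no rule with acceptable complexity and error exists.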
Related papers
- DCNFIS: Deep Convolutional Neuro-Fuzzy Inference System [1.8802008255570541]
We report on the design of a new deep network that achieves improved transparency without sacrificing accuracy.
We design a deep convolutional neuro-fuzzy inference system (DCNFIS) by hybridizing fuzzy logic and deep learning models.
We exploit the transparency of fuzzy logic by deriving explanations, in the form of saliency maps, from the fuzzy rules encoded in the network.
arXiv Detail & Related papers (2023-08-11T20:32:39Z) - Rank Diminishing in Deep Neural Networks [71.03777954670323]
The rank of a neural network measures the information flowing across its layers.
It is an instance of a key structural condition that applies across broad domains of machine learning.
For neural networks, however, the intrinsic mechanism that yields low-rank structures remains unclear.
arXiv Detail & Related papers (2022-06-13T12:03:32Z) - Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z) - The Principles of Deep Learning Theory [19.33681537640272]
This book develops an effective theory approach to understanding deep neural networks of practical relevance.
We explain how these effectively-deep networks learn nontrivial representations from training.
We show that the depth-to-width ratio governs the effective model complexity of the ensemble of trained networks.
arXiv Detail & Related papers (2021-06-18T15:00:00Z) - Learning Structures for Deep Neural Networks [99.8331363309895]
We propose to adopt the efficient coding principle, rooted in information theory and developed in computational neuroscience.
We show that sparse coding can effectively maximize the entropy of the output signals.
Our experiments on a public image classification dataset demonstrate that using the structure learned from scratch by our proposed algorithm, one can achieve a classification accuracy comparable to the best expert-designed structure.
arXiv Detail & Related papers (2021-05-27T12:27:24Z) - Leveraging Sparse Linear Layers for Debuggable Deep Networks [86.94586860037049]
We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks.
The resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks.
arXiv Detail & Related papers (2021-05-11T08:15:25Z) - A neural anisotropic view of underspecification in deep learning [60.119023683371736]
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z) - Rule Extraction from Binary Neural Networks with Convolutional Rules for
Model Validation [16.956140135868733]
We introduce the concept of first-order convolutional rules, which are logical rules that can be extracted using a convolutional neural network (CNN).
Our approach is based on rule extraction from binary neural networks with local search.
Our experiments show that the proposed approach is able to model the functionality of the neural network while at the same time producing interpretable logical rules.
arXiv Detail & Related papers (2020-12-15T17:55:53Z) - Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks.
arXiv Detail & Related papers (2020-08-19T04:53:31Z) - One-vs-Rest Network-based Deep Probability Model for Open Set
- One-vs-Rest Network-based Deep Probability Model for Open Set Recognition [6.85316573653194]
An intelligent self-learning system should be able to differentiate between known and unknown examples.
One-vs-rest networks can provide more informative hidden representations for unknown examples than the commonly used SoftMax layer.
The proposed probability model outperformed state-of-the-art methods in open set classification scenarios.
arXiv Detail & Related papers (2020-04-17T05:24:34Z) - Approximation smooth and sparse functions by deep neural networks
- Approximation smooth and sparse functions by deep neural networks without saturation [0.6396288020763143]
In this paper, we aim at constructing deep neural networks with three hidden layers to approximate smooth and sparse functions.
We prove that the constructed deep nets can reach the optimal approximation rate in approximating both smooth and sparse functions with controllable magnitude of free parameters.
arXiv Detail & Related papers (2020-01-13T09:28:50Z)