Deep Networks as Logical Circuits: Generalization and Interpretation
- URL: http://arxiv.org/abs/2003.11619v2
- Date: Fri, 26 Jun 2020 15:29:28 GMT
- Title: Deep Networks as Logical Circuits: Generalization and Interpretation
- Authors: Christopher Snyder, Sriram Vishwanath
- Abstract summary: We present a hierarchical decomposition of the Deep Neural Networks (DNNs) discrete classification map into logical (AND/OR) combinations of intermediate (True/False) classifiers of the input.
We show that the learned, internal, logical computations correspond to semantically meaningful categories that allow DNN descriptions in plain English.
- Score: 10.223907995092835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Not only are Deep Neural Networks (DNNs) black box models, but also we
frequently conceptualize them as such. We lack good interpretations of the
mechanisms linking inputs to outputs. Therefore, we find it difficult to
analyze in human-meaningful terms (1) what the network learned and (2) whether
the network learned. We present a hierarchical decomposition of the DNN
discrete classification map into logical (AND/OR) combinations of intermediate
(True/False) classifiers of the input. Those classifiers that cannot be
further decomposed, called atoms, are (interpretable) linear classifiers. Taken
together, we obtain a logical circuit with linear classifier inputs that
computes the same label as the DNN. This circuit does not structurally resemble
the network architecture, and it may require many fewer parameters, depending
on the configuration of weights. In these cases, we obtain simultaneously an
interpretation and generalization bound (for the original DNN), connecting two
fronts which have historically been investigated separately. Unlike compression
techniques, our representation is exact. We motivate the utility of this perspective
by studying DNNs in simple, controlled settings, where we obtain superior
generalization bounds despite using only combinatorial information (e.g. no
margin information). We demonstrate how to "open the black box" on the MNIST
dataset. We show that the learned, internal, logical computations correspond to
semantically meaningful (unlabeled) categories that allow DNN descriptions in
plain English. We improve the generalization of an already trained network by
interpreting, diagnosing, and replacing components of the logical circuit that is
the DNN.
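The decomposition described in the abstract can be illustrated on a toy one-hidden-layer ReLU network: each hidden unit's on/off state is a linear (True/False) atom, each activation region is an AND of those atoms, and the label is an OR over regions of (region AND region-specific linear classifier). A minimal sketch, with illustrative weights and shapes not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy 1-hidden-layer ReLU net (weights are illustrative, not from the paper)
W1 = rng.normal(size=(2, 2)); b1 = rng.normal(size=2)
w2 = rng.normal(size=2);      b2 = rng.normal()

def net_label(x):
    h = np.maximum(W1 @ x + b1, 0.0)
    return (w2 @ h + b2) > 0

def circuit_label(x):
    # OR over activation patterns of (region AND linear classifier);
    # each hidden unit's on/off state is itself a linear (True/False) atom
    for s in [(a, b) for a in (0, 1) for b in (0, 1)]:
        s = np.array(s)
        pre = W1 @ x + b1
        in_region = np.all((pre > 0) == (s == 1))  # AND of linear atoms
        w_eff = (s * w2) @ W1                      # linear classifier valid in this region
        b_eff = (s * w2) @ b1 + b2
        if in_region and (w_eff @ x + b_eff > 0):
            return True
    return False

# within each activation region the net is exactly linear,
# so the circuit computes the same label everywhere
xs = rng.normal(size=(200, 2))
assert all(net_label(x) == circuit_label(x) for x in xs)
```

Because the network is exactly linear inside each activation region, the circuit reproduces the DNN's label on every input rather than approximating it.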
Related papers
- Logic interpretations of ANN partition cells [0.0]
Consider a binary classification problem solved using a feed-forward artificial neural network (ANN).
Let the ANN be composed of a ReLU layer and several linear layers (convolution, sum-pooling, or fully connected).
We construct a bridge between a simple ANN and logic. As a result, we can analyze and manipulate the semantics of an ANN using the powerful tool set of logic.
arXiv Detail & Related papers (2024-08-26T14:43:43Z) - What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z) - Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z) - Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k nearest neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
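The two-step recipe above can be sketched in plain NumPy; the "pre-trained" features below are random stand-ins for real backbone embeddings, and all names and dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-ins for frozen features produced by a pre-trained backbone
train_feats = rng.normal(size=(100, 16))
train_labels = rng.integers(0, 5, size=100)

def knn_predict(query, k=5):
    # step 1: the query is assumed to be embedded by the same frozen backbone
    # step 2: majority vote among the k nearest training features
    d = np.linalg.norm(train_feats - query, axis=1)
    top = np.argsort(d)[:k]
    return np.bincount(train_labels[top]).argmax()
```

No gradient updates are needed at test time, which is what makes the method "lazy": all work is deferred to the neighbor lookup.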
arXiv Detail & Related papers (2021-12-15T20:15:01Z) - Disentangling deep neural networks with rectified linear units using duality [4.683806391173103]
We propose a novel interpretable counterpart of deep neural networks (DNNs) with rectified linear units (ReLUs).
We show that convolution with global pooling and skip connection provide rotational invariance and ensemble structure, respectively, to the neural path kernel (NPK).
arXiv Detail & Related papers (2021-10-06T16:51:59Z) - Generalizing Neural Networks by Reflecting Deviating Data in Production [15.498447555957773]
We present a runtime approach that mitigates DNN mis-predictions caused by unexpected runtime inputs to the DNN.
We use a distribution analyzer based on the distance metric learned by a Siamese network to identify "unseen" semantically-preserving inputs.
Our approach transforms those unexpected inputs into inputs from the training set that are identified as having similar semantics.
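A minimal sketch of the guard described above, with random arrays standing in for the Siamese-network embeddings and the threshold `tau` chosen arbitrarily (none of these values come from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
# stand-ins for embeddings learned by a Siamese network, plus the raw inputs
train_emb = rng.normal(size=(50, 8))
train_inputs = rng.normal(size=(50, 8))

def guard(x_emb, x, tau=3.0):
    # distance to the training set under the learned metric
    d = np.linalg.norm(train_emb - x_emb, axis=1)
    if d.min() > tau:                    # flagged as "unseen"
        return train_inputs[d.argmin()]  # replace with the semantically closest training input
    return x                             # in-distribution: pass through unchanged
```

At runtime the guard sits in front of the DNN, so the network only ever sees inputs close to its training distribution.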
arXiv Detail & Related papers (2021-10-06T13:05:45Z) - Rule Extraction from Binary Neural Networks with Convolutional Rules for Model Validation [16.956140135868733]
We introduce the concept of first-order convolutional rules, which are logical rules that can be extracted using a convolutional neural network (CNN).
Our approach is based on rule extraction from binary neural networks with local search.
Our experiments show that the proposed approach is able to model the functionality of the neural network while at the same time producing interpretable logical rules.
arXiv Detail & Related papers (2020-12-15T17:55:53Z) - A Temporal Neural Network Architecture for Online Learning [0.6091702876917281]
Temporal neural networks (TNNs) communicate and process information encoded as relative spike times.
A TNN architecture is proposed and, as a proof-of-concept, TNN operation is demonstrated within the larger context of online supervised classification.
arXiv Detail & Related papers (2020-11-27T17:15:29Z) - Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
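The idea of identifying droppable edges can be sketched with a toy one-layer mean-aggregation GNN; here a crude leave-one-edge-out sensitivity check stands in for the paper's learned differentiable mask, and the graph, weights, and threshold are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
# toy 4-node graph with self-loops (adjacency, features, and weights are illustrative)
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], float)
X = rng.normal(size=(4, 3))   # node features
W = rng.normal(size=(3, 2))   # one message-passing layer

def gnn(adj):
    deg = adj.sum(1, keepdims=True)
    return (adj / deg) @ X @ W  # mean aggregation + linear transform

base = gnn(A)
# post hoc: drop each edge and keep it only if the output changes noticeably
keep = A.copy()
for i in range(4):
    for j in range(4):
        if i != j and A[i, j]:
            A2 = A.copy(); A2[i, j] = 0
            if np.abs(gnn(A2) - base).max() < 0.05:
                keep[i, j] = 0  # edge is unnecessary for this prediction
```

The paper's method learns the mask end-to-end with a differentiable relaxation instead of this exhaustive ablation, but the goal is the same: a sparse edge set that preserves the model's output.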
arXiv Detail & Related papers (2020-10-01T17:51:19Z) - Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z) - Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.