Deep Networks as Logical Circuits: Generalization and Interpretation
- URL: http://arxiv.org/abs/2003.11619v2
- Date: Fri, 26 Jun 2020 15:29:28 GMT
- Title: Deep Networks as Logical Circuits: Generalization and Interpretation
- Authors: Christopher Snyder, Sriram Vishwanath
- Abstract summary: We present a hierarchical decomposition of the Deep Neural Networks (DNNs) discrete classification map into logical (AND/OR) combinations of intermediate (True/False) classifiers of the input.
We show that the learned, internal, logical computations correspond to semantically meaningful categories that allow DNN descriptions in plain English.
- Score: 10.223907995092835
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Not only are Deep Neural Networks (DNNs) black box models, but also we
frequently conceptualize them as such. We lack good interpretations of the
mechanisms linking inputs to outputs. Therefore, we find it difficult to
analyze in human-meaningful terms (1) what the network learned and (2) whether
the network learned. We present a hierarchical decomposition of the DNN
discrete classification map into logical (AND/OR) combinations of intermediate
(True/False) classifiers of the input. Those classifiers that cannot be
further decomposed, called atoms, are (interpretable) linear classifiers. Taken
together, we obtain a logical circuit with linear classifier inputs that
computes the same label as the DNN. This circuit does not structurally resemble
the network architecture, and it may require many fewer parameters, depending
on the configuration of weights. In these cases, we obtain simultaneously an
interpretation and generalization bound (for the original DNN), connecting two
fronts which have historically been investigated separately. Unlike compression
techniques, our representation is exact. We motivate the utility of this perspective
by studying DNNs in simple, controlled settings, where we obtain superior
generalization bounds despite using only combinatorial information (e.g. no
margin information). We demonstrate how to "open the black box" on the MNIST
dataset. We show that the learned, internal, logical computations correspond to
semantically meaningful (unlabeled) categories that allow DNN descriptions in
plain English. We improve the generalization of an already trained network by
interpreting, diagnosing, and replacing components of the logical circuit that is
the DNN.
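The decomposition described in the abstract can be illustrated on a toy one-hidden-layer ReLU network: each hidden unit's on/off state is a linear (True/False) atom, each activation region is an AND of those atoms, and the label is an OR over regions of (region AND region-specific linear classifier). A minimal sketch, with illustrative weights and shapes not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
# toy 1-hidden-layer ReLU net (weights are illustrative, not from the paper)
W1 = rng.normal(size=(2, 2)); b1 = rng.normal(size=2)
w2 = rng.normal(size=2);      b2 = rng.normal()

def net_label(x):
    h = np.maximum(W1 @ x + b1, 0.0)
    return (w2 @ h + b2) > 0

def circuit_label(x):
    # OR over activation patterns of (region AND linear classifier);
    # each hidden unit's on/off state is itself a linear (True/False) atom
    for s in [(a, b) for a in (0, 1) for b in (0, 1)]:
        s = np.array(s)
        pre = W1 @ x + b1
        in_region = np.all((pre > 0) == (s == 1))  # AND of linear atoms
        w_eff = (s * w2) @ W1                      # linear classifier valid in this region
        b_eff = (s * w2) @ b1 + b2
        if in_region and (w_eff @ x + b_eff > 0):
            return True
    return False

# within each activation region the net is exactly linear,
# so the circuit computes the same label everywhere
xs = rng.normal(size=(200, 2))
assert all(net_label(x) == circuit_label(x) for x in xs)
```

Because the network is exactly linear inside each activation region, the circuit reproduces the DNN's label on every input rather than approximating it.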
Related papers
- Logic interpretations of ANN partition cells [0.0]
Consider a binary classification problem solved using a feed-forward artificial neural network (ANN).
Let the ANN be composed of a ReLU layer and several linear layers (convolution, sum-pooling, or fully connected).
We construct a bridge between a simple ANN and logic. As a result, we can analyze and manipulate the semantics of an ANN using the powerful tool set of logic.
arXiv Detail & Related papers (2024-08-26T14:43:43Z) - What Can Be Learnt With Wide Convolutional Neural Networks? [69.55323565255631]
We study infinitely-wide deep CNNs in the kernel regime.
We prove that deep CNNs adapt to the spatial scale of the target function.
We conclude by computing the generalisation error of a deep CNN trained on the output of another deep CNN.
arXiv Detail & Related papers (2022-08-01T17:19:32Z) - Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z) - Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distances between a test image and its top-k nearest neighbors in a training set.
We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps.
Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
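The two-step recipe above can be sketched in plain NumPy; the "pre-trained" features below are random stand-ins for real backbone embeddings, and all names and dimensions are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# stand-ins for frozen features produced by a pre-trained backbone
train_feats = rng.normal(size=(100, 16))
train_labels = rng.integers(0, 5, size=100)

def knn_predict(query, k=5):
    # step 1: the query is assumed to be embedded by the same frozen backbone
    # step 2: majority vote among the k nearest training features
    d = np.linalg.norm(train_feats - query, axis=1)
    top = np.argsort(d)[:k]
    return np.bincount(train_labels[top]).argmax()
```

No gradient updates are needed at test time, which is what makes the method "lazy": all work is deferred to the neighbor lookup.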
arXiv Detail & Related papers (2021-12-15T20:15:01Z) - Disentangling deep neural networks with rectified linear units using duality [4.683806391173103]
We propose a novel interpretable counterpart of deep neural networks (DNNs) with rectified linear units (ReLUs).
We show that convolution with global pooling and skip connection provide rotational invariance and ensemble structure, respectively, to the neural path kernel (NPK).
arXiv Detail & Related papers (2021-10-06T16:51:59Z) - Generalizing Neural Networks by Reflecting Deviating Data in Production [15.498447555957773]
We present a runtime approach that mitigates DNN mis-predictions caused by unexpected runtime inputs to the DNN.
We use a distribution analyzer based on the distance metric learned by a Siamese network to identify "unseen" semantically-preserving inputs.
Our approach transforms those unexpected inputs into inputs from the training set that are identified as having similar semantics.
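A minimal sketch of the guard described above, with random arrays standing in for the Siamese-network embeddings and the threshold `tau` chosen arbitrarily (none of these values come from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)
# stand-ins for embeddings learned by a Siamese network, plus the raw inputs
train_emb = rng.normal(size=(50, 8))
train_inputs = rng.normal(size=(50, 8))

def guard(x_emb, x, tau=3.0):
    # distance to the training set under the learned metric
    d = np.linalg.norm(train_emb - x_emb, axis=1)
    if d.min() > tau:                    # flagged as "unseen"
        return train_inputs[d.argmin()]  # replace with the semantically closest training input
    return x                             # in-distribution: pass through unchanged
```

At runtime the guard sits in front of the DNN, so the network only ever sees inputs close to its training distribution.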
arXiv Detail & Related papers (2021-10-06T13:05:45Z) - Rule Extraction from Binary Neural Networks with Convolutional Rules for Model Validation [16.956140135868733]
We introduce the concept of first-order convolutional rules, which are logical rules that can be extracted using a convolutional neural network (CNN).
Our approach is based on rule extraction from binary neural networks with local search.
Our experiments show that the proposed approach is able to model the functionality of the neural network while at the same time producing interpretable logical rules.
arXiv Detail & Related papers (2020-12-15T17:55:53Z) - A Temporal Neural Network Architecture for Online Learning [0.6091702876917281]
Temporal neural networks (TNNs) communicate and process information encoded as relative spike times.
A TNN architecture is proposed and, as a proof-of-concept, TNN operation is demonstrated within the larger context of online supervised classification.
arXiv Detail & Related papers (2020-11-27T17:15:29Z) - Interpreting Graph Neural Networks for NLP With Differentiable Edge Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
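The idea of identifying droppable edges can be sketched with a toy one-layer mean-aggregation GNN; here a crude leave-one-edge-out sensitivity check stands in for the paper's learned differentiable mask, and the graph, weights, and threshold are all illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
# toy 4-node graph with self-loops (adjacency, features, and weights are illustrative)
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]], float)
X = rng.normal(size=(4, 3))   # node features
W = rng.normal(size=(3, 2))   # one message-passing layer

def gnn(adj):
    deg = adj.sum(1, keepdims=True)
    return (adj / deg) @ X @ W  # mean aggregation + linear transform

base = gnn(A)
# post hoc: drop each edge and keep it only if the output changes noticeably
keep = A.copy()
for i in range(4):
    for j in range(4):
        if i != j and A[i, j]:
            A2 = A.copy(); A2[i, j] = 0
            if np.abs(gnn(A2) - base).max() < 0.05:
                keep[i, j] = 0  # edge is unnecessary for this prediction
```

The paper's method learns the mask end-to-end with a differentiable relaxation instead of this exhaustive ablation, but the goal is the same: a sparse edge set that preserves the model's output.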
arXiv Detail & Related papers (2020-10-01T17:51:19Z) - Architecture Disentanglement for Deep Neural Networks [174.16176919145377]
We introduce neural architecture disentanglement (NAD) to explain the inner workings of deep neural networks (DNNs).
NAD learns to disentangle a pre-trained DNN into sub-architectures according to independent tasks, forming information flows that describe the inference processes.
Results show that misclassified images have a high probability of being assigned to task sub-architectures similar to the correct ones.
arXiv Detail & Related papers (2020-03-30T08:34:33Z) - Approximation and Non-parametric Estimation of ResNet-type Convolutional Neural Networks [52.972605601174955]
We show a ResNet-type CNN can attain the minimax optimal error rates in important function classes.
We derive approximation and estimation error rates of the aforementioned type of CNNs for the Barron and Hölder classes.
arXiv Detail & Related papers (2019-03-24T19:42:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.