CGXplain: Rule-Based Deep Neural Network Explanations Using Dual Linear
Programs
- URL: http://arxiv.org/abs/2304.05207v1
- Date: Tue, 11 Apr 2023 13:16:26 GMT
- Title: CGXplain: Rule-Based Deep Neural Network Explanations Using Dual Linear
Programs
- Authors: Konstantin Hemker, Zohreh Shams, Mateja Jamnik
- Abstract summary: Rule-based surrogate models are an effective way to approximate a Deep Neural Network's (DNN) decision boundaries.
This paper introduces the CGX (Column Generation eXplainer) to address these limitations.
- Score: 4.632241550169363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Rule-based surrogate models are an effective and interpretable way to
approximate a Deep Neural Network's (DNN) decision boundaries, allowing humans
to easily understand deep learning models. Current state-of-the-art
decompositional methods, which are those that consider the DNN's latent space
to extract more exact rule sets, manage to derive rule sets at high accuracy.
However, they a) do not guarantee that the surrogate model has learned from the
same variables as the DNN (alignment), b) only allow optimising for a single
objective, such as accuracy, which can result in excessively large rule sets
(complexity), and c) use decision tree algorithms as intermediate models, which
can result in different explanations for the same DNN (stability). This paper
introduces the CGX (Column Generation eXplainer) to address these limitations -
a decompositional method using dual linear programming to extract rules from
the hidden representations of the DNN. This approach allows optimising for any
number of objectives and empowers users to tweak the explanation model to their
needs. We evaluate our results on a wide variety of tasks and show that CGX
meets all three criteria, by having exact reproducibility of the explanation
model that guarantees stability and reduces the rule set size by >80%
(complexity) at equivalent or improved accuracy and fidelity across tasks
(alignment).
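As a rough illustration of the LP machinery that column generation builds on, the sketch below solves a restricted master LP selecting rules from a fixed candidate pool, trading coverage of positive samples against rule complexity and false positives. The rule pool, the lambda weight, and the rounding step are hypothetical simplifications; the paper's pricing step, which uses the LP duals to generate new rule columns, is omitted.

```python
# Minimal sketch of a restricted master LP for rule selection, in the
# spirit of column-generation rule learning (candidate pool fixed and
# hypothetical; the real method iterates a dual-driven pricing step).
import numpy as np
from scipy.optimize import linprog

def select_rules(A_pos, fp_counts, complexities, lam=1.0):
    """A_pos[i, r] = 1 if candidate rule r covers positive sample i.
    LP relaxation: cover every positive (or pay a unit slack xi_i)
    while penalising rule complexity and false positives."""
    n_pos, n_rules = A_pos.shape
    # objective over [w_1..w_R, xi_1..xi_P]
    c = np.concatenate([complexities + lam * fp_counts, np.ones(n_pos)])
    # coverage: A_pos @ w + xi >= 1  ->  -(A_pos @ w + xi) <= -1
    A_ub = -np.hstack([A_pos, np.eye(n_pos)])
    b_ub = -np.ones(n_pos)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1), method="highs")
    # res.ineqlin.marginals holds the duals a pricing step would use
    return np.where(res.x[:n_rules] > 0.5)[0]  # rounded rule selection
```

Because the selection is the solution of a convex program rather than a greedy tree induction, re-running it on the same inputs returns the same rule set, which is the stability property the abstract emphasises.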
Related papers
- Efficient Link Prediction via GNN Layers Induced by Negative Sampling [92.05291395292537]
Graph neural networks (GNNs) for link prediction can loosely be divided into two broad categories.
First, node-wise architectures pre-compute individual embeddings for each node that are later combined by a simple decoder to make predictions.
Second, edge-wise methods rely on the formation of edge-specific subgraph embeddings to enrich the representation of pair-wise relationships.
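A minimal sketch of the two categories, with toy numpy stand-ins in place of real GNN encoders (both scoring functions below are hypothetical):

```python
# Hedged sketch: node-wise vs. edge-wise link prediction scoring.
import numpy as np

def node_wise_score(z, u, v):
    """Node-wise: embeddings z are precomputed once per node; a simple
    decoder (here a dot product) combines them for each edge query."""
    return z[u] @ z[v]

def edge_wise_score(features, adj, u, v):
    """Edge-wise: build a pair-specific representation from the joint
    neighbourhood of (u, v), so the encoder sees the candidate edge.
    adj is a boolean adjacency matrix."""
    neigh = np.where(adj[u] | adj[v])[0]           # 1-hop joint subgraph
    sub_repr = features[neigh].mean(axis=0)        # crude subgraph pooling
    return sub_repr @ (features[u] + features[v])  # hypothetical decoder
```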
arXiv Detail & Related papers (2023-10-14T07:02:54Z)
- A multiobjective continuation method to compute the regularization path of deep neural networks [1.3654846342364308]
Sparsity is a highly desirable feature in deep neural networks (DNNs), since it ensures numerical efficiency, improves the interpretability of models, and increases robustness.
We present an algorithm that traces the entire sparsity front for the above-mentioned objectives in a very efficient manner for high-dimensional networks with millions of parameters.
We demonstrate that knowledge of the regularization path allows for a well-generalizing network parametrization.
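A naive stand-in for tracing such a regularization path: sweep an L1 penalty with scikit-learn and record the sparsity/accuracy trade-off at each point (the paper's continuation method computes this front far more efficiently; the model and grid below are assumptions):

```python
# Hedged sketch: a brute-force regularization path via an L1 sweep.
import numpy as np
from sklearn.linear_model import LogisticRegression

def regularization_path(X, y, strengths=np.logspace(-3, 2, 20)):
    path = []
    for C in strengths:                      # C is the inverse L1 strength
        clf = LogisticRegression(penalty="l1", C=C, solver="liblinear")
        clf.fit(X, y)
        sparsity = float(np.mean(clf.coef_ == 0))  # zeroed-weight fraction
        path.append((C, sparsity, clf.score(X, y)))
    return path                # one (C, sparsity, accuracy) point per step
```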
arXiv Detail & Related papers (2023-08-23T10:08:52Z)
- Deep Model Reassembly [60.6531819328247]
We explore a novel knowledge-transfer task, termed as Deep Model Reassembly (DeRy)
The goal of DeRy is to first dissect each model into distinctive building blocks, and then selectively reassemble the derived blocks to produce customized networks.
We demonstrate that on ImageNet, the best reassembled model achieves 78.6% top-1 accuracy without fine-tuning.
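A toy sketch of the reassembly idea, with hand-picked blocks and a 1x1-conv adapter standing in for DeRy's compatibility search (all module shapes below are hypothetical):

```python
# Hedged sketch: dissect two nets into blocks, stitch a new one together.
import torch.nn as nn

net_a = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
net_b = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(64, 32, 3, padding=1), nn.ReLU())

block_a = net_a[:2]              # early block dissected from model A
block_b = net_b[2:]              # late block dissected from model B
adapter = nn.Conv2d(16, 64, 1)   # 1x1 conv to reconcile channel widths
reassembled = nn.Sequential(block_a, adapter, block_b)
```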
arXiv Detail & Related papers (2022-10-24T10:16:13Z)
- RNN Transducers for Nested Named Entity Recognition with constraints on alignment for long sequences [4.545971444299925]
We introduce a new model for NER tasks -- a recurrent neural network transducer (RNN-T).
RNN-T models learn the alignment using a loss function that sums over all alignments.
In NER tasks, however, the alignment between words and target labels is available from annotations.
We demonstrate that our fixed alignment model outperforms the standard RNN-T model.
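A hedged sketch of the contrast: instead of marginalising over all monotonic alignments as the standard transducer loss does, a fixed-alignment loss scores only the annotated path (blank transitions and all shapes below are simplifying assumptions):

```python
# Hedged sketch: negative log-likelihood of one fixed alignment path.
import torch

def fixed_alignment_loss(log_probs, alignment, labels):
    """log_probs: (T, U, V) joint-network log-probabilities; alignment[u]
    is the frame each target label is pinned to; labels: (U,) target ids.
    Blank transitions are omitted for brevity."""
    nll = torch.zeros(())
    for u, (t, y) in enumerate(zip(alignment, labels)):
        nll = nll - log_probs[t, u, y]   # score only the annotated path
    return nll
```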
arXiv Detail & Related papers (2022-02-08T05:38:20Z)
- Efficient Decompositional Rule Extraction for Deep Neural Networks [5.69361786082969]
ECLAIRE is a novel polynomial-time rule extraction algorithm capable of scaling to both large DNN architectures and large training datasets.
We show that ECLAIRE consistently extracts more accurate and comprehensible rule sets than the current state-of-the-art methods.
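A minimal sketch of the decompositional step such methods share: fit an interpretable learner on a hidden layer's activations against the DNN's own predictions (a scikit-learn decision tree is used here as a stand-in; ECLAIRE's actual intermediate models and clause-wise substitution differ):

```python
# Hedged sketch: extract IF-THEN rules from one hidden layer's activations.
from sklearn.tree import DecisionTreeClassifier, export_text

def layer_rules(hidden_acts, dnn_preds, max_depth=3):
    tree = DecisionTreeClassifier(max_depth=max_depth)
    tree.fit(hidden_acts, dnn_preds)  # target the DNN's own labels (fidelity)
    return export_text(tree)          # root-to-leaf paths read as rules
```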
arXiv Detail & Related papers (2021-11-24T16:54:10Z)
- Scalable Rule-Based Representation Learning for Interpretable Classification [12.736847587988853]
The Rule-based Representation Learner (RRL) learns interpretable non-fuzzy rules for data representation and classification.
RRL can be easily adjusted to obtain a trade-off between classification accuracy and model complexity for different scenarios.
arXiv Detail & Related papers (2021-09-30T13:07:42Z)
- Modeling the Second Player in Distributionally Robust Optimization [90.25995710696425]
We argue for the use of neural generative models to characterize the worst-case distribution.
This approach poses a number of implementation and optimization challenges.
We find that the proposed approach yields models that are more robust than comparable baselines.
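A hedged toy sketch of the two-player setup: a generator proposes worst-case inputs, the classifier trains against them, and a penalty keeps the adversary near the data (both architectures and the penalty weight are assumptions):

```python
# Hedged sketch: one alternating step of generator-vs-model DRO training.
import torch.nn.functional as F

def dro_step(model, generator, x, y, opt_model, opt_gen, penalty=1.0):
    # adversary ascends: make the batch hard while staying near the data
    x_adv = generator(x)
    gen_loss = (-F.cross_entropy(model(x_adv), y)
                + penalty * (x_adv - x).pow(2).mean())
    opt_gen.zero_grad(); gen_loss.backward(); opt_gen.step()
    # model descends on the (frozen) adversarial batch
    loss = F.cross_entropy(model(generator(x).detach()), y)
    opt_model.zero_grad(); loss.backward(); opt_model.step()
    return loss.item()
```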
arXiv Detail & Related papers (2021-03-18T14:26:26Z)
- Better Short than Greedy: Interpretable Models through Optimal Rule Boosting [10.938624307941197]
Rule ensembles are designed to provide a useful trade-off between predictive accuracy and model interpretability.
We present a novel approach aiming to fit rule ensembles of maximal predictive power for a given ensemble size.
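A brute-force sketch of the idea: for a fixed ensemble size, search for the jointly best rule subset instead of adding rules greedily (the paper does this with an efficient optimal boosting procedure; exhaustive search is only feasible for toy rule pools):

```python
# Hedged sketch: jointly optimal size-k rule ensemble by exhaustive search.
from itertools import combinations
import numpy as np

def best_ensemble(rule_preds, y, size=3):
    """rule_preds: (n_rules, n_samples) per-rule votes in {-1, +1};
    an odd `size` avoids tied majority votes."""
    best, best_acc = None, -1.0
    for subset in combinations(range(len(rule_preds)), size):
        votes = np.sign(rule_preds[list(subset)].sum(axis=0))
        acc = float(np.mean(votes == y))
        if acc > best_acc:
            best, best_acc = subset, acc
    return best, best_acc
```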
arXiv Detail & Related papers (2021-01-21T01:03:48Z)
- Chance-Constrained Control with Lexicographic Deep Reinforcement Learning [77.34726150561087]
This paper proposes a lexicographic Deep Reinforcement Learning (DeepRL)-based approach to chance-constrained Markov Decision Processes.
A lexicographic version of the well-known DeepRL algorithm DQN is also proposed and validated via simulations.
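A minimal sketch of lexicographic action selection with two hypothetical Q-heads, one estimating return and one estimating constraint violation: feasibility is enforced first, return is maximised second:

```python
# Hedged sketch: lexicographic greedy action over two Q-value heads.
import numpy as np

def lexicographic_action(q_reward, q_violation, threshold, tol=1e-6):
    feasible = np.where(q_violation <= threshold + tol)[0]
    if feasible.size == 0:                 # no safe action: minimise risk
        return int(np.argmin(q_violation))
    return int(feasible[np.argmax(q_reward[feasible])])
```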
arXiv Detail & Related papers (2020-10-19T13:09:14Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
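A hedged sketch of the GLM predictive using torch.func: predict with the network linearised at the MAP parameters rather than the original net, i.e. f(x; theta*) + J_{theta*}(x)(theta - theta*) for a posterior sample theta (the parameter-dict handling below is an assumption):

```python
# Hedged sketch: linearised ("GLM") prediction for a posterior sample.
import torch
from torch.func import functional_call, jvp

def glm_predict(model, theta_star, theta_sample, x):
    names = list(theta_star)                      # MAP parameter dict keys
    def f(*params):
        return functional_call(model, dict(zip(names, params)), (x,))
    delta = tuple(theta_sample[n] - theta_star[n] for n in names)
    out, jtv = jvp(f, tuple(theta_star.values()), delta)
    return out + jtv    # f(x; theta*) + J(x) (theta - theta*)
```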
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Multi-Objective Matrix Normalization for Fine-grained Visual Recognition [153.49014114484424]
Bilinear pooling achieves great success in fine-grained visual recognition (FGVC).
Recent methods have shown that the matrix power normalization can stabilize the second-order information in bilinear features.
We propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation in terms of square-root, low-rank, and sparsity.
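A sketch of the square-root component of such normalization, applied to a bilinear (second-order) feature matrix via Newton-Schulz iteration (MOMN jointly balances further objectives such as low-rank and sparsity, omitted here):

```python
# Hedged sketch: bilinear pooling + matrix square-root normalisation.
import torch

def bilinear_sqrt_pool(feats, iters=5):
    """feats: (C, HW) conv activations -> normalised (C, C) matrix."""
    A = feats @ feats.t() / feats.shape[1]  # second-order (bilinear) pooling
    norm = A.norm()                         # pre-scaling aids convergence
    Y, Z = A / norm, torch.eye(A.shape[0])
    for _ in range(iters):                  # Newton-Schulz iteration -> A^{1/2}
        T = 0.5 * (3 * torch.eye(A.shape[0]) - Z @ Y)
        Y, Z = Y @ T, T @ Z
    return Y * norm.sqrt()
```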
arXiv Detail & Related papers (2020-03-30T08:40:35Z)