Boundary Attributions Provide Normal (Vector) Explanations
- URL: http://arxiv.org/abs/2103.11257v2
- Date: Tue, 23 Mar 2021 16:06:51 GMT
- Title: Boundary Attributions Provide Normal (Vector) Explanations
- Authors: Zifan Wang, Matt Fredrikson, Anupam Datta
- Abstract summary: Boundary Attribution (BA) is a new explanation method that attributes a model's classification decision to input features.
BA involves computing normal vectors of the local decision boundaries for the target input.
We prove two theorems for ReLU networks: boundary attributions of randomized-smoothed or robustly trained networks are much closer to non-boundary attributions than those of standard networks.
- Score: 27.20904776964045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work on explaining Deep Neural Networks (DNNs) focuses on attributing
the model's output scores to input features. However, when it comes to
classification problems, a more fundamental question is how much each
feature contributes to the model's decision to classify an input instance into
a specific class. Our first contribution is Boundary Attribution (BA), a new
explanation method to address this question. BA leverages an understanding of
the geometry of activation regions. Specifically, it involves computing (and
aggregating) normal vectors of the local decision boundaries for the target
input. Our second contribution is a set of analytical results connecting the
adversarial robustness of the network and the quality of gradient-based
explanations. Specifically, we prove two theorems for ReLU networks: boundary
attributions of randomized-smoothed or robustly trained networks are much
closer to non-boundary attributions than those of standard networks. These
results encourage users to improve model robustness to obtain high-quality
explanations. Finally, we evaluate the proposed methods on ImageNet and show
that BAs produce more concentrated and sharper visualizations compared with
non-boundary ones. We further demonstrate that our method also helps to reduce
the sensitivity of attributions to the baseline input if one is required.
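As a concrete illustration of the boundary-attribution idea, the sketch below locates a nearby point on the local decision boundary of a PyTorch classifier and takes the gradient of the logit margin there as the normal vector. The boundary search (a DeepFool-style overshoot followed by binary search) and all names are illustrative assumptions, not the paper's exact procedure.

```python
import torch

def boundary_normal(model, x, label, runner_up, steps=25):
    """Sketch: unit normal of the local decision boundary near x.

    Assumes `model` maps a batch of inputs to logits and x has batch size 1.
    The boundary point is found by binary search between x and a crossing
    point x_adv; if the crude first step fails to cross, the search degrades.
    """
    # DeepFool-style linearized step, overshot by 2x to cross the boundary.
    x0 = x.clone().requires_grad_(True)
    margin = model(x0)[0, label] - model(x0)[0, runner_up]
    margin.backward()
    x_adv = x - 2.0 * margin.detach() * x0.grad / (x0.grad.norm() ** 2 + 1e-12)

    # Binary search for the point where the two logits tie (the boundary).
    lo, hi = torch.zeros(1), torch.ones(1)
    for _ in range(steps):
        mid = (lo + hi) / 2
        with torch.no_grad():
            logits = model((1 - mid) * x + mid * x_adv)
        if logits[0, label] >= logits[0, runner_up]:
            lo = mid          # still on the original-class side
        else:
            hi = mid          # crossed the boundary
    xb = ((1 - lo) * x + lo * x_adv).detach().requires_grad_(True)

    # Normal vector: gradient of the logit margin at the boundary point.
    margin = model(xb)[0, label] - model(xb)[0, runner_up]
    margin.backward()
    return xb.grad / (xb.grad.norm() + 1e-12)
```

Aggregating such normals over several nearby boundary segments, e.g. over the top-k runner-up classes, would correspond to the aggregation step the abstract mentions.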
Related papers
- Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps and simultaneously increases model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to adaptively reweight the training samples during learning.
Our framework then promotes model learning by paying closer attention to those training samples with large differences in their explanations, as sketched below.
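A hedged sketch of that reweighting idea, assuming input-gradient saliency and cosine similarity as the (hypothetical) consistency measure; the paper's actual metric and schedule may differ.

```python
import torch
import torch.nn.functional as F

def saliency(model, x, y):
    """Input-gradient saliency for one batch (a common, simple choice)."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return x.grad.flatten(1)

def consistency_weights(model, prev_sal, x, y, eps=1e-8):
    """Weight samples by how much their explanation changed since the last
    epoch: low consistency -> high weight (illustrative rule)."""
    cur = saliency(model, x, y)
    cos = F.cosine_similarity(cur, prev_sal, dim=1)
    w = 1.0 - cos.clamp(-1, 1)            # inconsistent samples weigh more
    return w / (w.mean() + eps), cur      # normalize so mean weight is ~1

def reweighted_step(model, opt, prev_sal, x, y):
    w, cur = consistency_weights(model, prev_sal, x, y)
    opt.zero_grad()                        # clear grads from the saliency pass
    per_sample = F.cross_entropy(model(x), y, reduction="none")
    (w.detach() * per_sample).mean().backward()
    opt.step()
    return cur                             # becomes prev_sal next time
```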
arXiv Detail & Related papers (2024-08-08T17:20:08Z)
- On Discrepancies between Perturbation Evaluations of Graph Neural Network Attributions [49.8110352174327]
We assess attribution methods from a perspective not previously explored in the graph domain: retraining.
The core idea is to retrain the network on important (or not important) relationships as identified by the attributions.
We run our analysis on four state-of-the-art GNN attribution methods and five synthetic and real-world graph classification datasets.
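A minimal sketch of a retraining-based assessment in this spirit, on generic feature matrices rather than graphs; `train_fn` and `eval_fn` are user-supplied stand-ins, and the ablation rule is an assumption.

```python
import numpy as np

def retrain_score(train_fn, eval_fn, X, y, attributions, frac=0.2, keep=False):
    """Remove (or keep only) the most-attributed features, retrain from
    scratch, and report accuracy. A large drop when removing top features
    suggests the attributions found genuinely important relationships."""
    k = int(frac * X.shape[1])
    order = np.argsort(-np.abs(attributions).mean(axis=0))  # global ranking
    top = order[:k]
    X_mod = X.copy()
    if keep:
        rest = np.ones(X.shape[1], dtype=bool)
        rest[top] = False
        X_mod[:, rest] = X[:, rest].mean(axis=0)   # ablate everything else
    else:
        X_mod[:, top] = X[:, top].mean(axis=0)     # ablate the top features
    model = train_fn(X_mod, y)                     # retrain from scratch
    return eval_fn(model, X_mod, y)
```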
arXiv Detail & Related papers (2024-01-01T02:03:35Z)
- Towards Rigorous Understanding of Neural Networks via Semantics-preserving Transformations [0.0]
We present an approach to the precise and global verification and explanation of Rectifier Neural Networks.
Key to our approach is the symbolic execution of these networks that allows the construction of semantically equivalent Typed Affine Decision Structures.
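On a toy scale, the piecewise-affine view behind this symbolic execution can be made explicit: every activation pattern of a one-hidden-layer ReLU network fixes an affine map. The sketch below enumerates these pieces; it is not the paper's Typed Affine Decision Structures, only the underlying construction.

```python
import itertools
import numpy as np

def affine_pieces(W1, b1, W2, b2):
    """Enumerate the affine function (A, c) that a one-hidden-layer ReLU
    network computes under each activation pattern. A pattern s fixes
    relu(W1 x + b1) = diag(s)(W1 x + b1), hence an affine map overall; it
    is valid on the polyhedron {x : (2s - 1) * (W1 x + b1) >= 0}."""
    h = W1.shape[0]
    pieces = []
    for s in itertools.product([0, 1], repeat=h):
        D = np.diag(s)
        A = W2 @ D @ W1           # effective linear part on this region
        c = W2 @ D @ b1 + b2      # effective offset on this region
        pieces.append((np.array(s), A, c))
    return pieces
```

Practical verification tools prune patterns whose regions are empty; the exhaustive loop here is exponential in the hidden width and only demonstrates the construction.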
arXiv Detail & Related papers (2023-01-19T11:35:07Z)
- Disentangled Explanations of Neural Network Predictions by Finding Relevant Subspaces [14.70409833767752]
Explainable AI aims to overcome the black-box nature of complex ML models like neural networks by generating explanations for their predictions.
We propose two new analyses, extending principles found in PCA or ICA to explanations.
These novel analyses, which we call principal relevant component analysis (PRCA) and disentangled relevant subspace analysis (DRSA), maximize relevance instead of e.g. variance or kurtosis.
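A loosely hedged sketch of the PRCA idea as stated here: find directions in activation space that carry maximal relevance rather than maximal variance. The relevance-weighted second-moment matrix below is an illustrative assumption, not the paper's objective.

```python
import numpy as np

def prca_like(activations, relevances, k=2):
    """activations: (n, d) hidden activations; relevances: (n, d) per-neuron
    relevance scores (e.g., from LRP). Build a symmetric relevance-weighted
    second-moment matrix and take its top eigenvectors as 'principal
    relevant components' (illustrative construction)."""
    M = activations.T @ relevances + relevances.T @ activations
    M = M / (2 * len(activations))
    vals, vecs = np.linalg.eigh(M)
    return vecs[:, np.argsort(vals)[::-1][:k]]   # top-k directions
```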
arXiv Detail & Related papers (2022-12-30T18:04:25Z)
- Finite Sample Identification of Wide Shallow Neural Networks with Biases [12.622813055808411]
The identification of the parameters of the network from finite samples of input-output pairs is often referred to as the teacher-student model.
This paper fills the gap by providing constructive methods and theoretical guarantees of finite sample identification for such wider shallow networks with biases.
arXiv Detail & Related papers (2022-11-08T22:10:32Z)
- Reinforced Causal Explainer for Graph Neural Networks [112.57265240212001]
Explainability is crucial for probing graph neural networks (GNNs).
We propose a reinforcement learning agent, Reinforced Causal Explainer (RC-Explainer).
RC-Explainer generates faithful and concise explanations, and generalizes better to unseen graphs.
arXiv Detail & Related papers (2022-04-23T09:13:25Z)
- ADVISE: ADaptive Feature Relevance and VISual Explanations for Convolutional Neural Networks [0.745554610293091]
We introduce ADVISE, a new explainability method that quantifies and leverages the relevance of each unit of the feature map to provide better visual explanations.
We extensively evaluate our idea in the image classification task using AlexNet, VGG16, ResNet50, and Xception pretrained on ImageNet.
Our experiments further show that ADVISE fulfils the sensitivity and implementation independence axioms while passing the sanity checks.
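A Grad-CAM-style sketch of the general mechanism this summary describes, scoring each feature-map channel and weighting the visual explanation accordingly; the scoring rule is an assumption, not the ADVISE function.

```python
import torch
import torch.nn.functional as F

def unit_weighted_heatmap(feature_maps, logits, class_idx):
    """feature_maps: (1, C, H, W) activations from a conv layer, captured
    with a forward hook so they are part of the autograd graph; logits:
    the model output. Scores each channel by the gradient of the class
    logit w.r.t. that channel, then forms a relevance-weighted heatmap."""
    grads, = torch.autograd.grad(logits[0, class_idx], feature_maps)
    unit_relevance = grads.mean(dim=(2, 3))            # (1, C) channel scores
    weights = torch.softmax(unit_relevance, dim=1)     # adaptive reweighting
    heatmap = (weights[:, :, None, None] * feature_maps).sum(dim=1)
    return F.relu(heatmap)                             # keep positive evidence
```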
arXiv Detail & Related papers (2022-03-02T18:16:57Z)
- Self-Ensembling GAN for Cross-Domain Semantic Segmentation [107.27377745720243]
This paper proposes a self-ensembling generative adversarial network (SE-GAN) exploiting cross-domain data for semantic segmentation.
In SE-GAN, a teacher network and a student network constitute a self-ensembling model for generating semantic segmentation maps, which, together with a discriminator, forms a GAN.
Despite its simplicity, we find SE-GAN can significantly boost the performance of adversarial training and enhance the stability of the model.
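Self-ensembling teacher-student pairs are commonly maintained with an exponential moving average of the student's weights; a minimal sketch of that standard update follows (the specific rule used by SE-GAN is not confirmed here).

```python
import torch

@torch.no_grad()
def ema_update(teacher, student, decay=0.999):
    """Move each teacher parameter toward the corresponding student
    parameter; the teacher becomes a temporal ensemble of past students."""
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)
    for t, s in zip(teacher.buffers(), student.buffers()):
        t.copy_(s)   # e.g., BatchNorm running statistics
```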
arXiv Detail & Related papers (2021-12-15T09:50:25Z)
- Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks [83.58049517083138]
We consider a two-layer ReLU network trained via gradient descent.
We show that SGD is biased towards a simple solution.
We also provide empirical evidence that knots at locations distinct from the data points might occur.
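For scalar inputs, the knots mentioned here are the points where a hidden unit's pre-activation crosses zero, i.e. where the piecewise-linear network can change slope; a small sketch that locates them:

```python
import numpy as np

def relu_knots(w1, b1):
    """w1, b1: first-layer weights/biases of f(x) = w2 . relu(w1*x + b1) + b2
    for scalar x. Each hidden unit i contributes a knot at x = -b1[i]/w1[i],
    where the piecewise-linear function can change slope."""
    w1, b1 = np.asarray(w1, float), np.asarray(b1, float)
    active = np.abs(w1) > 1e-12          # units with w1 == 0 never bend
    return np.sort(-b1[active] / w1[active])
```

The empirical point in the summary is that, after training, these knots need not coincide with the data points.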
arXiv Detail & Related papers (2021-11-03T15:14:20Z)
- Cascade Network with Guided Loss and Hybrid Attention for Finding Good Correspondences [33.65360396430535]
Given a putative correspondence set of an image pair, we propose a neural network which finds correct correspondences by a binary-class classifier.
We propose a new Guided Loss that can directly use evaluation criterion (Fn-measure) as guidance to dynamically adjust the objective function.
We then propose a hybrid attention block to extract features, which integrates Bayesian context normalization (BACN) and channel-wise attention (CA).
arXiv Detail & Related papers (2021-01-31T08:33:20Z)
- Explainable Deep Classification Models for Domain Generalization [94.43131722655617]
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
Our training strategy enforces a periodic saliency-based feedback to encourage the model to focus on the image regions that directly correspond to the ground-truth object.
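A hedged sketch of such saliency-based feedback: periodically penalize input-gradient saliency mass that falls outside the ground-truth region. The penalty form, mask format, and weight `lam` are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def saliency_feedback_loss(model, x, y, mask, lam=0.1):
    """x: (B, C, H, W) images; mask: (B, 1, H, W) binary ground-truth
    object regions. Penalizes input-gradient saliency outside the mask so
    the model's evidence concentrates on the object (illustrative)."""
    x = x.clone().requires_grad_(True)
    ce = F.cross_entropy(model(x), y)
    grads, = torch.autograd.grad(ce, x, create_graph=True)  # double backprop
    sal = grads.abs().sum(dim=1, keepdim=True)              # (B, 1, H, W)
    outside = (sal * (1 - mask)).sum() / (sal.sum() + 1e-8)
    return ce + lam * outside

# Applied periodically, e.g. every k-th minibatch, on top of the usual loss.
```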
arXiv Detail & Related papers (2020-03-13T22:22:15Z)
This list was automatically generated from the titles and abstracts of the papers on this site.