Interpreting Deep Neural Networks with Relative Sectional Propagation by
Analyzing Comparative Gradients and Hostile Activations
- URL: http://arxiv.org/abs/2012.03434v2
- Date: Sat, 12 Dec 2020 10:49:00 GMT
- Title: Interpreting Deep Neural Networks with Relative Sectional Propagation by
Analyzing Comparative Gradients and Hostile Activations
- Authors: Woo-Jeoung Nam, Jaesik Choi, Seong-Whan Lee
- Abstract summary: We propose a new attribution method, Relative Sectional Propagation (RSP), for decomposing the output predictions of Deep Neural Networks (DNNs).
We define the hostile factor as an element that interferes with finding the attributions of the target, and we propagate it in a distinguishable way to overcome the non-suppressed nature of activated neurons.
Our method decomposes the predictions of DNNs with clearer class-discriminativeness and more detailed elucidation of activated neurons than conventional attribution methods.
- Score: 37.11665902583138
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The transparency of Deep Neural Networks (DNNs) is hampered by complex
internal structures and nonlinear transformations along deep hierarchies. In
this paper, we propose a new attribution method, Relative Sectional Propagation
(RSP), for fully decomposing the output predictions with the characteristics of
class-discriminative attributions and clear objectness. We carefully revisit
the shortcomings of backpropagation-based attribution methods, namely the
trade-offs involved in decomposing DNNs. We define the hostile factor as an
element that interferes with finding the attributions of the target and
propagate it in a distinguishable way to overcome the non-suppressed nature
of activated neurons. As a result, it is possible to assign bi-polar relevance
scores to the target (positive) and hostile (negative) attributions while
keeping each attribution aligned with its importance. We also present a
purging technique that keeps the gap between the relevance scores of target
and hostile attributions from shrinking during backward propagation by
eliminating units that conflict with the channel attribution map.
Consequently, our method decomposes the predictions of DNNs with clearer
class-discriminativeness and more detailed elucidation of activated neurons
than conventional attribution methods. In a verified experimental setup, we
report results for three assessments, (i) the Pointing Game, (ii) mIoU, and
(iii) Model Sensitivity, on the PASCAL VOC 2007, MS COCO 2014, and ImageNet
datasets. The results demonstrate that our method outperforms existing
backward decomposition methods and yields distinctive, intuitive
visualizations.
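To make the ideas above more concrete, the sketch below shows (i) a bi-polar relevance split for a single linear layer, in the spirit of assigning positive (target) and negative (hostile) relevance, and (ii) a Pointing Game hit check. This is an illustrative sketch only, not the authors' RSP implementation; the single-layer setting, the epsilon stabilizer, and the function names are assumptions.

```python
# Illustrative sketch only -- not the authors' RSP code.
import numpy as np

def bipolar_linear_relevance(a, W, R_out, eps=1e-9):
    """Propagate relevance R_out back through a linear layer y = a @ W,
    keeping positive (target) and negative (hostile) contributions
    distinguishable, in the spirit of bi-polar decomposition."""
    z = a[:, None] * W                       # contribution of input j to output k
    z_pos = np.clip(z, 0, None)              # target-aligned contributions
    z_neg = np.clip(z, None, 0)              # hostile contributions
    # Redistribute the positive part of R_out over positive contributions
    # and the negative part over negative contributions.
    R_pos = (z_pos / (z_pos.sum(axis=0) + eps)) @ np.clip(R_out, 0, None)
    R_neg = (z_neg / (z_neg.sum(axis=0) - eps)) @ np.clip(R_out, None, 0)
    return R_pos + R_neg                     # bi-polar relevance per input unit

def pointing_game_hit(attribution_map, gt_mask):
    """Pointing Game: does the most relevant pixel fall inside the
    ground-truth object mask?"""
    y, x = np.unravel_index(np.argmax(attribution_map), attribution_map.shape)
    return bool(gt_mask[y, x])
```

In backward decomposition methods of this kind, such a layer-wise rule is applied through the whole network to obtain a per-pixel attribution map, and the Pointing Game score is the fraction of images for which the hit check above succeeds.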
Related papers
- Learning local discrete features in explainable-by-design convolutional neural networks [0.0]
We introduce an explainable-by-design convolutional neural network (CNN) based on the lateral inhibition mechanism.
The model consists of a predictor, a high-accuracy CNN with residual or dense skip connections.
By collecting observations and directly calculating probabilities, we can explain causal relationships between motifs of adjacent levels.
arXiv Detail & Related papers (2024-10-31T18:39:41Z) - Neural Networks Decoded: Targeted and Robust Analysis of Neural Network Decisions via Causal Explanations and Reasoning [9.947555560412397]
We introduce TRACER, a novel method grounded in causal inference theory to estimate the causal dynamics underpinning DNN decisions.
Our approach systematically intervenes on input features to observe how specific changes propagate through the network, affecting internal activations and final outputs.
TRACER further enhances explainability by generating counterfactuals that reveal possible model biases and offer contrastive explanations for misclassifications (a rough occlusion-style sketch of the intervention idea appears after this list).
arXiv Detail & Related papers (2024-10-07T20:44:53Z) - IENE: Identifying and Extrapolating the Node Environment for Out-of-Distribution Generalization on Graphs [10.087216264788097]
We propose IENE, an OOD generalization method on graphs based on node-level environmental identification and extrapolation techniques.
It strengthens the model's ability to extract invariance from two granularities simultaneously, leading to improved generalization.
arXiv Detail & Related papers (2024-06-02T14:43:56Z) - Respect the model: Fine-grained and Robust Explanation with Sharing
Ratio Decomposition [29.491712784788188]
We propose a novel eXplainable AI (XAI) method called SRD (Sharing Ratio Decomposition), which faithfully reflects the model's inference process.
We also introduce an interesting observation termed Activation-Pattern-Only Prediction (APOP), which lets us emphasize the importance of inactive neurons.
arXiv Detail & Related papers (2024-01-25T07:20:23Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Interpolation-based Correlation Reduction Network for Semi-Supervised
Graph Learning [49.94816548023729]
We propose a novel graph contrastive learning method, termed Interpolation-based Correlation Reduction Network (ICRN)
In our method, we improve the discriminative capability of the latent feature by enlarging the margin of decision boundaries.
By combining the two settings, we extract rich supervision information from both the abundant unlabeled nodes and the rare yet valuable labeled nodes for discriminative representation learning.
arXiv Detail & Related papers (2022-06-06T14:26:34Z) - Illuminating Salient Contributions in Neuron Activation with Attribution Equilibrium [33.55397868171977]
We introduce Attribution Equilibrium, a novel method to decompose output predictions into fine-grained attributions.
We analyze conventional approaches to decision explanation and present a different perspective on the conservation of evidence.
arXiv Detail & Related papers (2022-05-23T07:57:42Z) - Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z) - And/or trade-off in artificial neurons: impact on adversarial robustness [91.3755431537592]
The presence of a sufficient number of OR-like neurons in a network can lead to classification brittleness and increased vulnerability to adversarial attacks.
We define AND-like neurons and propose measures to increase their proportion in the network.
Experimental results on the MNIST dataset suggest that our approach holds promise as a direction for further exploration.
arXiv Detail & Related papers (2021-02-15T08:19:05Z) - Proactive Pseudo-Intervention: Causally Informed Contrastive Learning
For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
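As a rough illustration of the input-intervention idea described in the TRACER entry above, the sketch below occludes image patches and records how the target-class score changes. It is not the TRACER implementation; the `model` callable, the patch size, and the zero baseline are assumptions made for the sketch.

```python
# Illustrative occlusion-style intervention -- not the TRACER implementation.
import numpy as np

def occlusion_effect(model, image, target_class, patch=16, baseline=0.0):
    """Replace one patch at a time with a baseline value and record how
    much the target-class score drops, as a crude map of which input
    regions the prediction depends on."""
    h, w = image.shape[:2]
    base_score = model(image[None])[0, target_class]
    effect = np.zeros((h // patch, w // patch))
    for i in range(0, h - patch + 1, patch):
        for j in range(0, w - patch + 1, patch):
            perturbed = image.copy()
            perturbed[i:i + patch, j:j + patch] = baseline
            score = model(perturbed[None])[0, target_class]
            effect[i // patch, j // patch] = base_score - score
    return effect  # large values: regions the prediction depends on
```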