LAP: An Attention-Based Module for Concept Based Self-Interpretation and
Knowledge Injection in Convolutional Neural Networks
- URL: http://arxiv.org/abs/2201.11808v5
- Date: Tue, 24 Oct 2023 09:42:53 GMT
- Title: LAP: An Attention-Based Module for Concept Based Self-Interpretation and
Knowledge Injection in Convolutional Neural Networks
- Authors: Rassa Ghavami Modegh, Ahmad Salimi, Alireza Dizaji, Hamid R. Rabiee
- Abstract summary: We propose a new attention-based pooling layer, called Local Attention Pooling (LAP), that accomplishes self-interpretability.
LAP is easily pluggable into any convolutional neural network, even the already trained ones.
LAP offers more valid human-understandable and faithful-to-the-model interpretations than the commonly used white-box explainer methods.
- Score: 2.8948274245812327
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Despite the state-of-the-art performance of deep convolutional neural
networks, they are susceptible to bias and may malfunction in unseen situations.
Moreover, the complex computation behind their reasoning is not transparent enough
for humans to develop trust. External explainer methods have tried to interpret
network decisions in a human-understandable way, but they are prone to fallacies
arising from their assumptions and simplifications. Inherently self-interpretable
models, on the other hand, are more robust to such fallacies, but their designs
cannot be retrofitted to already-trained models. In this work, we propose a new
attention-based pooling layer, called Local Attention Pooling (LAP), that achieves
self-interpretability and enables knowledge injection without performance loss.
The module is easily pluggable into any convolutional neural network, even
already-trained ones. We have defined a weakly supervised training scheme to learn
the distinguishing features behind decisions without depending on expert
annotations. We verified our claims by evaluating several LAP-extended models on
two datasets, including ImageNet. The proposed framework offers more valid,
human-understandable, and faithful-to-the-model interpretations than the commonly
used white-box explainer methods.
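To make the idea concrete, below is a minimal, hypothetical sketch of an attention-based pooling layer in the spirit of LAP, written in PyTorch. The scoring network, window size, normalisation, and the weakly supervised training loss are illustrative assumptions, not the authors' exact formulation; the sketch only shows how a learned local attention map can replace a standard pooling stage and be read back as an interpretation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalAttentionPool2d(nn.Module):
    """Hypothetical attention-weighted pooling; returns the attention map for inspection."""

    def __init__(self, channels: int, kernel_size: int = 2):
        super().__init__()
        self.kernel_size = kernel_size
        # 1x1 convolution producing one importance logit per spatial location (assumed design).
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, x: torch.Tensor):
        b, c, h, w = x.shape
        k = self.kernel_size
        assert h % k == 0 and w % k == 0, "feature map must be divisible by the window size"
        logits = self.score(x)                                  # (B, 1, H, W)
        # Softmax within each non-overlapping k x k window so its weights sum to 1.
        win = F.unfold(logits, kernel_size=k, stride=k)         # (B, k*k, L)
        weights = win.softmax(dim=1)
        feats = F.unfold(x, kernel_size=k, stride=k)            # (B, C*k*k, L)
        feats = feats.view(b, c, k * k, -1)
        pooled = (feats * weights.unsqueeze(1)).sum(dim=2)      # (B, C, L)
        pooled = pooled.view(b, c, h // k, w // k)
        # Fold the window weights back onto the input grid; this map can be
        # visualised as the layer's explanation of which locations mattered.
        attention = F.fold(weights, output_size=(h, w), kernel_size=k, stride=k)
        return pooled, attention


# Example: a LAP-style layer used in place of 2x2 max pooling on a feature map.
if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    lap = LocalAttentionPool2d(channels=64)
    pooled, attn = lap(x)
    print(pooled.shape, attn.shape)  # (1, 64, 16, 16) and (1, 1, 32, 32)
```

In use, such a layer could be dropped in place of an existing pooling stage of a trained CNN and fine-tuned, with the returned attention map visualised per input to inspect which regions drove the pooled features.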
Related papers
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Does Saliency-Based Training bring Robustness for Deep Neural Networks in Image Classification? [0.0]
The black-box nature of Deep Neural Networks impedes a complete understanding of their inner workings.
Online saliency-guided training methods try to highlight the prominent features in the model's output to alleviate this problem.
We quantify the robustness and conclude that, despite the well-explained visualizations in the model's output, saliency-trained models suffer from lower performance against adversarial example attacks.
arXiv Detail & Related papers (2023-06-28T22:20:19Z)
- Explaining Explainability: Towards Deeper Actionable Insights into Deep Learning through Second-order Explainability [70.60433013657693]
Second-order explainable AI (SOXAI) was recently proposed to extend explainable AI (XAI) from the instance level to the dataset level.
We demonstrate for the first time, via example classification and segmentation cases, that eliminating irrelevant concepts from the training set based on actionable insights from SOXAI can enhance a model's performance.
arXiv Detail & Related papers (2023-06-14T23:24:01Z)
- Interpretable part-whole hierarchies and conceptual-semantic relationships in neural networks [4.153804257347222]
We present Agglomerator, a framework capable of providing a representation of part-whole hierarchies from visual cues.
We evaluate our method on common datasets, such as SmallNORB, MNIST, FashionMNIST, CIFAR-10, and CIFAR-100.
arXiv Detail & Related papers (2022-03-07T10:56:13Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on Answer Set semantics with neural networks to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- It's FLAN time! Summing feature-wise latent representations for interpretability [0.0]
We propose a novel class of structurally-constrained neural networks, which we call FLANs (Feature-wise Latent Additive Networks).
FLANs process each input feature separately, computing for each of them a representation in a common latent space.
These feature-wise latent representations are then simply summed, and the aggregated representation is used for prediction (a minimal sketch of this additive scheme is given after this list).
arXiv Detail & Related papers (2021-06-18T12:19:33Z)
- Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
- Explain by Evidence: An Explainable Memory-based Neural Network for Question Answering [41.73026155036886]
This paper proposes an explainable, evidence-based memory network architecture.
It learns to summarize the dataset and extract supporting evidences to make its decision.
Our model achieves state-of-the-art performance on two popular question answering datasets.
arXiv Detail & Related papers (2020-11-05T21:18:21Z) - Making Neural Networks Interpretable with Attribution: Application to
Implicit Signals Prediction [11.427019313283997]
We propose a novel formulation of interpretable deep neural networks for the attribution task.
Using masked weights, hidden features can be deeply attributed, split into several input-restricted sub-networks and trained as a boosted mixture of experts.
arXiv Detail & Related papers (2020-08-26T06:46:49Z)
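As referenced in the FLAN entry above, that paper describes a concrete additive architecture: each input feature is embedded separately into a shared latent space, the embeddings are summed, and the aggregate is used for prediction. The following is a small, hypothetical PyTorch sketch of that idea; layer sizes and names are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn


class FLANSketch(nn.Module):
    """Hypothetical feature-wise latent additive network: per-feature encoders, summed latents."""

    def __init__(self, num_features: int, latent_dim: int = 32, num_classes: int = 2):
        super().__init__()
        # One small encoder per input feature, all mapping into the same latent space.
        self.encoders = nn.ModuleList(
            nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, latent_dim))
            for _ in range(num_features)
        )
        self.head = nn.Linear(latent_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_features); encode each feature independently, then sum the latents.
        latents = [enc(x[:, i : i + 1]) for i, enc in enumerate(self.encoders)]
        return self.head(torch.stack(latents, dim=0).sum(dim=0))
```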