Hybrid CNN-Interpreter: Interpret local and global contexts for CNN-based Models
- URL: http://arxiv.org/abs/2211.00185v1
- Date: Mon, 31 Oct 2022 22:59:33 GMT
- Title: Hybrid CNN-Interpreter: Interpret local and global contexts for CNN-based Models
- Authors: Wenli Yang, Guan Huang, Renjie Li, Jiahao Yu, Yanyu Chen, Quan Bai, Byeong Kang
- Abstract summary: Convolutional neural network (CNN) models have achieved substantial performance improvements across various domains.
A lack of interpretability is a major barrier to assurance and regulation during operation, and thus to the acceptance and deployment of AI-assisted applications.
We propose a novel hybrid CNN-interpreter with two components:
(1) an original forward propagation mechanism that examines layer-specific prediction results for local interpretability;
(2) a new global interpretability method that captures feature correlation and filter importance effects.
- Score: 9.148791330175191
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Convolutional neural network (CNN) models have achieved substantial
performance improvements across various domains, but a lack of interpretability
remains a major barrier to assurance and regulation during operation, and thus
to the acceptance and deployment of AI-assisted applications. Many works on
input interpretability focus on analyzing input-output relations, but current
mainstream interpretability methods do not clarify the internal logic of the
model. In this study, we propose a novel hybrid CNN-interpreter through: (1) an
original forward propagation mechanism that examines layer-specific prediction
results for local interpretability, and (2) a new global interpretability
method that captures feature correlation and filter importance effects. By
combining local and global interpretability, the hybrid CNN-interpreter enables
a solid understanding and continuous monitoring of the model's context
throughout the learning process, with detailed and consistent representations.
Finally, the proposed interpretability methods are demonstrated to adapt to
various CNN-based model structures.
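To make the two components concrete, the sketch below shows one plausible way to realize them in PyTorch: forward hooks capture intermediate feature maps so that small per-layer probes yield layer-specific predictions (local interpretability), while correlation statistics over pooled filter activations approximate feature correlation and filter importance (global interpretability). The layer names, probe design, and correlation measure are illustrative assumptions, not the authors' released implementation.

```python
# A minimal, illustrative sketch (not the authors' code): layer-specific
# prediction probes for local interpretability, plus activation-correlation
# statistics for global interpretability (feature correlation and filter
# importance). Layer names, probe design, and the correlation measure are
# assumptions made for this example.
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18(weights=None)  # stands in for any CNN-based model
model.eval()
num_classes = 1000

# ---- Local interpretability: layer-specific prediction results ----
# Forward hooks capture intermediate feature maps; a small probe per layer
# turns the pooled activations into a layer-specific class prediction.
monitored = {"layer1": 64, "layer2": 128, "layer3": 256, "layer4": 512}
activations = {}

def make_hook(name):
    def hook(module, inputs, output):
        activations[name] = output.detach()
    return hook

for name, module in model.named_children():
    if name in monitored:
        module.register_forward_hook(make_hook(name))

# Linear probes on globally pooled activations (trained separately).
probes = {name: nn.Linear(ch, num_classes) for name, ch in monitored.items()}

x = torch.randn(8, 3, 224, 224)  # dummy batch
with torch.no_grad():
    final_probs = model(x).softmax(dim=1)
    layer_preds = {}
    for name, feat in activations.items():
        pooled = feat.mean(dim=(2, 3))          # global average pool: (B, C)
        layer_preds[name] = probes[name](pooled).softmax(dim=1)

# ---- Global interpretability: feature correlation and filter importance ----
# Correlate each filter's pooled response with the winning class probability
# across the batch; strongly correlated filters are treated as important.
target = final_probs.max(dim=1).values           # (B,)
importance = {}
for name, feat in activations.items():
    pooled = feat.mean(dim=(2, 3))               # (B, C)
    pc = pooled - pooled.mean(dim=0)
    tc = target - target.mean()
    cov = (pc * tc.unsqueeze(1)).sum(dim=0) / (pooled.shape[0] - 1)
    importance[name] = cov / (pooled.std(dim=0) * target.std() + 1e-8)

# Pairwise filter correlation within a layer (C x C matrix).
feat_corr = torch.corrcoef(activations["layer4"].mean(dim=(2, 3)).T)
```

In this sketch the probes would be trained separately on pooled activations with the original labels; comparing their layer-wise predictions against the final output is one way to monitor how class evidence accumulates through the network, in the spirit of the local interpretability described above.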
Related papers
- Improving Neuron-level Interpretability with White-box Language Models [11.898535906016907]
We introduce a white-box transformer-like architecture named Coding RAte TransformEr (CRATE).
Our comprehensive experiments showcase significant improvements (up to 103% relative improvement) in neuron-level interpretability.
CRATE's increased interpretability comes from its enhanced ability to consistently and distinctively activate on relevant tokens.
arXiv Detail & Related papers (2024-10-21T19:12:33Z) - GINN-KAN: Interpretability pipelining with applications in Physics Informed Neural Networks [5.2969467015867915]
We introduce the concept of interpretability pipelining, which combines multiple interpretability techniques so that the pipeline outperforms each individual technique.
We evaluate two recent models selected for their potential to incorporate interpretability into standard neural network architectures.
We introduce a novel interpretable neural network GINN-KAN that synthesizes the advantages of both models.
arXiv Detail & Related papers (2024-08-27T04:57:53Z) - Improving Network Interpretability via Explanation Consistency Evaluation [56.14036428778861]
We propose a framework that acquires more explainable activation heatmaps while simultaneously increasing model performance.
Specifically, our framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning.
The framework then promotes model learning by paying closer attention to training samples with a large difference in explanations.
arXiv Detail & Related papers (2024-08-08T17:20:08Z) - Explaining Text Similarity in Transformer Models [52.571158418102584]
Recent advances in explainable AI have made it possible to mitigate limitations by leveraging improved explanations for Transformers.
We use BiLRP, an extension developed for computing second-order explanations in bilinear similarity models, to investigate which feature interactions drive similarity in NLP models.
Our findings contribute to a deeper understanding of different semantic similarity tasks and models, highlighting how novel explainable AI methods enable in-depth analyses and corpus-level insights.
arXiv Detail & Related papers (2024-05-10T17:11:31Z) - InterpretCC: Intrinsic User-Centric Interpretability through Global Mixture of Experts [31.738009841932374]
Interpretability for neural networks is a trade-off between three key requirements.
We present InterpretCC, a family of interpretable-by-design neural networks that guarantee human-centric interpretability.
arXiv Detail & Related papers (2024-02-05T11:55:50Z) - E Pluribus Unum Interpretable Convolutional Neural Networks [6.45481313278967]
We develop a novel framework for instantiating inherently interpretable CNN models, named E Pluribus Unum Interpretable CNN (EPU-CNN).
An EPU-CNN model consists of CNN sub-networks, each of which receives a different representation of an input image expressing a perceptual feature, such as color or texture.
We show that EPU-CNN models can achieve a comparable or better classification performance than other CNN architectures while providing humanly perceivable interpretations.
arXiv Detail & Related papers (2022-08-10T14:37:03Z) - Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z) - i-Algebra: Towards Interactive Interpretability of Deep Neural Networks [41.13047686374529]
We present i-Algebra, a first-of-its-kind interactive framework for interpreting deep neural networks (DNNs).
At its core is a library of atomic, composable operators, which explain model behaviors at varying input granularity, during different inference stages, and from distinct interpretation perspectives.
We conduct user studies in a set of representative analysis tasks, including inspecting adversarial inputs, resolving model inconsistency, and cleansing contaminated data, all demonstrating its promising usability.
arXiv Detail & Related papers (2021-01-22T19:22:57Z) - Interpreting Graph Neural Networks for NLP With Differentiable Edge
Masking [63.49779304362376]
Graph neural networks (GNNs) have become a popular approach to integrating structural inductive biases into NLP models.
We introduce a post-hoc method for interpreting the predictions of GNNs which identifies unnecessary edges.
We show that we can drop a large proportion of edges without deteriorating the performance of the model.
arXiv Detail & Related papers (2020-10-01T17:51:19Z) - Neural Networks with Recurrent Generative Feedback [61.90658210112138]
We instantiate this design on convolutional neural networks (CNNs).
In the experiments, CNN-F shows considerably improved adversarial robustness over conventional feedforward CNNs on standard benchmarks.
arXiv Detail & Related papers (2020-07-17T19:32:48Z) - Transferable Perturbations of Deep Feature Distributions [102.94094966908916]
This work presents a new adversarial attack based on the modeling and exploitation of class-wise and layer-wise deep feature distributions.
We achieve state-of-the-art targeted blackbox transfer-based attack results for undefended ImageNet models.
arXiv Detail & Related papers (2020-04-27T00:32:25Z)