WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts
- URL: http://arxiv.org/abs/2402.18956v2
- Date: Thu, 11 Apr 2024 10:06:10 GMT
- Title: WWW: A Unified Framework for Explaining What, Where and Why of Neural Networks by Interpretation of Neuron Concepts
- Authors: Yong Hyun Ahn, Hyeon Bae Kim, Seong Tae Kim
- Abstract summary: We propose a novel framework, WWW, that offers the 'what', 'where', and 'why' of the neural network decisions in human-understandable terms.
WWW utilizes adaptive selection for concept discovery, employing adaptive cosine similarity and thresholding techniques.
WWW provides a unified solution for explaining 'what', 'where', and 'why', introducing a method for localized explanations from global interpretations.
- Score: 3.2627279988912194
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent advancements in neural networks have showcased their remarkable capabilities across various domains. Despite these successes, the "black box" problem still remains. Addressing this, we propose a novel framework, WWW, that offers the 'what', 'where', and 'why' of the neural network decisions in human-understandable terms. Specifically, WWW utilizes adaptive selection for concept discovery, employing adaptive cosine similarity and thresholding techniques to effectively explain 'what'. To address the 'where' and 'why', we propose a novel combination of neuron activation maps (NAMs) with Shapley values, generating localized concept maps and heatmaps for individual inputs. Furthermore, WWW introduces a method for predicting uncertainty, leveraging heatmap similarities to estimate 'how' reliable the prediction is. Experimental evaluations of WWW demonstrate superior performance in both quantitative and qualitative metrics, outperforming existing methods in interpretability. WWW provides a unified solution for explaining 'what', 'where', and 'why', introducing a method for localized explanations from global interpretations and offering a plug-and-play solution adaptable to various architectures.
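The abstract names three mechanisms: concept discovery via adaptive cosine similarity and thresholding ('what'), neuron activation maps weighted by Shapley values ('where'/'why'), and heatmap similarity for uncertainty. The exact formulation is not given in the abstract, so the sketch below is only a minimal illustration of the general idea: the function names, the mean-plus-k-sigma threshold rule, and the toy dimensions are assumptions, not the authors' implementation.

```python
# Illustrative sketch only: adaptive cosine-similarity concept matching and
# Shapley-weighted activation maps, loosely following the WWW abstract.
# The thresholding rule (mean + k * std) and all names are assumptions.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Cosine similarity between one vector `a` and each row of matrix `b`."""
    a = a / (np.linalg.norm(a) + 1e-8)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-8)
    return b @ a

def discover_concepts(neuron_embedding: np.ndarray,
                      concept_bank: np.ndarray,
                      concept_names: list[str],
                      k: float = 3.0) -> list[str]:
    """Return concepts whose similarity to the neuron exceeds an adaptive,
    per-neuron threshold (here: mean + k * std of all similarities)."""
    sims = cosine_similarity(neuron_embedding, concept_bank)
    threshold = sims.mean() + k * sims.std()
    return [concept_names[i] for i in np.where(sims > threshold)[0]]

def combine_nams_with_shapley(nams: np.ndarray, shapley: np.ndarray) -> np.ndarray:
    """Weight per-neuron activation maps (N, H, W) by per-neuron Shapley
    values (N,) to obtain a single class heatmap (H, W)."""
    return np.tensordot(shapley, nams, axes=1)

# Toy usage: a 512-d neuron embedding matched against a small concept bank.
rng = np.random.default_rng(0)
bank = rng.normal(size=(1000, 512))
names = [f"concept_{i}" for i in range(1000)]
neuron = bank[42] + 0.1 * rng.normal(size=512)   # neuron close to concept_42
print(discover_concepts(neuron, bank, names))     # expected to include 'concept_42'
```

For the uncertainty part, the abstract only says that heatmap similarities are compared; in a sketch like this, one could analogously flatten the heatmaps of the top candidate classes and reuse the same cosine-similarity helper to score how much they agree.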
Related papers
- Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery [52.498055901649025]
Concept Bottleneck Models (CBMs) have been proposed to address the 'black-box' problem of deep neural networks.
We propose a novel CBM approach -- called Discover-then-Name-CBM (DN-CBM) -- that inverts the typical paradigm.
Our concept extraction strategy is efficient, since it is agnostic to the downstream task, and uses concepts already known to the model.
arXiv Detail & Related papers (2024-07-19T17:50:11Z) - Locally Testing Model Detections for Semantic Global Concepts [3.112979958793927]
We propose a framework for linking global concept encodings to the local processing of single network inputs.
Our approach has the advantage of fully covering the model-internal encoding of the semantic concept.
The results show major differences in the local perception and usage of individual global concept encodings.
arXiv Detail & Related papers (2024-05-27T12:52:45Z) - Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z) - Mapping Knowledge Representations to Concepts: A Review and New Perspectives [0.6875312133832078]
This review focuses on research that aims to associate internal representations with human understandable concepts.
We find this taxonomy, together with theories of causality, useful for understanding what can and cannot be expected from neural network explanations.
The analysis additionally uncovers an ambiguity in the reviewed literature related to the goal of model explainability.
arXiv Detail & Related papers (2022-12-31T12:56:12Z) - Explaining Deep Convolutional Neural Networks for Image Classification by Evolving Local Interpretable Model-agnostic Explanations [7.474973880539888]
The proposed method is model-agnostic, i.e., it can be utilised to explain any deep convolutional neural network models.
The evolved local explanations on four images, randomly selected from ImageNet, are presented.
The proposed method can obtain local explanations within one minute, which is more than ten times faster than LIME.
arXiv Detail & Related papers (2022-11-28T08:56:00Z) - Bayesian Learning for Neural Networks: an algorithmic survey [95.42181254494287]
This self-contained survey engages and introduces readers to the principles and algorithms of Bayesian Learning for Neural Networks.
It provides an introduction to the topic from an accessible, practical-algorithmic perspective.
arXiv Detail & Related papers (2022-11-21T21:36:58Z) - Shap-CAM: Visual Explanations for Convolutional Neural Networks based on Shapley Value [86.69600830581912]
We develop a novel visual explanation method called Shap-CAM based on class activation mapping.
We demonstrate that Shap-CAM achieves better visual performance and fairness for interpreting the decision making process.
arXiv Detail & Related papers (2022-08-07T00:59:23Z) - Explaining Deep Neural Networks for Point Clouds using Gradient-based Visualisations [1.2891210250935146]
We propose a novel approach to generate coarse visual explanations of networks designed to classify unstructured 3D data.
Our method uses gradients flowing back to the final feature map layers and maps these values as contributions of the corresponding points in the input point cloud.
The generality of our approach is tested on various point cloud classification networks, including 'single object' networks PointNet, PointNet++, DGCNN, and a 'scene' network VoteNet; a rough sketch of this gradient-to-point attribution idea appears after this list.
arXiv Detail & Related papers (2022-07-26T15:42:08Z) - Neural Networks with Recurrent Generative Feedback [61.90658210112138]
We instantiate this design on convolutional neural networks (CNNs).
In the experiments, CNN-F shows considerably improved adversarial robustness over conventional feedforward CNNs on standard benchmarks.
arXiv Detail & Related papers (2020-07-17T19:32:48Z) - How Much Can I Trust You? -- Quantifying Uncertainties in Explaining Neural Networks [19.648814035399013]
Explainable AI (XAI) aims to provide interpretations for predictions made by learning machines, such as deep neural networks.
We propose a new framework that allows to convert any arbitrary explanation method for neural networks into an explanation method for Bayesian neural networks.
We demonstrate the effectiveness and usefulness of our approach extensively in various experiments.
arXiv Detail & Related papers (2020-06-16T08:54:42Z) - Explainable Deep Classification Models for Domain Generalization [94.43131722655617]
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
Our training strategy enforces a periodic saliency-based feedback to encourage the model to focus on the image regions that directly correspond to the ground-truth object.
arXiv Detail & Related papers (2020-03-13T22:22:15Z)
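As referenced in the point-cloud entry above, that gradient-based method backpropagates the class score to the final per-point feature map and reads the gradients as per-point contributions. The sketch below illustrates the general idea on a toy PointNet-style model; the architecture, the ReLU-and-sum channel aggregation, and all names are assumptions for illustration, not the cited paper's implementation.

```python
# Illustrative sketch: per-point saliency from gradients of a class score
# with respect to the final per-point feature map. The toy network and the
# channel-sum aggregation rule are assumptions, not the cited paper's method.
import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """Minimal PointNet-style classifier: shared MLP -> max pool -> linear."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.shared_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
        )
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        # points: (B, N, 3) -> per-point features: (B, N, 128)
        self.point_features = self.shared_mlp(points)
        self.point_features.retain_grad()          # keep grads for saliency
        global_feat = self.point_features.max(dim=1).values
        return self.classifier(global_feat)

def point_saliency(model: TinyPointNet, points: torch.Tensor,
                   target_class: int) -> torch.Tensor:
    """Backpropagate the target class score to the final per-point feature
    map and sum ReLU-ed gradients over channels as point contributions."""
    logits = model(points)
    logits[:, target_class].sum().backward()
    grads = model.point_features.grad              # (B, N, 128)
    return grads.clamp(min=0).sum(dim=-1)          # (B, N) per-point scores

# Toy usage on a random point cloud.
model = TinyPointNet()
cloud = torch.randn(1, 1024, 3)
scores = point_saliency(model, cloud, target_class=3)
print(scores.shape)  # torch.Size([1, 1024])
```

The same pattern would apply to the last shared-feature layer of real classifiers such as PointNet or DGCNN, with the aggregation rule chosen to match whichever explanation method is being reproduced.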
This list is automatically generated from the titles and abstracts of the papers in this site.