Understanding CNN Hidden Neuron Activations Using Structured Background
Knowledge and Deductive Reasoning
- URL: http://arxiv.org/abs/2308.03999v2
- Date: Wed, 9 Aug 2023 15:59:50 GMT
- Title: Understanding CNN Hidden Neuron Activations Using Structured Background
Knowledge and Deductive Reasoning
- Authors: Abhilekha Dalal, Md Kamruzzaman Sarker, Adrita Barua, Eugene
Vasserman, Pascal Hitzler
- Abstract summary: State of the art indicates that hidden node activations can, in some cases, be interpretable in a way that makes sense to humans.
We show that we can automatically attach meaningful labels from the background knowledge to individual neurons in the dense layer of a Convolutional Neural Network.
- Score: 3.6223658572137825
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A major challenge in Explainable AI is in correctly interpreting activations
of hidden neurons: accurate interpretations would provide insights into the
question of what a deep learning system has internally detected as relevant on
the input, demystifying the otherwise black-box character of deep learning
systems. The state of the art indicates that hidden node activations can, in
some cases, be interpretable in a way that makes sense to humans, but
systematic automated methods that would be able to hypothesize and verify
interpretations of hidden neuron activations are underexplored. In this paper,
we provide such a method and demonstrate that it provides meaningful
interpretations. Our approach is based on using large-scale background
knowledge (approximately 2 million classes curated from the Wikipedia concept
hierarchy) together with a symbolic reasoning approach called Concept
Induction, which is based on description logics and was originally developed
for applications in the Semantic Web field. Our results show that we can automatically attach
meaningful labels from the background knowledge to individual neurons in the
dense layer of a Convolutional Neural Network through a hypothesis and
verification process.
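The abstract describes a hypothesis-and-verification loop: images that strongly activate a neuron are used to hypothesize a class label from the background knowledge via Concept Induction, and the label is then checked against further activations. The sketch below illustrates one way such a loop could be organized; it is a minimal illustration, not the paper's implementation. It assumes dense-layer activations are available as an (images x neurons) matrix and that each image carries class annotations from the background knowledge. The names `induce_concept_label` and `hypothesize_and_verify`, the frequency heuristic standing in for the description-logic reasoner, and all thresholds are hypothetical placeholders.

```python
"""Sketch of a hypothesis-and-verification loop for labeling dense-layer
neurons, under the assumptions stated above. Not the paper's actual code."""
from collections import Counter
import numpy as np

def induce_concept_label(positive_annotations, negative_annotations):
    """Placeholder for Concept Induction over the background knowledge.

    A real implementation would query a description-logic tool; here we
    simply hypothesize the most frequent class among the strongly
    activating (positive) images that is absent from the negative set.
    """
    neg = set().union(*negative_annotations) if negative_annotations else set()
    counts = Counter(c for ann in positive_annotations for c in ann if c not in neg)
    return counts.most_common(1)[0][0] if counts else None

def hypothesize_and_verify(activations, image_annotations, neuron, k=10,
                           verification_threshold=0.8):
    """Attach a label to one dense-layer neuron, then verify it."""
    # Hypothesis step: images that activate the neuron most / least strongly.
    order = np.argsort(activations[:, neuron])
    top, bottom = order[-k:], order[:k]
    label = induce_concept_label(
        [image_annotations[i] for i in top],
        [image_annotations[i] for i in bottom],
    )
    if label is None:
        return None
    # Verification step: does the neuron fire above its median activation
    # on *other* images annotated with the hypothesized class?
    median = np.median(activations[:, neuron])
    holdout = [i for i in range(len(image_annotations))
               if label in image_annotations[i] and i not in set(top)]
    if not holdout:
        return None
    hit_rate = np.mean(activations[holdout, neuron] > median)
    return label if hit_rate >= verification_threshold else None

# Minimal usage with synthetic data standing in for a trained CNN.
rng = np.random.default_rng(0)
acts = rng.random((100, 8))                  # 100 images, 8 dense-layer neurons
vocab = ["building", "tree", "road", "sky"]  # toy background-knowledge classes
annots = [set(rng.choice(vocab, size=2, replace=False)) for _ in range(100)]
print(hypothesize_and_verify(acts, annots, neuron=3))
```

In the paper itself, the hypothesis step runs Concept Induction over the Wikipedia-derived class hierarchy rather than the simple frequency heuristic used as a stand-in here.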
Related papers
- On the Value of Labeled Data and Symbolic Methods for Hidden Neuron Activation Analysis [1.55858752644861]
State of the art indicates that hidden node activations can, in some cases, be interpretable in a way that makes sense to humans.
We introduce a novel model-agnostic post-hoc Explainable AI method demonstrating that it provides meaningful interpretations.
arXiv Detail & Related papers (2024-04-21T07:57:45Z)
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Brain-Inspired Machine Intelligence: A Survey of Neurobiologically-Plausible Credit Assignment [65.268245109828]
We examine algorithms for conducting credit assignment in artificial neural networks that are inspired or motivated by neurobiology.
We organize the ever-growing set of brain-inspired learning schemes into six general families and consider these in the context of backpropagation of errors.
The results of this review are meant to encourage future developments in neuro-mimetic systems and their constituent learning processes.
arXiv Detail & Related papers (2023-12-01T05:20:57Z)
- From Neural Activations to Concepts: A Survey on Explaining Concepts in Neural Networks [15.837316393474403]
Concepts can act as a natural link between learning and reasoning.
Not only can knowledge be extracted from neural networks, but concept knowledge can also be inserted into neural network architectures.
arXiv Detail & Related papers (2023-10-18T11:08:02Z)
- Adversarial Attacks on the Interpretation of Neuron Activation Maximization [70.5472799454224]
Activation-maximization approaches are used to interpret and analyze trained deep-learning models.
In this work, we consider the concept of an adversary manipulating a model for the purpose of deceiving the interpretation.
arXiv Detail & Related papers (2023-06-12T19:54:33Z)
- Explaining Deep Learning Hidden Neuron Activations using Concept Induction [3.6223658572137825]
State of the art indicates that hidden node activations appear to be interpretable in a way that makes sense to humans.
We show that we can automatically attach meaningful labels from the background knowledge to individual neurons in the dense layer of a Convolutional Neural Network.
arXiv Detail & Related papers (2023-01-23T18:14:32Z)
- NeuroExplainer: Fine-Grained Attention Decoding to Uncover Cortical Development Patterns of Preterm Infants [73.85768093666582]
We propose an explainable geometric deep network dubbed NeuroExplainer.
NeuroExplainer is used to uncover altered infant cortical development patterns associated with preterm birth.
arXiv Detail & Related papers (2023-01-01T12:48:12Z)
- Mapping Knowledge Representations to Concepts: A Review and New Perspectives [0.6875312133832078]
This review focuses on research that aims to associate internal representations with human understandable concepts.
We find this taxonomy, together with theories of causality, useful for understanding what can and cannot be expected from neural network explanations.
The analysis additionally uncovers an ambiguity in the reviewed literature related to the goal of model explainability.
arXiv Detail & Related papers (2022-12-31T12:56:12Z)
- An Interpretable Neuron Embedding for Static Knowledge Distillation [7.644253344815002]
We propose a new interpretable neural network method, by embedding neurons into the semantic space.
The proposed semantic vector externalizes the latent knowledge to static knowledge, which is easy to exploit.
Empirical experiments of visualization show that semantic vectors describe neuron activation semantics well.
arXiv Detail & Related papers (2022-11-14T03:26:10Z)
- Searching for the Essence of Adversarial Perturbations [73.96215665913797]
We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction.
This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
arXiv Detail & Related papers (2022-05-30T18:04:57Z)
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.