Neural Activation Patterns (NAPs): Visual Explainability of Learned Concepts
- URL: http://arxiv.org/abs/2206.10611v1
- Date: Mon, 20 Jun 2022 09:05:57 GMT
- Title: Neural Activation Patterns (NAPs): Visual Explainability of Learned Concepts
- Authors: Alex Bäuerle, Daniel Jönsson, Timo Ropinski
- Abstract summary: We present a method that takes into account the entire activation distribution.
By extracting similar activation profiles within the high-dimensional activation space of a neural network layer, we find groups of inputs that are treated similarly.
These input groups represent neural activation patterns (NAPs) and can be used to visualize and interpret learned layer concepts.
- Score: 8.562628320010035
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A key to deciphering the inner workings of neural networks is understanding
what a model has learned. Promising methods for discovering learned features
are based on analyzing activation values, whereby current techniques focus on
analyzing high activation values to reveal interesting features on a neuron
level. However, analyzing high activation values limits layer-level concept
discovery. We present a method that instead takes into account the entire
activation distribution. By extracting similar activation profiles within the
high-dimensional activation space of a neural network layer, we find groups of
inputs that are treated similarly. These input groups represent neural
activation patterns (NAPs) and can be used to visualize and interpret learned
layer concepts. We release a framework with which NAPs can be extracted from
pre-trained models and provide a visual introspection tool that can be used to
analyze NAPs. We tested our method with a variety of networks and show how it
complements existing methods for analyzing neural network activation values.
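The abstract describes the recipe at a high level: record a layer's full activation distribution over many inputs, then group inputs whose activation profiles are similar. The authors release their own framework for this; the sketch below is only a rough illustration of the idea, not that framework. The model (ResNet-18), the probed layer, the spatial pooling, and the k-means clustering with k=4 are all assumptions of this sketch.

```python
import torch
import torchvision.models as models
from sklearn.cluster import KMeans

# Pre-trained model; ResNet-18 and the choice of 'layer3' are arbitrary examples.
model = models.resnet18(weights="IMAGENET1K_V1").eval()

activations = []

def hook(module, inputs, output):
    # Keep the full activation profile per input: spatially average-pool
    # the feature maps so each input yields one value per channel.
    activations.append(output.mean(dim=(2, 3)).detach())

handle = model.layer3.register_forward_hook(hook)

# Stand-in for a real image loader.
data_loader = [torch.randn(8, 3, 224, 224) for _ in range(4)]
with torch.no_grad():
    for batch in data_loader:
        model(batch)
handle.remove()

# Rows are inputs, columns are channels of the chosen layer.
profiles = torch.cat(activations).numpy()

# Group similar activation profiles; each cluster of inputs is a candidate
# "neural activation pattern". k-means with k=4 is an illustrative choice only.
labels = KMeans(n_clusters=4, n_init=10).fit_predict(profiles)
print(labels)
```

Inputs that share a cluster would then be inspected together, for example by visualizing them side by side, to interpret what layer-level concept they might represent.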
Related papers
- Towards Utilising a Range of Neural Activations for Comprehending Representational Associations [0.6554326244334868]
We show that an approach for labelling intermediate representations in deep neural networks fails to capture valuable information about their behaviour.
We hypothesise that non-extremal activation levels contain complex information worth investigating.
We use our findings to develop a method that curates mid-range logit samples for retraining, mitigating spurious correlations.
arXiv Detail & Related papers (2024-11-15T07:54:14Z)
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Hebbian Learning based Orthogonal Projection for Continual Learning of Spiking Neural Networks [74.3099028063756]
We develop a new method with neuronal operations based on lateral connections and Hebbian learning.
We show that Hebbian and anti-Hebbian learning on recurrent lateral connections can effectively extract the principal subspace of neural activities (a classical sketch of such a rule appears after this list).
Our method consistently solves continual learning for spiking neural networks with nearly zero forgetting.
arXiv Detail & Related papers (2024-02-19T09:29:37Z)
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its ability to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- DISCOVER: Making Vision Networks Interpretable via Competition and Dissection [11.028520416752325]
This work contributes to post-hoc interpretability, and specifically Network Dissection.
Our goal is to present a framework that makes it easier to discover the individual functionality of each neuron in a network trained on a vision task.
arXiv Detail & Related papers (2023-10-07T21:57:23Z)
- Understanding Activation Patterns in Artificial Neural Networks by Exploring Stochastic Processes [0.0]
We propose utilizing the framework of stochastic processes, which has been underutilized thus far.
We focus solely on activation frequency, leveraging neuroscience techniques used for real neuron spike trains (a toy version of this measurement appears after this list).
We derive parameters describing activation patterns in each network, revealing consistent differences across architectures and training sets.
arXiv Detail & Related papers (2023-08-01T22:12:30Z)
- Adversarial Attacks on the Interpretation of Neuron Activation Maximization [70.5472799454224]
Activation-maximization approaches are used to interpret and analyze trained deep-learning models.
In this work, we consider the concept of an adversary manipulating a model for the purpose of deceiving the interpretation.
arXiv Detail & Related papers (2023-06-12T19:54:33Z)
- A survey on recently proposed activation functions for Deep Learning [0.0]
This survey discusses the main concepts of activation functions in neural networks.
It includes a brief introduction to deep neural networks; a summary of what activation functions are and how they are used in neural networks; their most common properties; the different types of activation functions; and some of the challenges, limitations, and alternative solutions associated with them.
arXiv Detail & Related papers (2022-04-06T16:21:52Z)
- How and what to learn: The modes of machine learning [7.085027463060304]
We propose a new approach, namely weight pathway analysis (WPA), to study the mechanism of multilayer neural networks.
WPA shows that a neural network stores and utilizes information in a "holographic" way, that is, the network encodes all training samples in a coherent structure.
It is found that hidden-layer neurons self-organize into different classes in the later stages of the learning process.
arXiv Detail & Related papers (2022-02-28T14:39:06Z)
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
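Two entries above name mechanisms concrete enough to sketch. The Hebbian continual-learning paper rests on a classical result: Hebbian plus anti-Hebbian plasticity can extract the principal subspace of neural activity. The NumPy sketch below is not that paper's spiking algorithm; it is Oja's closely related subspace rule, which folds the anti-Hebbian decorrelation into the weight update instead of using explicit lateral connections, shown only to make the principal-subspace claim concrete.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic inputs with a clear principal subspace (decaying per-axis variances).
n_dim, k, n_steps, lr = 10, 3, 20000, 1e-3
stds = np.linspace(2.0, 0.2, n_dim)

W = rng.normal(scale=0.1, size=(k, n_dim))  # k linear output units

for _ in range(n_steps):
    x = stds * rng.normal(size=n_dim)
    y = W @ x  # feedforward activity
    # Hebbian term outer(y, x) plus an anti-Hebbian decorrelation term
    # -outer(y, y) @ W that keeps units from learning the same direction.
    W += lr * (np.outer(y, x) - np.outer(y, y) @ W)

# The learned row space should align with the top-k principal directions,
# which for this diagonal covariance are the first k coordinate axes.
pc = np.eye(n_dim)[:, :k]
q_w, _ = np.linalg.qr(W.T)
cosines = np.linalg.svd(q_w.T @ pc, compute_uv=False)
print("cosines of principal angles (1.0 = perfect overlap):", np.round(cosines, 3))
```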
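Likewise, the stochastic-process entry reduces activation patterns to activation frequency, the simplest spike-train-style statistic. Below is a toy version of that measurement; the cited paper's networks, data, and derived process parameters are not specified here, so everything in the sketch is a stand-in.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy ReLU network; the architecture and random data are stand-ins.
net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))

freqs = []

def hook(module, inputs, output):
    # Treat each unit's on/off events across inputs as a binary "spike train"
    # and record its empirical activation frequency.
    freqs.append((output > 0).float().mean(dim=0))

net[1].register_forward_hook(hook)

with torch.no_grad():
    net(torch.randn(1024, 20))  # stand-in for a real dataset

print("per-unit activation frequencies:", freqs[0])
```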
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.