Visualizing Neural Network Imagination
- URL: http://arxiv.org/abs/2405.06409v1
- Date: Fri, 10 May 2024 11:43:35 GMT
- Title: Visualizing Neural Network Imagination
- Authors: Nevan Wichers, Victor Tao, Riccardo Volpato, Fazl Barez
- Abstract summary: In certain situations, neural networks will represent environment states in their hidden activations.
Our goal is to visualize what environment states the networks are representing.
We define a quantitative interpretability metric and use it to demonstrate that hidden states can be highly interpretable.
- Score: 2.1749194587826026
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In certain situations, neural networks will represent environment states in their hidden activations. Our goal is to visualize what environment states the networks are representing. We experiment with a recurrent neural network (RNN) architecture with a decoder network at the end. After training, we apply the decoder to the intermediate representations of the network to visualize what they represent. We define a quantitative interpretability metric and use it to demonstrate that hidden states can be highly interpretable on a simple task. We also develop autoencoder and adversarial techniques and show that they benefit interpretability.
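To make the abstract's setup concrete, here is a minimal PyTorch sketch of the general idea: an RNN trained through a decoder head, with the same decoder later reused on intermediate hidden states to render the environment states they encode. The module names, shapes, and the GRU choice are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Illustrative sketch (not the authors' code): an RNN trained with a decoder
# head; after training, the same decoder is applied to *intermediate* hidden
# states to render the environment states they appear to represent.
class RNNWithDecoder(nn.Module):
    def __init__(self, obs_dim=16, hidden_dim=64):
        super().__init__()
        self.rnn = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        self.decoder = nn.Sequential(            # hidden state -> observation
            nn.Linear(hidden_dim, 128), nn.ReLU(), nn.Linear(128, obs_dim)
        )

    def forward(self, obs_seq):                  # obs_seq: (B, T, obs_dim)
        hidden_seq, _ = self.rnn(obs_seq)        # (B, T, hidden_dim)
        return self.decoder(hidden_seq[:, -1])   # train on the final step only

    @torch.no_grad()
    def visualize_hidden_states(self, obs_seq):
        """Reuse the trained decoder on every intermediate hidden state."""
        hidden_seq, _ = self.rnn(obs_seq)
        return self.decoder(hidden_seq)          # (B, T, obs_dim)

model = RNNWithDecoder()
decoded = model.visualize_hidden_states(torch.randn(1, 10, 16))
```

Presumably the paper's quantitative interpretability metric scores how well such decoded states match the true environment states at each step; the abstract does not spell out its exact form.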
Related papers
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Saliency Suppressed, Semantics Surfaced: Visual Transformations in Neural Networks and the Brain [0.0]
We take inspiration from neuroscience to shed light on how neural networks encode information at low (visual saliency) and high (semantic similarity) levels of abstraction.
We find that ResNets are more sensitive to saliency information than ViTs when trained with object classification objectives.
We show that semantic encoding is a key factor in aligning AI with human visual perception, while saliency suppression is a non-brain-like strategy.
arXiv Detail & Related papers (2024-04-29T15:05:42Z)
- Seeing in Words: Learning to Classify through Language Bottlenecks [59.97827889540685]
Humans can explain their predictions using succinct and intuitive descriptions.
We show that a vision model whose feature representations are text can effectively classify ImageNet images.
arXiv Detail & Related papers (2023-06-29T00:24:42Z)
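A hedged sketch of the language-bottleneck idea from the summary above: every image is reduced to a short text description, and a text-only classifier makes the final prediction. The `describe` and `text_classifier` stand-ins below are hypothetical stubs for illustration, not the paper's pipeline.

```python
from typing import Callable

# Language bottleneck sketch (an assumption, not the paper's method): the
# classifier only ever sees the text description, never the image itself.
def classify_through_language(
    image: object,
    describe: Callable[[object], str],
    text_classifier: Callable[[str], str],
) -> str:
    caption = describe(image)          # information bottleneck: text only
    return text_classifier(caption)    # the image is never consulted again

# Toy stand-ins so the sketch runs end to end.
describe = lambda img: "a small dog on green grass"
text_classifier = lambda s: "dog" if "dog" in s else "other"
print(classify_through_language(None, describe, text_classifier))  # -> "dog"
```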
- Don't trust your eyes: on the (un)reliability of feature visualizations [25.018840023636546]
We show how to trick feature visualizations into showing arbitrary patterns that are completely disconnected from normal network behavior on natural input.
We then provide evidence for a similar phenomenon occurring in standard, unmanipulated networks.
This can be used as a sanity check for feature visualizations.
arXiv Detail & Related papers (2023-06-07T18:31:39Z)
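For context, below is the generic activation-maximization technique that this paper stress-tests: synthesize an input from noise by gradient ascent so that one chosen unit's activation is maximized. The sketch shows the standard method, not the paper's manipulation; the toy model and hyperparameters are assumptions.

```python
import torch
import torch.nn as nn

# Standard activation maximization (the technique under scrutiny, not the
# authors' code): optimize the pixels to excite one output unit.
def visualize_feature(model: nn.Module, unit: int, steps: int = 200, lr: float = 0.1):
    x = torch.randn(1, 3, 64, 64, requires_grad=True)   # start from noise
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = -model(x)[0, unit]    # negate so gradient *ascends* the unit
        loss.backward()
        opt.step()
    return x.detach()                # the "feature visualization" image

# Toy model so the sketch is self-contained.
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Flatten(), nn.LazyLinear(10))
viz = visualize_feature(model, unit=3)
```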
- Dynamic Inference with Neural Interpreters [72.90231306252007]
We present Neural Interpreters, an architecture that factorizes inference in a self-attention network as a system of modules.
Inputs to the model are routed through a sequence of functions in a way that is learned end-to-end.
We show that Neural Interpreters perform on par with the vision transformer using fewer parameters, while being transferable to a new task in a sample-efficient manner.
arXiv Detail & Related papers (2021-10-12T23:22:45Z)
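A loose, assumption-driven sketch of the routing idea above (a simplified soft router, not the actual Neural Interpreters architecture): each token is dispatched to learned "function" modules with weights that are themselves learned end-to-end.

```python
import torch
import torch.nn as nn

# Simplified soft routing sketch: a learned router weights the outputs of a
# small set of function modules per token. Module count and sizes are
# illustrative assumptions.
class SoftModuleRouter(nn.Module):
    def __init__(self, dim=32, num_functions=4):
        super().__init__()
        self.functions = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(num_functions)]
        )
        self.router = nn.Linear(dim, num_functions)   # routing scores per token

    def forward(self, tokens):                         # tokens: (B, T, dim)
        weights = self.router(tokens).softmax(-1)      # (B, T, num_functions)
        outputs = torch.stack([f(tokens) for f in self.functions], dim=-1)
        return (outputs * weights.unsqueeze(-2)).sum(-1)  # weighted mixture

layer = SoftModuleRouter()
out = layer(torch.randn(2, 5, 32))                     # routed token features
```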
- Discovering "Semantics" in Super-Resolution Networks [54.45509260681529]
Super-resolution (SR) is a fundamental and representative task in low-level vision.
It is generally thought that the features extracted from an SR network carry no specific semantic information.
Can we find any "semantics" in SR networks?
arXiv Detail & Related papers (2021-08-01T09:12:44Z)
- Controlled Caption Generation for Images Through Adversarial Attacks [85.66266989600572]
We study adversarial examples for vision and language models, which typically adopt a Convolutional Neural Network (CNN) for image feature extraction and a Recurrent Neural Network (RNN) for caption generation.
In particular, we investigate attacks on the visual encoder's hidden layer that is fed to the subsequent recurrent network.
We propose a GAN-based algorithm for crafting adversarial examples for neural image captioning that mimics the internal representation of the CNN.
arXiv Detail & Related papers (2021-07-07T07:22:41Z)
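The paper's algorithm is GAN-based; as a simpler illustration of the underlying objective, here is a plain gradient-based sketch that perturbs an image so the encoder's hidden features mimic those of a target image, which should steer the downstream captioner. The budgets (`eps`, `steps`) and toy encoder are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Feature-mimicry sketch (simpler than the paper's GAN-based method):
# optimize a bounded perturbation so the encoder's features of src+delta
# match the features of a chosen target image.
def feature_mimicry_attack(encoder, src, target, steps=100, lr=0.01, eps=0.05):
    delta = torch.zeros_like(src, requires_grad=True)
    target_feat = encoder(target).detach()         # features to imitate
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(encoder(src + delta), target_feat)
        loss.backward()
        opt.step()
        delta.data.clamp_(-eps, eps)               # keep perturbation small
    return (src + delta).detach()                  # adversarial image

encoder = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.Flatten())
adv = feature_mimicry_attack(encoder, torch.rand(1, 3, 32, 32), torch.rand(1, 3, 32, 32))
```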
- Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
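A small sketch of the unit/concept matching idea in this line of work: threshold a unit's activation map and score its overlap with a concept segmentation mask via intersection-over-union (IoU). The shapes and threshold below are illustrative assumptions, not the paper's exact procedure.

```python
import torch

# IoU between where a unit fires and where a concept appears; a high score
# suggests the unit acts as a detector for that concept.
def unit_concept_iou(activation_map: torch.Tensor,
                     concept_mask: torch.Tensor,
                     threshold: float) -> float:
    unit_mask = activation_map > threshold           # where the unit fires
    inter = (unit_mask & concept_mask).sum().item()
    union = (unit_mask | concept_mask).sum().item()
    return inter / union if union else 0.0

act = torch.rand(56, 56)                             # one unit, one image
mask = torch.rand(56, 56) > 0.7                      # e.g. "tree" pixels
score = unit_concept_iou(act, mask, threshold=0.8)   # high IoU => "tree unit"
```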
- Visual Pattern Recognition with on On-chip Learning: towards a Fully Neuromorphic Approach [10.181725314550823]
We present a spiking neural network (SNN) for visual pattern recognition with on-chip learning on neuromorphic hardware.
We show how this network can learn simple visual patterns composed of horizontal and vertical bars sensed by a Dynamic Vision Sensor.
During recognition, the network classifies the pattern's identity while at the same time estimating its location and scale.
arXiv Detail & Related papers (2020-08-08T08:07:36Z)
- The Representation Theory of Neural Networks [7.724617675868718]
We show that neural networks can be represented via the mathematical theory of quiver representations.
We show that network quivers gently adapt to common neural network concepts.
We also provide a quiver representation model to understand how a neural network creates representations from the data.
arXiv Detail & Related papers (2020-07-23T19:02:14Z)
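For readers unfamiliar with the terminology in the last entry, here is a compact statement of the standard definition it builds on (the paper's "network quiver" refines this for neural networks; the notation below is the usual textbook one, not quoted from the paper).

```latex
% A quiver is a directed graph Q = (Q_0, Q_1) with vertex set Q_0 and
% arrow set Q_1. A representation W of Q assigns a vector space to each
% vertex and a linear map to each arrow:
\[
W \;=\; \Bigl(\{W_v\}_{v \in Q_0},\;
\{f_\alpha \colon W_{s(\alpha)} \to W_{t(\alpha)}\}_{\alpha \in Q_1}\Bigr),
\]
% where s(alpha) and t(alpha) denote the source and target vertices of the
% arrow alpha. A feed-forward network's weight matrices form exactly such an
% assignment, with the nonlinearities treated as extra structure.
```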
This list is automatically generated from the titles and abstracts of the papers on this site.