Understanding the Role of Individual Units in a Deep Neural Network
- URL: http://arxiv.org/abs/2009.05041v2
- Date: Sat, 12 Sep 2020 18:58:32 GMT
- Title: Understanding the Role of Individual Units in a Deep Neural Network
- Authors: David Bau, Jun-Yan Zhu, Hendrik Strobelt, Agata Lapedriza, Bolei Zhou,
Antonio Torralba
- Abstract summary: We present an analytic framework to systematically identify the semantics of individual hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
- Score: 85.23117441162772
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks excel at finding hierarchical representations that solve
complex tasks over large data sets. How can we humans understand these learned
representations? In this work, we present network dissection, an analytic
framework to systematically identify the semantics of individual hidden units
within image classification and image generation networks. First, we analyze a
convolutional neural network (CNN) trained on scene classification and discover
units that match a diverse set of object concepts. We find evidence that the
network has learned many object classes that play crucial roles in classifying
scene classes. Second, we use a similar analytic method to analyze a generative
adversarial network (GAN) model trained to generate scenes. By analyzing
changes made when small sets of units are activated or deactivated, we find
that objects can be added and removed from the output scenes while adapting to
the context. Finally, we apply our analytic framework to understanding
adversarial attacks and to semantic image editing.
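The core matching step of network dissection compares a unit's thresholded activation map against a concept's segmentation mask via intersection-over-union (IoU). The sketch below illustrates that scoring step in isolation; the activation map, concept mask, and threshold are toy values, whereas in the paper activations come from a trained CNN or GAN and masks from a labeled segmentation dataset.

```python
# Minimal sketch of the network-dissection matching step: binarize a
# unit's activation map at a threshold, then score its overlap with a
# concept segmentation mask via intersection-over-union (IoU).
# All data here is toy/hypothetical.

def iou(unit_map, concept_mask, threshold):
    """Binarize the unit's activation map and score overlap with the mask."""
    active = [[a > threshold for a in row] for row in unit_map]
    inter = sum(a and m for arow, mrow in zip(active, concept_mask)
                for a, m in zip(arow, mrow))
    union = sum(a or m for arow, mrow in zip(active, concept_mask)
                for a, m in zip(arow, mrow))
    return inter / union if union else 0.0

# Toy 4x4 activation map for one unit and a "tree" concept mask.
unit_map = [
    [0.1, 0.90, 0.8, 0.0],
    [0.2, 0.95, 0.7, 0.1],
    [0.0, 0.30, 0.2, 0.0],
    [0.0, 0.10, 0.0, 0.0],
]
concept_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

score = iou(unit_map, concept_mask, threshold=0.5)
print(round(score, 3))  # -> 0.8
```

A unit is said to "match" a concept when its IoU against that concept's masks exceeds a cutoff; ranking concepts by IoU is what assigns each unit a semantic label.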
Related papers
- Linking in Style: Understanding learned features in deep learning models [0.0]
Convolutional neural networks (CNNs) learn abstract features to perform object classification.
We propose an automatic method to visualize and systematically analyze learned features in CNNs.
arXiv Detail & Related papers (2024-09-25T12:28:48Z)
- Learning Object-Centric Representation via Reverse Hierarchy Guidance [73.05170419085796]
Object-Centric Learning (OCL) seeks to enable Neural Networks to identify individual objects in visual scenes.
RHGNet introduces a top-down pathway that works in different ways in the training and inference processes.
Our model achieves SOTA performance on several commonly used datasets.
arXiv Detail & Related papers (2024-05-17T07:48:27Z)
- Visualizing Neural Network Imagination [2.1749194587826026]
In certain situations, neural networks will represent environment states in their hidden activations.
Our goal is to visualize what environment states the networks are representing.
We define a quantitative interpretability metric and use it to demonstrate that hidden states can be highly interpretable.
arXiv Detail & Related papers (2024-05-10T11:43:35Z)
- Image segmentation with traveling waves in an exactly solvable recurrent neural network [71.74150501418039]
We show that a recurrent neural network can effectively divide an image into groups according to a scene's structural characteristics.
We present a precise description of the mechanism underlying object segmentation in this network.
We then demonstrate a simple algorithm for object segmentation that generalizes across inputs ranging from simple geometric objects in grayscale images to natural images.
arXiv Detail & Related papers (2023-11-28T16:46:44Z)
- Semiotics Networks Representing Perceptual Inference [0.0]
We present a computational model designed to track and simulate the perception of objects.
Our model is not limited to humans and can be applied to any system featuring a processing loop from "internal" to "external" representations.
arXiv Detail & Related papers (2023-10-08T16:05:17Z)
- Learning and generalization of compositional representations of visual scenes [2.960473840509733]
We use distributed representations of object attributes and vector operations in a vector symbolic architecture to create a full compositional description of a scene.
To control the scene composition, we use artificial images composed of multiple, translated and colored MNIST digits.
The output of the deep network can then be interpreted by a VSA resonator network to extract object identity or other properties of individual objects.
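The vector-symbolic operations this summary relies on can be illustrated with bipolar (+1/-1) hypervectors: attribute-value pairs are "bound" by elementwise multiplication, bound pairs are "bundled" by an elementwise majority vote, and a value is recovered by unbinding a role and matching the result against a codebook. The sketch below is a generic illustration of these operations, not the paper's actual architecture; the role names, fillers, and dimension are invented for the example.

```python
# Minimal sketch of vector-symbolic binding/bundling with bipolar vectors.
# Roles (COLOR, DIGIT) and fillers (red, three, ...) are illustrative.
import random

DIM = 1024
random.seed(0)

def rand_vec():
    return [random.choice((-1, 1)) for _ in range(DIM)]

def bind(a, b):     # elementwise multiply; self-inverse for bipolar vectors
    return [x * y for x, y in zip(a, b)]

def bundle(*vecs):  # elementwise majority (sign of the sum)
    return [1 if sum(xs) >= 0 else -1 for xs in zip(*vecs)]

def sim(a, b):      # normalized dot product
    return sum(x * y for x, y in zip(a, b)) / DIM

# Codebook: roles (attributes) and fillers (values), e.g. a colored digit.
COLOR, DIGIT = rand_vec(), rand_vec()
red, blue = rand_vec(), rand_vec()
three, seven = rand_vec(), rand_vec()

# Compositional scene description: a red "3".
scene = bundle(bind(COLOR, red), bind(DIGIT, three))

# Query the digit slot: unbind the DIGIT role, then match against fillers.
query = bind(scene, DIGIT)
best = max([("three", three), ("seven", seven)],
           key=lambda kv: sim(query, kv[1]))
print(best[0])
```

Because binding is self-inverse and the bundled cross-terms behave like noise, the unbound query is far more similar to the correct filler than to any other codebook vector, which is the property a resonator network exploits when factoring a scene vector.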
arXiv Detail & Related papers (2023-03-23T22:03:42Z)
- What do End-to-End Speech Models Learn about Speaker, Language and Channel Information? A Layer-wise and Neuron-level Analysis [16.850888973106706]
We conduct a post-hoc functional interpretability analysis of pretrained speech models using the probing framework.
We analyze utterance-level representations of speech models trained for various tasks such as speaker recognition and dialect identification.
Our results reveal several novel findings, including: i) channel and gender information are distributed across the network, ii) this information is redundantly available across neurons with respect to a task, and iii) complex properties such as dialectal information are encoded only in the task-oriented pretrained network.
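The probing framework behind such analyses follows a simple recipe: freeze the model's representations, fit a lightweight classifier on top of them for a property of interest, and read the probe's accuracy as a measure of how much of that property the layer encodes. The sketch below illustrates the recipe with a nearest-centroid probe on synthetic 2-D "embeddings"; the data, class labels, and dimensions are invented stand-ins, not outputs of any real speech model.

```python
# Minimal sketch of probing: fit a simple classifier on frozen "embeddings"
# and use its accuracy as a proxy for how recoverable a property is.
# Toy 2-D vectors stand in for utterance-level representations.
import random

random.seed(1)

def make_data(center, n):
    # toy embeddings clustered around a class center
    return [[c + random.gauss(0, 0.3) for c in center] for _ in range(n)]

class_a = make_data([0.0, 0.0], 50)   # e.g. "narrowband channel"
class_b = make_data([2.0, 2.0], 50)   # e.g. "wideband channel"

def centroid(points):
    d = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(d)]

def predict(x, ca, cb):
    # nearest-centroid decision by squared Euclidean distance
    da = sum((xi - ci) ** 2 for xi, ci in zip(x, ca))
    db = sum((xi - ci) ** 2 for xi, ci in zip(x, cb))
    return "a" if da < db else "b"

ca, cb = centroid(class_a), centroid(class_b)
correct = sum(predict(x, ca, cb) == "a" for x in class_a)
correct += sum(predict(x, ca, cb) == "b" for x in class_b)
accuracy = correct / 100
print(accuracy)  # high accuracy -> the property is easily recoverable
```

Layer-wise analysis repeats this for each layer's representations, and neuron-level analysis restricts the probe's input to subsets of neurons to localize where the property is encoded.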
arXiv Detail & Related papers (2021-07-01T13:32:55Z)
- Learning Physical Graph Representations from Visual Scenes [56.7938395379406]
Physical Scene Graphs (PSGs) represent scenes as hierarchical graphs with nodes corresponding intuitively to object parts at different scales, and edges to physical connections between parts.
PSGNet augments standard CNNs by including: recurrent feedback connections to combine low- and high-level image information; and graph pooling and vectorization operations that convert spatially uniform feature maps into object-centric graph structures.
We show that PSGNet outperforms alternative self-supervised scene representation algorithms at scene segmentation tasks.
arXiv Detail & Related papers (2020-06-22T16:10:26Z)
- Ventral-Dorsal Neural Networks: Object Detection via Selective Attention [51.79577908317031]
We propose a new framework called Ventral-Dorsal Networks (VDNets).
Inspired by the structure of the human visual system, VDNets integrate a "Ventral Network" and a "Dorsal Network".
Our experimental results reveal that the proposed method outperforms state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-04-03T22:01:41Z)
- Self-Supervised Viewpoint Learning From Image Collections [116.56304441362994]
We propose a novel learning framework which incorporates an analysis-by-synthesis paradigm to reconstruct images in a viewpoint-aware manner.
We show that our approach performs competitively to fully-supervised approaches for several object categories like human faces, cars, buses, and trains.
arXiv Detail & Related papers (2020-04-03T22:01:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.