Natural Language Descriptions of Deep Visual Features
- URL: http://arxiv.org/abs/2201.11114v1
- Date: Wed, 26 Jan 2022 18:48:02 GMT
- Title: Natural Language Descriptions of Deep Visual Features
- Authors: Evan Hernandez, Sarah Schwettmann, David Bau, Teona Bagashvili,
Antonio Torralba, and Jacob Andreas
- Abstract summary: We introduce a procedure that automatically labels neurons with open-ended, compositional, natural language descriptions.
We use MILAN for analysis, characterizing the distribution and importance of neurons selective for attribute, category, and relational information in vision models.
We also use MILAN for auditing, surfacing neurons sensitive to protected categories like race and gender in models trained on datasets intended to obscure these features.
- Score: 50.270035018478666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Some neurons in deep networks specialize in recognizing highly specific
perceptual, structural, or semantic features of inputs. In computer vision,
techniques exist for identifying neurons that respond to individual concept
categories like colors, textures, and object classes. But these techniques are
limited in scope, labeling only a small subset of neurons and behaviors in any
network. Is a richer characterization of neuron-level computation possible? We
introduce a procedure (called MILAN, for mutual-information-guided linguistic
annotation of neurons) that automatically labels neurons with open-ended,
compositional, natural language descriptions. Given a neuron, MILAN generates a
description by searching for a natural language string that maximizes pointwise
mutual information with the image regions in which the neuron is active. MILAN
produces fine-grained descriptions that capture categorical, relational, and
logical structure in learned features. These descriptions obtain high agreement
with human-generated feature descriptions across a diverse set of model
architectures and tasks, and can aid in understanding and controlling learned
models. We highlight three applications of natural language neuron
descriptions. First, we use MILAN for analysis, characterizing the distribution
and importance of neurons selective for attribute, category, and relational
information in vision models. Second, we use MILAN for auditing, surfacing
neurons sensitive to protected categories like race and gender in models
trained on datasets intended to obscure these features. Finally, we use MILAN
for editing, improving robustness in an image classifier by deleting neurons
sensitive to text features spuriously correlated with class labels.
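At the core of MILAN is a pointwise mutual information objective: a good description d for a neuron's exemplar regions E should be likely given the exemplars but unlikely a priori, i.e. it should maximize log p(d | E) - log p(d), which penalizes generic strings. The sketch below shows only this reranking step; the two scoring callables stand in for the paper's trained captioner and language-model prior, and all names here are illustrative assumptions rather than taken from the released code.

```python
import math

def rank_descriptions(candidates, log_p_given_exemplars, log_p_prior):
    """Pick the candidate description d with the highest pointwise mutual
    information PMI(d; E) = log p(d | E) - log p(d).

    candidates: list of natural-language strings.
    log_p_given_exemplars: d -> log p(d | E), e.g. from a captioner run on
        the neuron's top-activating masked image regions.
    log_p_prior: d -> log p(d), e.g. from a language model; subtracting it
        penalizes generic strings like "an object".
    """
    best, best_pmi = None, -math.inf
    for d in candidates:
        pmi = log_p_given_exemplars(d) - log_p_prior(d)
        if pmi > best_pmi:
            best, best_pmi = d, pmi
    return best, best_pmi

# Toy usage with hand-set log-probabilities: the generic phrase is likely
# under both models, so PMI favors the specific description.
log_cond = {"an object": -1.0, "the tops of trees": -1.5}.__getitem__
log_prior = {"an object": -0.5, "the tops of trees": -6.0}.__getitem__
print(rank_descriptions(["an object", "the tops of trees"], log_cond, log_prior))
# -> ('the tops of trees', 4.5)
```

The subtraction of the prior is what makes the objective favor descriptions that are specific to the exemplars rather than merely fluent.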
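The editing application deletes neurons whose descriptions mention text features; mechanically, this amounts to zeroing the corresponding channels at inference time. Below is a minimal, generic PyTorch sketch of such an ablation, not MILAN's released pipeline: the model, layer, and channel indices are placeholder assumptions.

```python
import torch
from torchvision import models

model = models.resnet50(weights=None).eval()

# Hypothetical indices of layer4 channels whose generated descriptions
# mention written text (e.g. "letters on a sign"); in practice these
# would be selected by filtering the descriptions.
TEXT_CHANNELS = [12, 87, 403]

def zero_text_channels(module, inputs, output):
    # Zero the ablated channels' activation maps before they reach
    # downstream layers.
    output[:, TEXT_CHANNELS] = 0.0
    return output

handle = model.layer4.register_forward_hook(zero_text_channels)
with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))
handle.remove()
```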
Related papers
- Learning Multimodal Volumetric Features for Large-Scale Neuron Tracing [72.45257414889478]
We aim to reduce human workload by predicting connectivity between over-segmented neuron pieces.
We first construct a dataset, named FlyTracing, that contains millions of pairwise connections between segments spanning the whole fly brain.
We propose a novel connectivity-aware contrastive learning method to generate dense volumetric EM image embeddings.
arXiv Detail & Related papers (2024-01-05T19:45:12Z)
- Investigating the Encoding of Words in BERT's Neurons using Feature Textualization [11.943486282441143]
We propose a technique to produce representations of neurons in word embedding space.
We find that the produced representations can provide insights into the knowledge encoded in individual neurons.
arXiv Detail & Related papers (2023-11-14T15:21:49Z)
- Identifying Interpretable Visual Features in Artificial and Biological Neural Systems [3.604033202771937]
Single neurons in neural networks are often interpretable in that they represent individual, intuitively meaningful features.
Many neurons exhibit mixed selectivity, i.e., they represent multiple unrelated features.
We propose an automated method for quantifying visual interpretability and an approach for finding meaningful directions in network activation space.
arXiv Detail & Related papers (2023-10-17T17:41:28Z)
- Constraints on the design of neuromorphic circuits set by the properties of neural population codes [61.15277741147157]
In the brain, information is encoded, transmitted and used to inform behaviour.
Neuromorphic circuits need to encode information in a way compatible with the codes used by populations of neurons in the brain.
arXiv Detail & Related papers (2022-12-08T15:16:04Z)
- Discovering Salient Neurons in Deep NLP Models [31.18937787704794]
We present a technique called Linguistic Correlation Analysis to extract salient neurons from a model.
Our data-driven, quantitative analysis yields several interesting findings.
Our code is publicly available as part of the NeuroX toolkit.
arXiv Detail & Related papers (2022-06-27T13:31:49Z)
- Low-Dimensional Structure in the Space of Language Representations is Reflected in Brain Responses [62.197912623223964]
We show a low-dimensional structure where language models and translation models smoothly interpolate between word embeddings, syntactic and semantic tasks, and future word embeddings.
We find that this representation embedding can predict how well each individual feature space maps to human brain responses to natural language stimuli recorded using fMRI.
This suggests that the embedding captures some part of the brain's natural language representation structure.
arXiv Detail & Related papers (2021-06-09T22:59:12Z)
- Analyzing Individual Neurons in Pre-trained Language Models [41.07850306314594]
We find that small subsets of neurons suffice to predict linguistic tasks, with lower-level tasks localized in fewer neurons than the higher-level task of predicting syntax.
For example, neurons in XLNet are more localized and disjoint when predicting linguistic properties, whereas in BERT and other models they are more distributed and coupled.
arXiv Detail & Related papers (2020-10-06T13:17:38Z)
- Compositional Explanations of Neurons [52.71742655312625]
We describe a procedure for explaining neurons in deep representations by identifying compositional logical concepts; a rough sketch of this style of search appears below.
We use this procedure to answer several questions about interpretability in models for vision and natural language processing.
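As a rough illustration of that style of procedure (an assumption-laden sketch, not the paper's implementation), the snippet below brute-forces short logical formulas over binary concept masks and scores each against a neuron's binarized activation mask by intersection-over-union:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boolean masks."""
    union = np.logical_or(a, b).sum()
    return float(np.logical_and(a, b).sum() / union) if union else 0.0

def best_explanation(neuron_mask, concepts):
    """Search formulas C, (C1 AND C2), (C1 OR C2), (C1 AND NOT C2)
    for the one whose mask best matches the neuron mask by IoU."""
    best_form, best_score = None, -1.0
    items = list(concepts.items())
    for n1, m1 in items:
        candidates = [(n1, m1)]
        for n2, m2 in items:
            if n1 == n2:
                continue
            candidates += [
                (f"({n1} AND {n2})", np.logical_and(m1, m2)),
                (f"({n1} OR {n2})", np.logical_or(m1, m2)),
                (f"({n1} AND NOT {n2})", np.logical_and(m1, ~m2)),
            ]
        for form, mask in candidates:
            score = iou(neuron_mask, mask)
            if score > best_score:
                best_form, best_score = form, score
    return best_form, best_score

# Toy usage: a "neuron" that fires on water regions that are not sky.
water = np.array([1, 1, 0, 0], dtype=bool)
sky = np.array([0, 1, 1, 0], dtype=bool)
neuron = np.array([1, 0, 0, 0], dtype=bool)
print(best_explanation(neuron, {"water": water, "sky": sky}))
# -> ('(water AND NOT sky)', 1.0)
```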
arXiv Detail & Related papers (2020-06-24T20:37:05Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)