Extracting low-dimensional psychological representations from
convolutional neural networks
- URL: http://arxiv.org/abs/2005.14363v1
- Date: Fri, 29 May 2020 01:29:39 GMT
- Title: Extracting low-dimensional psychological representations from
convolutional neural networks
- Authors: Aditi Jha, Joshua Peterson, Thomas L. Griffiths
- Abstract summary: We present a method for reducing neural network representations to a low-dimensional space which is still predictive of similarity judgments.
We show that these low-dimensional representations also provide insightful explanations of factors underlying human similarity judgments.
- Score: 10.269997499911666
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks are increasingly being used in cognitive modeling as a
means of deriving representations for complex stimuli such as images. While the
predictive power of these networks is high, it is often not clear whether they
also offer useful explanations of the task at hand. Convolutional neural
network representations have been shown to be predictive of human similarity
judgments for images after appropriate adaptation. However, these
high-dimensional representations are difficult to interpret. Here we present a
method for reducing these representations to a low-dimensional space which is
still predictive of similarity judgments. We show that these low-dimensional
representations also provide insightful explanations of factors underlying
human similarity judgments.
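To make the recipe described in the abstract concrete, here is a minimal sketch of one way to obtain such a low-dimensional, similarity-predictive space. It is not the authors' released code: it assumes a similarity model of the kind commonly used to adapt CNN features to human judgments (similarity as a reweighted inner product of feature vectors) and uses an L1 penalty so that only a few feature dimensions keep non-zero weight. The feature matrix, similarity ratings, and the penalty strength are illustrative placeholders.

    # Minimal illustrative sketch (hypothetical inputs, not the paper's code):
    # learn a sparse reweighting of CNN feature dimensions so that the reweighted
    # inner product of two images' features predicts their rated similarity,
    # then keep only the dimensions with non-zero weight as the reduced space.
    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n_stimuli, n_dims = 60, 512
    features = rng.normal(size=(n_stimuli, n_dims))        # stand-in for CNN activations per image
    similarity = rng.uniform(size=(n_stimuli, n_stimuli))  # stand-in for human similarity ratings

    # One regression row per stimulus pair: predictors are elementwise products of
    # the two feature vectors, so the model is s_ij ≈ sum_k w_k * f_ik * f_jk.
    rows, targets = [], []
    for i in range(n_stimuli):
        for j in range(i + 1, n_stimuli):
            rows.append(features[i] * features[j])
            targets.append(similarity[i, j])
    X, y = np.asarray(rows), np.asarray(targets)

    # The L1 penalty zeroes out most dimension weights; the few that survive
    # define a low-dimensional, more interpretable representation.
    model = Lasso(alpha=0.05, max_iter=10_000).fit(X, y)
    kept = np.flatnonzero(model.coef_)
    low_dim = features[:, kept]
    print(f"kept {kept.size} of {n_dims} dimensions")

Because each retained dimension carries an explicit weight, the reduced space can be inspected dimension by dimension, which is what makes such representations candidates for explaining the factors underlying human similarity judgments.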
Related papers
- Local vs distributed representations: What is the right basis for interpretability? [19.50614357801837]
We show that features obtained from sparse distributed representations are easier for human observers to interpret.
Our results highlight that distributed representations constitute a superior basis for interpretability.
arXiv Detail & Related papers (2024-11-06T15:34:57Z)
- Coding schemes in neural networks learning classification tasks [52.22978725954347]
We investigate fully-connected, wide neural networks learning classification tasks.
We show that the networks acquire strong, data-dependent features.
Surprisingly, the nature of the internal representations depends crucially on the neuronal nonlinearity.
arXiv Detail & Related papers (2024-06-24T14:50:05Z)
- Finding Concept Representations in Neural Networks with Self-Organizing Maps [2.817412580574242]
We show how self-organizing maps can be used to inspect how the activations of neural network layers correspond to neural representations of abstract concepts.
We show that, among the measures tested, the relative entropy of the activation map for a concept is a suitable candidate and can be used as part of a methodology to identify and locate the neural representation of a concept.
arXiv Detail & Related papers (2023-12-10T12:10:34Z)
- Image segmentation with traveling waves in an exactly solvable recurrent neural network [71.74150501418039]
We show that a recurrent neural network can effectively divide an image into groups according to a scene's structural characteristics.
We present a precise description of the mechanism underlying object segmentation in this network.
We then demonstrate a simple algorithm for object segmentation that generalizes across inputs ranging from simple geometric objects in grayscale images to natural images.
arXiv Detail & Related papers (2023-11-28T16:46:44Z)
- Seeing in Words: Learning to Classify through Language Bottlenecks [59.97827889540685]
Humans can explain their predictions using succinct and intuitive descriptions.
We show that a vision model whose feature representations are text can effectively classify ImageNet images.
arXiv Detail & Related papers (2023-06-29T00:24:42Z)
- Searching for the Essence of Adversarial Perturbations [73.96215665913797]
We show that adversarial perturbations contain human-recognizable information, which is the key conspirator responsible for a neural network's erroneous prediction.
This concept of human-recognizable information allows us to explain key features related to adversarial perturbations.
arXiv Detail & Related papers (2022-05-30T18:04:57Z)
- Aesthetics and neural network image representations [0.0]
We analyze the spaces of images encoded by generative neural networks of the BigGAN architecture.
We find that generic multiplicative perturbations of neural network parameters away from the photo-realistic point often lead to networks generating images which appear as "artistic renditions" of the corresponding objects.
arXiv Detail & Related papers (2021-09-16T16:50:22Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Transforming Neural Network Visual Representations to Predict Human Judgments of Similarity [12.5719993304358]
We investigate how to bring machine visual representations into better alignment with human representations.
We find that with appropriate linear transformations of deep embeddings, we can improve prediction of human binary choice.
arXiv Detail & Related papers (2020-10-13T16:09:47Z)
- The Representation Theory of Neural Networks [7.724617675868718]
We show that neural networks can be represented via the mathematical theory of quiver representations.
We show that network quivers gently adapt to common neural network concepts.
We also provide a quiver representation model to understand how a neural network creates representations from the data.
arXiv Detail & Related papers (2020-07-23T19:02:14Z)
- Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes [51.31334977346847]
We train networks to form coarse impressions based on the information in higher bit planes, and use the lower bit planes only to refine their prediction.
We demonstrate that, by imposing consistency on the representations learned across differently quantized images, the adversarial robustness of networks improves significantly.
arXiv Detail & Related papers (2020-04-01T09:31:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.