Universal dimensions of visual representation
- URL: http://arxiv.org/abs/2408.12804v1
- Date: Fri, 23 Aug 2024 02:48:44 GMT
- Title: Universal dimensions of visual representation
- Authors: Zirui Chen, Michael F. Bonner,
- Abstract summary: We characterized the universality of hundreds of thousands of representational dimensions from visual neural networks with varied construction.
We found that networks with varied architectures learn to represent natural images using a shared set of latent dimensions.
The most brain-aligned representations in neural networks are those that are universal and independent of a network's specific characteristics.
- Score: 0.8824340350342511
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Do neural network models of vision learn brain-aligned representations because they share architectural constraints and task objectives with biological vision or because they learn universal features of natural image processing? We characterized the universality of hundreds of thousands of representational dimensions from visual neural networks with varied construction. We found that networks with varied architectures and task objectives learn to represent natural images using a shared set of latent dimensions, despite appearing highly distinct at a surface level. Next, by comparing these networks with human brain representations measured with fMRI, we found that the most brain-aligned representations in neural networks are those that are universal and independent of a network's specific characteristics. Remarkably, each network can be reduced to fewer than ten of its most universal dimensions with little impact on its representational similarity to the human brain. These results suggest that the underlying similarities between artificial and biological vision are primarily governed by a core set of universal image representations that are convergently learned by diverse systems.
Related papers
- On the universality of neural encodings in CNNs [5.064404027153094]
We show that, for a range of layers of VGG-type networks, the learned eigenvectors appear to be universal across different natural image datasets.
They explain, at a more fundamental level, the success of transfer learning.
arXiv Detail & Related papers (2024-09-28T21:30:25Z) - Image segmentation with traveling waves in an exactly solvable recurrent
neural network [71.74150501418039]
We show that a recurrent neural network can effectively divide an image into groups according to a scene's structural characteristics.
We present a precise description of the mechanism underlying object segmentation in this network.
We then demonstrate a simple algorithm for object segmentation that generalizes across inputs ranging from simple geometric objects in grayscale images to natural images.
arXiv Detail & Related papers (2023-11-28T16:46:44Z) - Connecting metrics for shape-texture knowledge in computer vision [1.7785095623975342]
Deep neural networks remain brittle and susceptible to many changes in the image that do not cause humans to misclassify images.
Part of this different behavior may be explained by the type of features humans and deep neural networks use in vision tasks.
arXiv Detail & Related papers (2023-01-25T14:37:42Z) - Formal Conceptual Views in Neural Networks [0.0]
We introduce two notions for conceptual views of a neural network, specifically a many-valued and a symbolic view.
We test the conceptual expressivity of our novel views through different experiments on the ImageNet and Fruit-360 data sets.
We demonstrate how conceptual views can be applied for abductive learning of human comprehensible rules from neurons.
arXiv Detail & Related papers (2022-09-27T16:38:24Z) - A domain adaptive deep learning solution for scanpath prediction of
paintings [66.46953851227454]
This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings.
We introduce a new approach to predicting human visual attention, which impacts several cognitive functions for humans.
The proposed new architecture ingests images and returns scanpaths, a sequence of points featuring a high likelihood of catching viewers' attention.
arXiv Detail & Related papers (2022-09-22T22:27:08Z) - Peripheral Vision Transformer [52.55309200601883]
We take a biologically inspired approach and explore to model peripheral vision in deep neural networks for visual recognition.
We propose to incorporate peripheral position encoding to the multi-head self-attention layers to let the network learn to partition the visual field into diverse peripheral regions given training data.
We evaluate the proposed network, dubbed PerViT, on the large-scale ImageNet dataset and systematically investigate the inner workings of the model for machine perception.
arXiv Detail & Related papers (2022-06-14T12:47:47Z) - Prune and distill: similar reformatting of image information along rat
visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for its functional analogue in the brain, the ventral stream in visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z) - Functional2Structural: Cross-Modality Brain Networks Representation
Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z) - Grounding Psychological Shape Space in Convolutional Neural Networks [0.0]
We use convolutional neural networks to learn a generalizable mapping between perceptual inputs and a recently proposed psychological similarity space for the shape domain.
Our results indicate that a classification-based multi-task learning scenario yields the best results, but that its performance is relatively sensitive to the dimensionality of the similarity space.
arXiv Detail & Related papers (2021-11-16T12:21:07Z) - Aesthetics and neural network image representations [0.0]
We analyze the spaces of images encoded by generative neural networks of the BigGAN architecture.
We find that generic multiplicative perturbations of neural network parameters away from the photo-realistic point often lead to networks generating images which appear as "artistic renditions" of the corresponding objects.
arXiv Detail & Related papers (2021-09-16T16:50:22Z) - Understanding the Role of Individual Units in a Deep Neural Network [85.23117441162772]
We present an analytic framework to systematically identify hidden units within image classification and image generation networks.
First, we analyze a convolutional neural network (CNN) trained on scene classification and discover units that match a diverse set of object concepts.
Second, we use a similar analytic method to analyze a generative adversarial network (GAN) model trained to generate scenes.
arXiv Detail & Related papers (2020-09-10T17:59:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.