Category-Selective Neurons in Deep Networks: Comparing Purely Visual and Visual-Language Models
- URL: http://arxiv.org/abs/2502.16456v1
- Date: Sun, 23 Feb 2025 06:15:51 GMT
- Title: Category-Selective Neurons in Deep Networks: Comparing Purely Visual and Visual-Language Models
- Authors: Zitong Lu, Yuxin Wang
- Abstract summary: Category-selective regions in the human brain play a crucial role in high-level visual processing. We investigate whether artificial neural networks (ANNs) exhibit similar category-selective neurons. Our study provides insights into how ANNs mirror biological vision and how multimodal learning influences category-selective representations.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Category-selective regions in the human brain, such as the fusiform face area (FFA), extrastriate body area (EBA), parahippocampal place area (PPA), and visual word form area (VWFA), play a crucial role in high-level visual processing. Here, we investigate whether artificial neural networks (ANNs) exhibit similar category-selective neurons and how these neurons vary across model layers and between purely visual and vision-language models. Inspired by fMRI functional localizer experiments, we presented images from different categories (faces, bodies, scenes, words, scrambled scenes, and scrambled words) to deep networks and identified category-selective neurons using statistical criteria. Comparing ResNet and the structurally controlled ResNet-based CLIP model, we found that both models contain category-selective neurons, with their proportion increasing across layers, mirroring category selectivity in higher-level visual brain regions. However, CLIP exhibited a higher proportion but lower specificity of category-selective neurons compared to ResNet. Additionally, CLIP's category-selective neurons were more evenly distributed across feature maps and demonstrated greater representational consistency across layers. These findings suggest that language learning increases the number of category-selective neurons while reducing their selectivity strength, reshaping visual representations in deep networks. Our study provides insights into how ANNs mirror biological vision and how multimodal learning influences category-selective representations.
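The localizer-style selection procedure described in the abstract (present images from each category, then flag units that respond preferentially to one category under statistical criteria) can be sketched roughly as follows. This is a minimal illustration, not the authors' exact criteria: the two-sample t-test, the significance level, and the Cohen's d threshold used here are all assumptions.

```python
import numpy as np
from scipy import stats

def find_selective_units(activations, labels, target, alpha=0.05, d_min=0.5):
    """Flag units that respond more to `target` images than to all others.

    activations: (n_images, n_units) array of unit responses
    labels:      (n_images,) array of category labels
    Returns a boolean mask over units.
    """
    in_cat = activations[labels == target]   # responses to the target category
    out_cat = activations[labels != target]  # responses to all other categories
    # Per-unit two-sample t-test, analogous to an fMRI localizer contrast
    t, p = stats.ttest_ind(in_cat, out_cat, axis=0)
    # Cohen's d as an additional effect-size criterion
    pooled_sd = np.sqrt((in_cat.var(axis=0, ddof=1) + out_cat.var(axis=0, ddof=1)) / 2)
    d = (in_cat.mean(axis=0) - out_cat.mean(axis=0)) / (pooled_sd + 1e-12)
    return (t > 0) & (p < alpha) & (d > d_min)

# Toy example: 2 of 4 units respond strongly to "faces"
rng = np.random.default_rng(0)
labels = np.repeat(np.array(["faces", "scenes"]), 100)
acts = rng.normal(size=(200, 4))
acts[labels == "faces", :2] += 3.0  # units 0 and 1 prefer faces
mask = find_selective_units(acts, labels, "faces")
```

In a real analysis, `activations` would come from hooked intermediate layers of ResNet or CLIP's visual backbone, with one such contrast per category (faces, bodies, scenes, words) against the remaining conditions.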
Related papers
- Sharing Matters: Analysing Neurons Across Languages and Tasks in LLMs [70.3132264719438]
We aim to fill the research gap by examining how neuron activation is shared across tasks and languages.
We classify neurons into four distinct categories based on their responses to a specific input across different languages.
Our analysis reveals the following insights: (i) the patterns of neuron sharing are significantly affected by the characteristics of tasks and examples; (ii) neuron sharing does not fully correspond with language similarity; (iii) shared neurons play a vital role in generating responses, especially those shared across all languages.
arXiv Detail & Related papers (2024-06-13T16:04:11Z)
- Finding Shared Decodable Concepts and their Negations in the Brain [4.111712524255376]
We train a highly accurate contrastive model that maps brain responses during naturalistic image viewing to CLIP embeddings.
We then use a novel adaptation of the DBSCAN clustering algorithm to cluster the parameters of participant-specific contrastive models.
Examining the images most and least associated with each SDC cluster gives us additional insight into the semantic properties of each SDC.
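The clustering step above (grouping the parameters of participant-specific decoding models with DBSCAN to find shared decodable concepts) can be illustrated with a small sketch. This is a generic DBSCAN example on synthetic weight vectors, not the paper's adapted algorithm; the data shapes and `eps`/`min_samples` values are assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical input: rows are flattened decoder weight vectors pooled
# across participants; shared concepts appear as dense clusters.
rng = np.random.default_rng(1)
weights = np.vstack([
    rng.normal(0.0, 0.05, size=(20, 8)),  # one shared concept direction
    rng.normal(1.0, 0.05, size=(20, 8)),  # another shared direction
])

# Density-based clustering: points within eps of min_samples neighbors
# form clusters; isolated points are labeled -1 (noise).
clusters = DBSCAN(eps=0.5, min_samples=5).fit_predict(weights)
```

Images driving each resulting cluster would then be inspected to characterize the semantic content of that shared decodable concept.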
arXiv Detail & Related papers (2024-05-27T21:28:26Z)
- Parallel Backpropagation for Shared-Feature Visualization [36.31730251757713]
Recent work has shown that some out-of-category stimuli also activate neurons in high-level visual brain regions.
This may be due to visual features common among the preferred class also being present in other images.
Here, we propose a deep-learning-based approach for visualizing these features.
arXiv Detail & Related papers (2024-05-16T05:56:03Z)
- SPIN: Sparsifying and Integrating Internal Neurons in Large Language Models for Text Classification [6.227343685358882]
We present a model-agnostic framework that sparsifies and integrates internal neurons of intermediate layers of Large Language Models for text classification.
SPIN significantly improves text classification accuracy, efficiency, and interpretability.
arXiv Detail & Related papers (2023-11-27T16:28:20Z)
- Deep Spiking Neural Networks with High Representation Similarity Model Visual Pathways of Macaque and Mouse [17.545204435882816]
Spiking Neural Networks (SNNs) are more biologically plausible models since spiking neurons encode information with time sequences of spikes.
In this study, we model the visual cortex with deep SNNs for the first time, and also with a wide range of state-of-the-art deep CNNs and ViTs for comparison.
Almost all similarity scores of SNNs exceed those of their CNN counterparts, by 6.6% on average.
arXiv Detail & Related papers (2023-03-09T13:07:30Z)
- Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models of their functional analogue in the brain, the ventral stream of visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
- Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)
- Natural Language Descriptions of Deep Visual Features [50.270035018478666]
We introduce MILAN, a procedure that automatically labels neurons with open-ended, compositional, natural language descriptions.
We use MILAN for analysis, characterizing the distribution and importance of neurons selective for attribute, category, and relational information in vision models.
We also use MILAN for auditing, surfacing neurons sensitive to protected categories like race and gender in models trained on datasets intended to obscure these features.
arXiv Detail & Related papers (2022-01-26T18:48:02Z)
- Modeling Category-Selective Cortical Regions with Topographic Variational Autoencoders [72.15087604017441]
Category-selectivity describes the observation that certain spatially localized areas of the cerebral cortex tend to respond robustly and selectively to stimuli from specific limited categories.
We leverage the newly introduced Topographic Variational Autoencoder to model the emergence of such localized category-selectivity in an unsupervised manner.
We show preliminary results suggesting that our model yields a nested spatial hierarchy of increasingly abstract categories, analogous to observations from the human ventral temporal cortex.
arXiv Detail & Related papers (2021-10-25T11:37:41Z)
- The Selectivity and Competition of the Mind's Eye in Visual Perception [8.411385346896411]
We create a novel computational model that incorporates lateral and top down feedback in the form of hierarchical competition.
We show not only that these elements can help explain the information flow and selectivity of high-level areas within the brain, but also that these neural mechanisms provide the foundation of a novel classification framework.
arXiv Detail & Related papers (2020-11-23T01:55:46Z)
- Non-linear Neurons with Human-like Apical Dendrite Activations [81.18416067005538]
We show that a standard neuron followed by our novel apical dendrite activation (ADA) can learn the XOR logical function with 100% accuracy.
We conduct experiments on six benchmark data sets from computer vision, signal processing and natural language processing.
arXiv Detail & Related papers (2020-02-02T21:09:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.