Convergent transformations of visual representation in brains and models
- URL: http://arxiv.org/abs/2507.13941v1
- Date: Fri, 18 Jul 2025 14:13:54 GMT
- Title: Convergent transformations of visual representation in brains and models
- Authors: Pablo Marcos-Manchón, Lluís Fuentemilla
- Abstract summary: A fundamental question in cognitive neuroscience is what shapes visual perception: the external world's structure or the brain's internal architecture. We show a convergent computational solution for visual encoding in both human and artificial vision, driven by the structure of the external world.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A fundamental question in cognitive neuroscience is what shapes visual perception: the external world's structure or the brain's internal architecture. Although some perceptual variability can be traced to individual differences, brain responses to naturalistic stimuli evoke similar activity patterns across individuals, suggesting a convergent representational principle. Here, we test if this stimulus-driven convergence follows a common trajectory across people and deep neural networks (DNNs) during its transformation from sensory to high-level internal representations. We introduce a unified framework that traces representational flow by combining inter-subject similarity with alignment to model hierarchies. Applying this framework to three independent fMRI datasets of visual scene perception, we reveal a cortex-wide network, conserved across individuals, organized into two pathways: a medial-ventral stream for scene structure and a lateral-dorsal stream tuned for social and biological content. This functional organization is captured by the hierarchies of vision DNNs but not language models, reinforcing the specificity of the visual-to-semantic transformation. These findings show a convergent computational solution for visual encoding in both human and artificial vision, driven by the structure of the external world.
Related papers
- Representations in vision and language converge in a shared, multidimensional space of perceived similarities [0.0]
We show that visual and linguistic similarity judgements converge at the behavioural level. We also predict a remarkably similar network of fMRI brain responses evoked by viewing the natural scene images. These findings demonstrate that human visual and linguistic similarity judgements are grounded in a shared, modality-agnostic representational structure.
arXiv Detail & Related papers (2025-07-29T14:42:31Z)
- Concept-Guided Interpretability via Neural Chunking [54.73787666584143]
We show that neural networks exhibit patterns in their raw population activity that mirror regularities in the training data. We propose three methods to extract these emerging entities, complementing each other based on label availability and dimensionality. Our work points to a new direction for interpretability, one that harnesses both cognitive principles and the structure of naturalistic data.
arXiv Detail & Related papers (2025-05-16T13:49:43Z)
- Discovering Chunks in Neural Embeddings for Interpretability [53.80157905839065]
We propose leveraging the principle of chunking to interpret artificial neural population activities. We first demonstrate this concept in recurrent neural networks (RNNs) trained on artificial sequences with imposed regularities. We identify similar recurring embedding states corresponding to concepts in the input, with perturbations to these states activating or inhibiting the associated concepts.
arXiv Detail & Related papers (2025-02-03T20:30:46Z)
- Universal dimensions of visual representation [0.8824340350342511]
We characterized the universality of hundreds of thousands of representational dimensions from visual neural networks with varied construction. We found that networks with varied architectures learn to represent natural images using a shared set of latent dimensions. The most brain-aligned representations in neural networks are those that are universal and independent of a network's specific characteristics.
arXiv Detail & Related papers (2024-08-23T02:48:44Z)
- Interpretable Spatio-Temporal Embedding for Brain Structural-Effective Network with Ordinary Differential Equation [56.34634121544929]
In this study, we first construct the brain-effective network via the dynamic causal model.
We then introduce an interpretable graph learning framework termed Spatio-Temporal Embedding ODE (STE-ODE).
This framework incorporates specifically designed directed node embedding layers, aiming at capturing the dynamic interplay between structural and effective networks.
arXiv Detail & Related papers (2024-05-21T20:37:07Z)
- A Dual-Stream Neural Network Explains the Functional Segregation of Dorsal and Ventral Visual Pathways in Human Brains [8.24969449883056]
We develop a dual-stream vision model inspired by the human eyes and brain.
At the input level, the model samples two complementary visual patterns.
At the backend, the model processes the separate input patterns through two branches of convolutional neural networks.
arXiv Detail & Related papers (2023-10-20T22:47:40Z)
- Joint fMRI Decoding and Encoding with Latent Embedding Alignment [77.66508125297754]
We introduce a unified framework that addresses both fMRI decoding and encoding.
Our model concurrently recovers visual stimuli from fMRI signals and predicts brain activity from images within a unified framework.
arXiv Detail & Related papers (2023-03-26T14:14:58Z)
- Semantic Brain Decoding: from fMRI to conceptually similar image reconstruction of visual stimuli [0.29005223064604074]
We propose a novel approach to brain decoding that also relies on semantic and contextual similarity.
We employ an fMRI dataset of natural image vision and create a deep learning decoding pipeline inspired by the existence of both bottom-up and top-down processes in human vision.
We produce reconstructions of visual stimuli that match the original content very well on a semantic level, surpassing the state of the art in previous literature.
arXiv Detail & Related papers (2022-12-13T16:54:08Z)
- Prune and distill: similar reformatting of image information along rat visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for their functional analogue in the brain, the ventral stream in visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z)
- Functional2Structural: Cross-Modality Brain Networks Representation Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.