A Dual-Stream Neural Network Explains the Functional Segregation of
Dorsal and Ventral Visual Pathways in Human Brains
- URL: http://arxiv.org/abs/2310.13849v2
- Date: Mon, 20 Nov 2023 17:23:17 GMT
- Title: A Dual-Stream Neural Network Explains the Functional Segregation of
Dorsal and Ventral Visual Pathways in Human Brains
- Authors: Minkyu Choi, Kuan Han, Xiaokai Wang, Yizhen Zhang, Zhongming Liu
- Abstract summary: We develop a dual-stream vision model inspired by the human eyes and brain.
At the input level, the model samples two complementary visual patterns.
At the backend, the model processes the separate input patterns through two branches of convolutional neural networks.
- Score: 8.24969449883056
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The human visual system uses two parallel pathways for spatial processing and
object recognition. In contrast, computer vision systems tend to use a single
feedforward pathway, rendering them less robust, adaptive, or efficient than
human vision. To bridge this gap, we developed a dual-stream vision model
inspired by the human eyes and brain. At the input level, the model samples two
complementary visual patterns to mimic how the human eyes use magnocellular and
parvocellular retinal ganglion cells to separate retinal inputs to the brain.
At the backend, the model processes the separate input patterns through two
branches of convolutional neural networks (CNN) to mimic how the human brain
uses the dorsal and ventral cortical pathways for parallel visual processing.
The first branch (WhereCNN) samples a global view to learn spatial attention
and control eye movements. The second branch (WhatCNN) samples a local view to
represent the object around the fixation. Over time, the two branches interact
recurrently to build a scene representation from moving fixations. We compared
this model with the human brains processing the same movie and evaluated their
functional alignment by linear transformation. The WhereCNN and WhatCNN
branches were found to differentially match the dorsal and ventral pathways of
the visual cortex, respectively, primarily due to their different learning
objectives. These model-based results lead us to speculate that the distinct
responses and representations of the ventral and dorsal streams are more
influenced by their distinct goals in visual attention and object recognition
than by their specific bias or selectivity in retinal inputs. This dual-stream
model takes a further step in brain-inspired computer vision, enabling parallel
neural networks to actively explore and understand the visual surroundings.
Related papers
- BIMM: Brain Inspired Masked Modeling for Video Representation Learning [47.56270575865621]
We propose the Brain Inspired Masked Modeling (BIMM) framework, aiming to learn comprehensive representations from videos.
Specifically, our approach consists of ventral and dorsal branches, which learn image and video representations, respectively.
To achieve the goals of different visual cortices in the brain, we segment the encoder of each branch into three intermediate blocks and reconstruct progressive prediction targets with light weight decoders.
arXiv Detail & Related papers (2024-05-21T13:09:04Z) - Towards Two-Stream Foveation-based Active Vision Learning [7.14325008286629]
"Two-stream hypothesis" from neuroscience explains the neural processing in the human visual cortex as an active vision system.
We propose a machine learning framework inspired by the "two-stream hypothesis" and explore the potential benefits that it offers.
We show that the two-stream foveation-based learning is applicable to the challenging task of weakly-supervised object localization.
arXiv Detail & Related papers (2024-03-24T01:20:08Z) - System Identification of Neural Systems: Going Beyond Images to Modelling Dynamics [2.3825930751052358]
We propose the first large-scale study focused on comparing video understanding models with respect to the visual cortex recordings using video stimuli.
We provide key insights on how video understanding models predict visual cortex responses.
We propose a novel neural encoding scheme that is built on top of the best performing video understanding models.
arXiv Detail & Related papers (2024-02-19T20:29:49Z) - BI AVAN: Brain inspired Adversarial Visual Attention Network [67.05560966998559]
We propose a brain-inspired adversarial visual attention network (BI-AVAN) to characterize human visual attention directly from functional brain activity.
Our model imitates the biased competition process between attention-related/neglected objects to identify and locate the visual objects in a movie frame the human brain focuses on in an unsupervised manner.
arXiv Detail & Related papers (2022-10-27T22:20:36Z) - Human Eyes Inspired Recurrent Neural Networks are More Robust Against Adversarial Noises [7.689542442882423]
We designed a dual-stream vision model inspired by the human brain.
This model features retina-like input layers and includes two streams: one determining the next point of focus (the fixation), while the other interprets the visuals surrounding the fixation.
We evaluated this model against various benchmarks in terms of object recognition, gaze behavior and adversarial robustness.
arXiv Detail & Related papers (2022-06-15T03:44:42Z) - Peripheral Vision Transformer [52.55309200601883]
We take a biologically inspired approach and explore to model peripheral vision in deep neural networks for visual recognition.
We propose to incorporate peripheral position encoding to the multi-head self-attention layers to let the network learn to partition the visual field into diverse peripheral regions given training data.
We evaluate the proposed network, dubbed PerViT, on the large-scale ImageNet dataset and systematically investigate the inner workings of the model for machine perception.
arXiv Detail & Related papers (2022-06-14T12:47:47Z) - Prune and distill: similar reformatting of image information along rat
visual cortex and deep neural networks [61.60177890353585]
Deep convolutional neural networks (CNNs) have been shown to provide excellent models for its functional analogue in the brain, the ventral stream in visual cortex.
Here we consider some prominent statistical patterns that are known to exist in the internal representations of either CNNs or the visual cortex.
We show that CNNs and visual cortex share a similarly tight relationship between dimensionality expansion/reduction of object representations and reformatting of image information.
arXiv Detail & Related papers (2022-05-27T08:06:40Z) - Functional2Structural: Cross-Modality Brain Networks Representation
Learning [55.24969686433101]
Graph mining on brain networks may facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases.
We propose a novel graph learning framework, known as Deep Signed Brain Networks (DSBN), with a signed graph encoder.
We validate our framework on clinical phenotype and neurodegenerative disease prediction tasks using two independent, publicly available datasets.
arXiv Detail & Related papers (2022-05-06T03:45:36Z) - Visual Attention Network [90.0753726786985]
We propose a novel large kernel attention (LKA) module to enable self-adaptive and long-range correlations in self-attention.
We also introduce a novel neural network based on LKA, namely Visual Attention Network (VAN)
VAN outperforms the state-of-the-art vision transformers and convolutional neural networks with a large margin in extensive experiments.
arXiv Detail & Related papers (2022-02-20T06:35:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.