System Identification of Neural Systems: Going Beyond Images to Modelling Dynamics
- URL: http://arxiv.org/abs/2402.12519v2
- Date: Sun, 13 Oct 2024 19:16:02 GMT
- Title: System Identification of Neural Systems: Going Beyond Images to Modelling Dynamics
- Authors: Mai Gamal, Mohamed Rashad, Eman Ehab, Seif Eldawlatly, Mennatullah Siam,
- Abstract summary: We propose the first large-scale study focused on comparing video understanding models with respect to the visual cortex recordings using video stimuli.
We provide key insights on how video understanding models predict visual cortex responses.
We propose a novel neural encoding scheme that is built on top of the best performing video understanding models.
- Score: 2.3825930751052358
- License:
- Abstract: Extensive literature has drawn comparisons between recordings of biological neurons in the brain and deep neural networks. This comparative analysis aims to advance and interpret deep neural networks and enhance our understanding of biological neural systems. However, previous works did not consider the time aspect and how the encoding of video and dynamics in deep networks relate to the biological neural systems within a large-scale comparison. Towards this end, we propose the first large-scale study focused on comparing video understanding models with respect to the visual cortex recordings using video stimuli. The study encompasses more than two million regression fits, examining image vs. video understanding, convolutional vs. transformer-based and fully vs. self-supervised models. Additionally, we propose a novel neural encoding scheme to better encode biological neural systems. We provide key insights on how video understanding models predict visual cortex responses; showing video understanding better than image understanding models, convolutional models are better in the early-mid visual cortical regions than transformer based ones except for multiscale transformers, and that two-stream models are better than single stream. Furthermore, we propose a novel neural encoding scheme that is built on top of the best performing video understanding models, while incorporating inter-intra region connectivity across the visual cortex. Our neural encoding leverages the encoded dynamics from video stimuli, through utilizing two-stream networks and multiscale transformers, while taking connectivity priors into consideration. Our results show that merging both intra and inter-region connectivity priors increases the encoding performance over each one of them standalone or no connectivity priors. It also shows the necessity for encoding dynamics to fully benefit from such connectivity priors.
Related papers
- Unsupervised representation learning with Hebbian synaptic and structural plasticity in brain-like feedforward neural networks [0.0]
We introduce and evaluate a brain-like neural network model capable of unsupervised representation learning.
The model was tested on a diverse set of popular machine learning benchmarks.
arXiv Detail & Related papers (2024-06-07T08:32:30Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - A Dual-Stream Neural Network Explains the Functional Segregation of
Dorsal and Ventral Visual Pathways in Human Brains [8.24969449883056]
We develop a dual-stream vision model inspired by the human eyes and brain.
At the input level, the model samples two complementary visual patterns.
At the backend, the model processes the separate input patterns through two branches of convolutional neural networks.
arXiv Detail & Related papers (2023-10-20T22:47:40Z) - Multimodal Neurons in Pretrained Text-Only Transformers [52.20828443544296]
We identify "multimodal neurons" that convert visual representations into corresponding text.
We show that multimodal neurons operate on specific visual concepts across inputs, and have a systematic causal effect on image captioning.
arXiv Detail & Related papers (2023-08-03T05:27:12Z) - Visio-Linguistic Brain Encoding [3.944020612420711]
We systematically explore the efficacy of image Transformers and multi-modal Transformers for brain encoding.
We find that VisualBERT, a multi-modal Transformer, significantly outperforms previously proposed single-mode CNNs.
The supremacy of visio-linguistic models raises the question of whether the responses elicited in the visual regions are affected implicitly by linguistic processing.
arXiv Detail & Related papers (2022-04-18T11:28:18Z) - Improving Neural Predictivity in the Visual Cortex with Gated Recurrent
Connections [0.0]
We aim to shift the focus on architectures that take into account lateral recurrent connections, a ubiquitous feature of the ventral visual stream, to devise adaptive receptive fields.
In order to increase the robustness of our approach and the biological fidelity of the activations, we employ specific data augmentation techniques.
arXiv Detail & Related papers (2022-03-22T17:27:22Z) - Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image.
The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z) - A Coding Framework and Benchmark towards Low-Bitrate Video Understanding [63.05385140193666]
We propose a traditional-neural mixed coding framework that takes advantage of both traditional codecs and neural networks (NNs)
The framework is optimized by ensuring that a transportation-efficient semantic representation of the video is preserved.
We build a low-bitrate video understanding benchmark with three downstream tasks on eight datasets, demonstrating the notable superiority of our approach.
arXiv Detail & Related papers (2022-02-06T16:29:15Z) - Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z) - Deep Auto-encoder with Neural Response [8.797970797884023]
We propose a hybrid model, called deep auto-encoder with the neural response (DAE-NR)
The DAE-NR incorporates the information from the visual cortex into ANNs to achieve better image reconstruction and higher neural representation similarity between biological and artificial neurons.
Our experiments demonstrate that if and only if with the joint learning, DAE-NRs can (i.e., improve the performance of image reconstruction) and (ii. increase the representational similarity between biological neurons and artificial neurons.
arXiv Detail & Related papers (2021-11-30T11:44:17Z) - Neural Human Video Rendering by Learning Dynamic Textures and
Rendering-to-Video Translation [99.64565200170897]
We propose a novel human video synthesis method by explicitly disentangling the learning of time-coherent fine-scale details from the embedding of the human in 2D screen space.
We show several applications of our approach, such as human reenactment and novel view synthesis from monocular video, where we show significant improvement over the state of the art both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-01-14T18:06:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.