Neural encoding and interpretation for high-level visual cortices based on fMRI using image caption features
- URL: http://arxiv.org/abs/2003.11797v1
- Date: Thu, 26 Mar 2020 08:47:21 GMT
- Title: Neural encoding and interpretation for high-level visual cortices based on fMRI using image caption features
- Authors: Kai Qiao, Chi Zhang, Jian Chen, Linyuan Wang, Li Tong, Bin Yan
- Abstract summary: This study introduces a higher-level vision task, image captioning (IC), and proposes a visual encoding model based on IC features (ICFVEM) to encode voxels in high-level visual cortices.
- Score: 14.038605815510145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: On the basis of functional magnetic resonance imaging (fMRI), researchers design visual encoding models to predict human neural activity in response to presented image stimuli and to analyze the inner mechanisms of the human visual cortices. Deep network models, composed of hierarchical processing layers, learn representations of data for a specific task from large datasets. Their powerful, hierarchical representations have brought breakthroughs in visual encoding, while revealing a structural similarity to the hierarchical manner of information processing in the human visual cortices. However, previous studies almost exclusively used image features from deep networks pre-trained on the classification task to construct visual encoding models. Besides the network structure, the training task and its corresponding dataset are also important to a deep network model, yet they have been neglected by previous studies. Because image classification is a relatively fundamental task, it is difficult for it to guide deep networks toward high-level semantic representations of data, which limits encoding performance for high-level visual cortices. In this study, we introduce a higher-level vision task, image captioning (IC), and propose a visual encoding model based on IC features (ICFVEM) to encode voxels in high-level visual cortices. Experiments demonstrated that ICFVEM achieves better encoding performance than previous deep network models pre-trained on the classification task. In addition, we interpret voxels by visualizing their associated semantic words to explore their detailed characteristics, and a comparative analysis implies that high-level visual cortices exhibit correlated representations of image content.
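The abstract describes ICFVEM only at a high level; as a hedged illustration of the general recipe it implies (features from a pretrained image-captioning network, followed by a regularized linear readout per voxel), here is a minimal sketch. All shapes, names, and the ridge readout are assumptions for illustration, not the authors' released code.

```python
# Minimal sketch of a feature-based voxelwise encoding model in the spirit of
# ICFVEM. Everything here is an illustrative assumption: `caption_features`
# stands in for activations taken from a pretrained image-captioning network.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_features, n_voxels = 1000, 512, 200

# Placeholder data: IC features per stimulus image, fMRI response per voxel.
caption_features = rng.standard_normal((n_images, n_features))
voxel_responses = rng.standard_normal((n_images, n_voxels))

X_tr, X_te, y_tr, y_te = train_test_split(
    caption_features, voxel_responses, test_size=0.2, random_state=0)

# Regularized linear readout (RidgeCV handles multi-output targets); encoding
# performance is the per-voxel correlation between predicted and measured
# responses on held-out images.
model = RidgeCV(alphas=np.logspace(-2, 4, 13)).fit(X_tr, y_tr)
pred = model.predict(X_te)
r = [np.corrcoef(pred[:, v], y_te[:, v])[0, 1] for v in range(n_voxels)]
print(f"median held-out voxel correlation: {np.median(r):.3f}")
```

Swapping `caption_features` for activations from a classification-pretrained network would reproduce the kind of baseline comparison the abstract reports.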
Related papers
- Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z)
- MindTuner: Cross-Subject Visual Decoding with Visual Fingerprint and Semantic Correction [21.531569319105877]
Reconstructing high-quality images in cross-subject tasks is a challenging problem due to profound individual differences between subjects.
MindTuner achieves high-quality and rich-semantic reconstructions using only 1 hour of fMRI training data.
arXiv Detail & Related papers (2024-04-19T05:12:04Z)
- Emergent Language Symbolic Autoencoder (ELSA) with Weak Supervision to Model Hierarchical Brain Networks [0.12075823996747355]
Brain networks display a hierarchical organization, a complexity that poses a challenge for existing deep learning models.
We propose a symbolic autoencoder informed by weak supervision and an Emergent Language (EL) framework.
Our innovation includes a generalized hierarchical loss function designed to ensure that both sentences and images accurately reflect the hierarchical structure of functional brain networks.
arXiv Detail & Related papers (2024-04-15T13:51:05Z)
- See Through Their Minds: Learning Transferable Neural Representation from Cross-Subject fMRI [32.40827290083577]
Deciphering visual content from functional Magnetic Resonance Imaging (fMRI) helps illuminate the human vision system.
Previous approaches primarily employ subject-specific models, which are sensitive to training sample size.
We propose shallow subject-specific adapters to map cross-subject fMRI data into unified representations.
During training, we leverage both visual and textual supervision for multi-modal brain decoding.
arXiv Detail & Related papers (2024-03-11T01:18:49Z)
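A hedged sketch of the shallow subject-specific adapter idea summarized above: each subject gets its own small linear map from voxel space into a shared representation, and everything downstream is shared across subjects. Module names, dimensions, and the decoder head are illustrative assumptions, not the paper's code.

```python
import torch
import torch.nn as nn

class CrossSubjectDecoder(nn.Module):
    def __init__(self, voxel_dims: dict, shared_dim: int = 1024, out_dim: int = 512):
        super().__init__()
        # One shallow adapter per subject; the backbone is shared across subjects.
        self.adapters = nn.ModuleDict(
            {sid: nn.Linear(dim, shared_dim) for sid, dim in voxel_dims.items()})
        self.backbone = nn.Sequential(
            nn.LayerNorm(shared_dim), nn.GELU(), nn.Linear(shared_dim, out_dim))

    def forward(self, fmri: torch.Tensor, subject_id: str) -> torch.Tensor:
        # Map subject-specific voxels into the unified space, then decode.
        return self.backbone(self.adapters[subject_id](fmri))

model = CrossSubjectDecoder({"sub01": 15000, "sub02": 14200})
x = torch.randn(8, 15000)     # a batch of flattened fMRI patterns from sub01
z = model(x, "sub01")         # unified representation, e.g. for a CLIP-space loss
print(z.shape)                # torch.Size([8, 512])
```

This design keeps per-subject parameters minimal (one linear layer), so a new subject with little data can be added by fitting only an adapter.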
- Image complexity based fMRI-BOLD visual network categorization across visual datasets using topological descriptors and deep-hybrid learning [3.522950356329991]
The aim of this study is to examine how network topology differs in response to distinct visual stimuli from visual datasets.
To achieve this, 0- and 1-dimensional persistence diagrams are computed for each visual network representing COCO, ImageNet, and SUN.
The extracted K-means cluster features are fed to a novel deep-hybrid model that yields accuracy in the range of 90%-95% in classifying these visual networks.
arXiv Detail & Related papers (2023-11-03T14:05:57Z)
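A rough sketch of the topological pipeline summarized above, under stated assumptions: random point clouds stand in for the visual networks, `ripser` (from the `ripser` package) computes the 0- and 1-dimensional persistence diagrams, and K-means cluster centers serve as the fixed-length features. The exact network construction and the deep-hybrid classifier are beyond this sketch.

```python
import numpy as np
from ripser import ripser          # pip install ripser
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def persistence_features(points: np.ndarray, k: int = 5) -> np.ndarray:
    """Compute H0/H1 persistence diagrams and summarize them with K-means."""
    dgms = ripser(points, maxdim=1)["dgms"]   # [H0 diagram, H1 diagram]
    # Drop infinite bars, pool both diagrams as (birth, persistence) pairs.
    pairs = np.vstack([d[np.isfinite(d[:, 1])] for d in dgms])
    pairs[:, 1] -= pairs[:, 0]
    # Cluster centers act as a fixed-length feature vector for a classifier.
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pairs)
    return km.cluster_centers_.ravel()

# Placeholder "visual networks" for three datasets (e.g. COCO/ImageNet/SUN).
feats = [persistence_features(rng.standard_normal((60, 3))) for _ in range(3)]
print(np.stack(feats).shape)   # (3, 10): ready for a downstream classifier
```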
- Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z)
- Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z)
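In the zero-shot spirit of the CLIP entry above, a minimal sketch that scores an image's "look" with an antonym prompt pair. The prompt pairing and softmax readout are common CLIP-IQA-style choices assumed here, not necessarily this paper's exact protocol; the checkpoint is the public openai/clip-vit-base-patch32 and `photo.jpg` is a placeholder path.

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("photo.jpg").convert("RGB")   # any test image
prompts = ["a good photo", "a bad photo"]        # antonym pair for "look"

inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image    # shape (1, 2)
score = logits.softmax(dim=-1)[0, 0].item()      # P("a good photo")
print(f"zero-shot quality score: {score:.3f}")
```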
- HistoTransfer: Understanding Transfer Learning for Histopathology [9.231495418218813]
We compare the performance of features extracted from networks trained on ImageNet and histopathology data.
We investigate whether features learned by more complex networks lead to a gain in performance.
arXiv Detail & Related papers (2021-06-13T18:55:23Z)
- Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications.
The visual features produced by our encoder, termed Generative Hierarchical Features (GH-Feat), have strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z)
- Retinopathy of Prematurity Stage Diagnosis Using Object Segmentation and Convolutional Neural Networks [68.96150598294072]
Retinopathy of Prematurity (ROP) is an eye disorder primarily affecting premature infants with lower weights.
It causes proliferation of vessels in the retina and could result in vision loss and, eventually, retinal detachment, leading to blindness.
In recent years, there has been a significant effort to automate the diagnosis using deep learning.
This paper builds upon the success of previous models and develops a novel architecture that combines object segmentation and convolutional neural networks (CNNs).
Our proposed system first trains an object segmentation model to identify the demarcation line at a pixel level and adds the resulting mask as an additional "color" channel in the input image.
arXiv Detail & Related papers (2020-04-03T14:07:41Z)
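The last sentence above describes a simple input-fusion trick: concatenate the predicted demarcation-line mask onto the RGB image as a fourth channel. A hedged NumPy sketch, with all shapes and names assumed:

```python
import numpy as np

def add_mask_channel(rgb: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Stack a binary segmentation mask onto an HxWx3 image as channel 4."""
    assert rgb.shape[:2] == mask.shape, "mask must match image resolution"
    return np.concatenate([rgb, mask[..., None].astype(rgb.dtype)], axis=-1)

rgb = np.zeros((480, 640, 3), dtype=np.float32)   # placeholder fundus image
mask = np.zeros((480, 640), dtype=np.float32)     # placeholder segmentation output
x = add_mask_channel(rgb, mask)
print(x.shape)   # (480, 640, 4)
```

The downstream CNN then only needs its first convolution widened to accept four input channels.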
- Image Segmentation Using Deep Learning: A Survey [58.37211170954998]
Image segmentation is a key topic in image processing and computer vision.
There has been a substantial amount of work aimed at developing image segmentation approaches using deep learning models.
arXiv Detail & Related papers (2020-01-15T21:37:47Z)