CLIP-MUSED: CLIP-Guided Multi-Subject Visual Neural Information Semantic
Decoding
- URL: http://arxiv.org/abs/2402.08994v1
- Date: Wed, 14 Feb 2024 07:41:48 GMT
- Title: CLIP-MUSED: CLIP-Guided Multi-Subject Visual Neural Information Semantic
Decoding
- Authors: Qiongyi Zhou, Changde Du, Shengpei Wang, Huiguang He
- Abstract summary: We propose a CLIP-guided Multi-sUbject visual neural information SEmantic Decoding (CLIP-MUSED) method.
Our method consists of a Transformer-based feature extractor to effectively model global neural representations.
It also incorporates learnable subject-specific tokens that facilitate the aggregation of multi-subject data.
- Score: 14.484475792279671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The study of decoding visual neural information faces challenges in
generalizing single-subject decoding models to multiple subjects, due to
individual differences. Moreover, the limited availability of data from a
single subject has a constraining impact on model performance. Although prior
multi-subject decoding methods have made significant progress, they still
suffer from several limitations, including difficulty in extracting global
neural response features, linear scaling of model parameters with the number of
subjects, and inadequate characterization of the relationship between neural
responses of different subjects to various stimuli. To overcome these
limitations, we propose a CLIP-guided Multi-sUbject visual neural information
SEmantic Decoding (CLIP-MUSED) method. Our method consists of a
Transformer-based feature extractor to effectively model global neural
representations. It also incorporates learnable subject-specific tokens that
facilitate the aggregation of multi-subject data without a linear increase in
parameters. Additionally, we employ representational similarity analysis (RSA)
to guide token representation learning based on the topological relationship of
visual stimuli in the representation space of CLIP, enabling full
characterization of the relationship between neural responses of different
subjects under different stimuli. Finally, token representations are used for
multi-subject semantic decoding. Our proposed method outperforms single-subject
decoding methods and achieves state-of-the-art performance among the existing
multi-subject methods on two fMRI datasets. Visualization results provide
insights into the effectiveness of our proposed method. Code is available at
https://github.com/CLIP-MUSED/CLIP-MUSED.
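The RSA step in the abstract compares the pairwise-dissimilarity structure of neural token representations with that of CLIP's stimulus embeddings. The following is a minimal NumPy sketch of that idea, not the authors' implementation: the helper names `rdm` and `rsa_alignment` are assumptions, and the actual method uses this alignment as a differentiable training loss rather than a post-hoc score.

```python
import numpy as np

def rdm(features):
    """Representational dissimilarity matrix: 1 - Pearson correlation
    between every pair of stimulus representations (rows)."""
    z = features - features.mean(axis=1, keepdims=True)
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    return 1.0 - z @ z.T

def rsa_alignment(neural_tokens, clip_embeddings):
    """Correlation between the upper triangles of the two RDMs; a value
    near 1 means the neural token space mirrors the topology of the
    visual stimuli in CLIP's representation space."""
    iu = np.triu_indices(len(neural_tokens), k=1)
    a, b = rdm(neural_tokens)[iu], rdm(clip_embeddings)[iu]
    return np.corrcoef(a, b)[0, 1]
```

In practice RSA often uses Spearman rather than Pearson correlation between RDMs; Pearson is used here only to keep the sketch dependency-free.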
Related papers
- LLM4Brain: Training a Large Language Model for Brain Video Understanding [9.294352205183726]
We introduce an LLM-based approach for reconstructing visual-semantic information from fMRI signals elicited by video stimuli.
We employ fine-tuning techniques on an fMRI encoder equipped with adaptors to transform brain responses into latent representations aligned with the video stimuli.
In particular, we integrate self-supervised domain adaptation methods to enhance the alignment between visual-semantic information and brain responses.
arXiv Detail & Related papers (2024-09-26T15:57:08Z)
- Single-Shared Network with Prior-Inspired Loss for Parameter-Efficient Multi-Modal Imaging Skin Lesion Classification [6.195015783344803]
We introduce a multi-modal approach that efficiently integrates multi-scale clinical and dermoscopy features within a single network.
Our method exhibits superiority in both accuracy and model parameters compared to currently advanced methods.
arXiv Detail & Related papers (2024-03-28T08:00:14Z)
- Manipulating Feature Visualizations with Gradient Slingshots [54.31109240020007]
We introduce a novel method for manipulating Feature Visualization (FV) without significantly impacting the model's decision-making process.
We evaluate the effectiveness of our method on several neural network models and demonstrate its capabilities to hide the functionality of arbitrarily chosen neurons.
arXiv Detail & Related papers (2024-01-11T18:57:17Z)
- Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning [53.00683059396803]
Masked image modeling (MIM) has been widely used due to its simplicity and effectiveness in recovering original information from masked images.
We propose a decision-based MIM that utilizes reinforcement learning (RL) to automatically search for optimal image masking ratio and masking strategy.
Our approach has a significant advantage over alternative self-supervised methods on the task of neuron segmentation.
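The core MIM operation the summary builds on can be sketched as below. This is a minimal illustration under stated assumptions: `mask_patches` is a hypothetical helper, the masking ratio is fixed here, whereas the paper's contribution is precisely to search for the ratio and strategy with reinforcement learning.

```python
import numpy as np

def mask_patches(image, patch=4, ratio=0.6, rng=None):
    """Minimal masked-image-modeling step: zero out a random subset of
    non-overlapping square patches; a model would then be trained to
    reconstruct the masked regions."""
    if rng is None:
        rng = np.random.default_rng()
    h, w = image.shape
    gh, gw = h // patch, w // patch
    n_masked = int(round(gh * gw * ratio))
    idx = rng.choice(gh * gw, size=n_masked, replace=False)
    masked = image.copy()
    for i in idx:
        r, c = divmod(i, gw)
        masked[r * patch:(r + 1) * patch, c * patch:(c + 1) * patch] = 0.0
    return masked, idx
```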
arXiv Detail & Related papers (2023-10-06T10:40:46Z)
- Convolutional neural network based on sparse graph attention mechanism for MRI super-resolution [0.34410212782758043]
Medical image super-resolution (SR) reconstruction using deep learning techniques can enhance lesion analysis and assist doctors in improving diagnostic efficiency and accuracy.
Existing deep learning-based SR methods rely on convolutional neural networks (CNNs), which inherently limit the expressive capabilities of these models.
We propose an A-network that utilizes multiple convolution operator feature extraction modules (MCO) for extracting image features.
arXiv Detail & Related papers (2023-05-29T06:14:22Z)
- Dynamic Latent Separation for Deep Learning [67.62190501599176]
A core problem in machine learning is to learn expressive latent variables for model prediction on complex data.
Here, we develop an approach that improves expressiveness, provides partial interpretation, and is not restricted to specific applications.
arXiv Detail & Related papers (2022-10-07T17:56:53Z)
- Deep Co-Attention Network for Multi-View Subspace Learning [73.3450258002607]
We propose a deep co-attention network for multi-view subspace learning.
It aims to extract both the common information and the complementary information in an adversarial setting.
In particular, it uses a novel cross reconstruction loss and leverages the label information to guide the construction of the latent representation.
arXiv Detail & Related papers (2021-02-15T18:46:44Z)
- Deep Representational Similarity Learning for analyzing neural signatures in task-based fMRI dataset [81.02949933048332]
This paper develops Deep Representational Similarity Learning (DRSL), a deep extension of Representational Similarity Analysis (RSA).
DRSL is appropriate for analyzing similarities between various cognitive tasks in fMRI datasets with a large number of subjects.
arXiv Detail & Related papers (2020-09-28T18:30:14Z)
- A shared neural encoding model for the prediction of subject-specific fMRI response [17.020869686284165]
We propose a shared convolutional neural encoding method that accounts for individual-level differences.
Our method leverages multi-subject data to improve the prediction of subject-specific responses evoked by visual or auditory stimuli.
arXiv Detail & Related papers (2020-06-29T04:10:14Z)
- Modeling Shared Responses in Neuroimaging Studies through MultiView ICA [94.31804763196116]
Group studies involving large cohorts of subjects are important to draw general conclusions about brain functional organization.
We propose a novel MultiView Independent Component Analysis model for group studies, where data from each subject are modeled as a linear combination of shared independent sources plus noise.
We demonstrate the usefulness of our approach first on fMRI data, where our model demonstrates improved sensitivity in identifying common sources among subjects.
arXiv Detail & Related papers (2020-06-11T17:29:53Z)
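The MultiView ICA generative model described above, where each subject's data is a linear combination of shared independent sources plus noise, can be simulated in a few lines. This is a hedged sketch of the generative assumption only; the dimensions, noise level, and Laplacian source distribution are illustrative choices, and the paper's actual contribution is the inference procedure, which is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sources, n_time, n_subjects = 4, 500, 3

# Shared independent sources; a super-Gaussian (Laplace) distribution is a
# common modeling choice in ICA, assumed here for illustration.
s = rng.laplace(size=(n_sources, n_time))

# Each subject observes a different linear mixture of the shared sources
# plus subject-specific noise: x_i = A_i @ s + n_i
subject_data = []
for _ in range(n_subjects):
    A_i = rng.standard_normal((n_sources, n_sources))  # subject-specific mixing
    noise = 0.1 * rng.standard_normal((n_sources, n_time))
    subject_data.append(A_i @ s + noise)
```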
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.