Mapping EEG Signals to Visual Stimuli: A Deep Learning Approach to Match vs. Mismatch Classification
- URL: http://arxiv.org/abs/2309.04153v2
- Date: Wed, 22 Nov 2023 01:36:14 GMT
- Title: Mapping EEG Signals to Visual Stimuli: A Deep Learning Approach to Match vs. Mismatch Classification
- Authors: Yiqian Yang, Zhengqiao Zhao, Qian Wang, Yan Yang, Jingdong Chen
- Abstract summary: We propose a "match-vs-mismatch" deep learning model to classify whether a video clip induces excitatory responses in recorded EEG signals.
We demonstrate that the proposed model achieves the highest accuracy on unseen subjects compared with baseline models.
These results have the potential to facilitate the development of neural recording-based video reconstruction.
- Score: 28.186129896907694
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing approaches to modeling associations between visual stimuli and brain
responses face difficulties in handling between-subject variance and model
generalization. Inspired by recent progress in modeling speech-brain responses,
we propose in this work a "match-vs-mismatch" deep learning model to classify
whether a video clip induces excitatory responses in recorded EEG signals and
to learn associations between the visual content and the corresponding neural
recordings. Using an exclusive experimental dataset, we demonstrate that the
proposed model achieves the highest accuracy on unseen subjects compared with
other baseline models. Furthermore, we analyze inter-subject noise using a
subject-level silhouette score in the embedding space and show that the
developed model mitigates inter-subject noise, significantly reducing the
silhouette score. Moreover, we examine the Grad-CAM activation scores and show
that the brain regions associated with language processing contribute most to
the model predictions, followed by regions associated with visual processing.
These results have the potential to facilitate the development of neural
recording-based video reconstruction and its related applications.
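The abstract does not specify the network architecture, so the following is only a minimal PyTorch sketch of the match-vs-mismatch setup: an EEG encoder and a video-feature encoder project the two modalities into a shared embedding space, and a binary head scores whether the pair is matched. All names and dimensions (64 electrodes, 512 time samples, 2048-dim precomputed clip features) are hypothetical placeholders, not the authors' choices.

```python
import torch
import torch.nn as nn

class MatchMismatchModel(nn.Module):
    """Illustrative match-vs-mismatch sketch; layer sizes are placeholders."""

    def __init__(self, n_channels=64, n_samples=512, video_dim=2048, embed_dim=128):
        super().__init__()
        # EEG branch: temporal convolution, global average pooling, projection.
        self.eeg_encoder = nn.Sequential(
            nn.Conv1d(n_channels, 64, kernel_size=25, padding=12),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(64, embed_dim),
        )
        # Video branch: project precomputed clip features (e.g., from a
        # pretrained vision backbone) into the same embedding space.
        self.video_encoder = nn.Sequential(
            nn.Linear(video_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        # Binary head over the concatenated pair: match vs. mismatch.
        self.classifier = nn.Linear(2 * embed_dim, 1)

    def forward(self, eeg, video_feats):
        # eeg: (batch, n_channels, n_samples); video_feats: (batch, video_dim)
        e = self.eeg_encoder(eeg)
        v = self.video_encoder(video_feats)
        return self.classifier(torch.cat([e, v], dim=-1)).squeeze(-1)

model = MatchMismatchModel()
logits = model(torch.randn(8, 64, 512), torch.randn(8, 2048))
loss = nn.functional.binary_cross_entropy_with_logits(logits, torch.ones(8))
```

Training would pair each EEG segment with the clip that evoked it (label 1) and with a randomly drawn clip (label 0), and evaluation would report pair-classification accuracy on held-out subjects.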
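The subject-level silhouette analysis can be approximated with scikit-learn by treating subject identity as the cluster label over the learned embeddings: a lower score means embeddings from different subjects are less separable, i.e., less inter-subject noise survives in the representation. A sketch on synthetic stand-in data:

```python
import numpy as np
from sklearn.metrics import silhouette_score

# Stand-ins for the trained model's embeddings and the subject ID of
# each EEG segment; in the paper, a lower subject-level silhouette
# score indicates reduced inter-subject noise.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(300, 128))     # (n_segments, embed_dim)
subject_ids = rng.integers(0, 10, size=300)  # one subject label per segment

score = silhouette_score(embeddings, subject_ids, metric="euclidean")
print(f"subject-level silhouette score: {score:.3f}")
```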
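The Grad-CAM analysis can likewise be sketched only generically. Continuing from the MatchMismatchModel sketch above, the standard recipe hooks a convolutional layer, weights its activations by their average gradients with respect to the output logit, and keeps the positive part. This toy version yields a saliency map over time samples; attributing scores to electrodes grouped by brain region, as the paper reports, would require a spatially resolved layer and is not shown here.

```python
import torch

# Generic 1-D Grad-CAM over the EEG conv layer of the sketch above;
# an illustration of the recipe, not the paper's attribution pipeline.
activations, gradients = {}, {}
conv = model.eeg_encoder[0]
conv.register_forward_hook(lambda m, i, o: activations.update(a=o))
conv.register_full_backward_hook(lambda m, gi, go: gradients.update(g=go[0]))

eeg = torch.randn(1, 64, 512)
video = torch.randn(1, 2048)
model(eeg, video).sum().backward()

# Weight each feature map by its average gradient, sum, then ReLU.
weights = gradients["g"].mean(dim=2, keepdim=True)          # (1, 64, 1)
cam = torch.relu((weights * activations["a"]).sum(dim=1))   # (1, 512)
cam = cam / (cam.max() + 1e-8)  # normalized saliency over time samples
```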
Related papers
- Human-Object Interaction Detection Collaborated with Large Relation-driven Diffusion Models [65.82564074712836]
We introduce DIFfusionHOI, a new HOI detector that leverages text-to-image diffusion models.
We first devise an inversion-based strategy to learn how relation patterns between humans and objects are expressed in embedding space.
These learned relation embeddings then serve as textual prompts that steer diffusion models to generate images depicting specific interactions.
arXiv Detail & Related papers (2024-10-26T12:00:33Z)
- Visual Neural Decoding via Improved Visual-EEG Semantic Consistency [3.4061238650474657]
Methods that directly map EEG features to the CLIP embedding space may introduce mapping bias and cause semantic inconsistency.
We propose a Visual-EEG Semantic Decouple Framework that explicitly extracts the semantic-related features of these two modalities to facilitate optimal alignment.
Our method achieves state-of-the-art results in zero-shot neural decoding tasks.
arXiv Detail & Related papers (2024-08-13T10:16:10Z)
- Investigating the Timescales of Language Processing with EEG and Language Models [0.0]
This study explores the temporal dynamics of language processing by examining the alignment between word representations from a pre-trained language model and EEG data.
Using a Temporal Response Function (TRF) model, we investigate how neural activity corresponds to model representations across different layers.
Our analysis reveals patterns in TRFs from distinct layers, highlighting varying contributions to lexical and compositional processing.
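For context, a temporal response function of this kind is typically estimated as a time-lagged ridge regression from stimulus features (here, representations derived from the language model) to the EEG signal. A minimal sketch with hypothetical dimensions and synthetic data:

```python
import numpy as np

# TRF as time-lagged ridge regression onto one EEG channel.
rng = np.random.default_rng(0)
T, F, L = 5000, 16, 32          # time points, feature dim, time lags
stim = rng.normal(size=(T, F))  # e.g., per-sample LM-derived features
eeg = rng.normal(size=T)        # one EEG channel

# Lagged design matrix: column block l holds stim delayed by l samples.
X = np.zeros((T, F * L))
for lag in range(L):
    X[lag:, lag * F:(lag + 1) * F] = stim[:T - lag]

# Closed-form ridge solution: w = (X'X + alpha*I)^{-1} X'y
alpha = 1.0
w = np.linalg.solve(X.T @ X + alpha * np.eye(F * L), X.T @ eeg)
trf = w.reshape(L, F)           # response kernel over lags, per feature
```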
arXiv Detail & Related papers (2024-06-28T12:49:27Z)
- Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z)
- Self-supervised models of audio effectively explain human cortical responses to speech [71.57870452667369]
We capitalize on the progress of self-supervised speech representation learning to create new state-of-the-art models of the human auditory system.
These results show that self-supervised models effectively capture the hierarchy of information relevant to different stages of speech processing in the human cortex.
arXiv Detail & Related papers (2022-05-27T22:04:02Z)
- Continuous-Time Audiovisual Fusion with Recurrence vs. Attention for In-The-Wild Affect Recognition [4.14099371030604]
We present our submission to the 3rd Affective Behavior Analysis in-the-wild (ABAW) challenge.
Recurrence and attention are the two widely used sequence modelling mechanisms in the literature.
We show that LSTM-RNNs can outperform the attention models when coupled with low-complex CNN backbones.
arXiv Detail & Related papers (2022-03-24T18:22:56Z)
- Mixed Effects Neural ODE: A Variational Approximation for Analyzing the Dynamics of Panel Data [50.23363975709122]
We propose a probabilistic model called ME-NODE to incorporate (fixed + random) mixed effects for analyzing panel data.
We show that our model can be derived using smooth approximations of SDEs provided by the Wong-Zakai theorem.
We then derive evidence-based lower bounds for ME-NODE and develop efficient training algorithms.
arXiv Detail & Related papers (2022-02-18T22:41:51Z)
- Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way for achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z)
- Correlation based Multi-phasal models for improved imagined speech EEG recognition [22.196642357767338]
This work aims to profit from the parallel information contained in multi-phasal EEG data recorded while speaking, imagining and performing articulatory movements corresponding to specific speech units.
A bi-phase common representation learning module using neural networks is designed to model the correlation between an analysis phase and a support phase.
The proposed approach further handles the non-availability of multi-phasal data during decoding.
arXiv Detail & Related papers (2020-11-04T09:39:53Z)
- A shared neural encoding model for the prediction of subject-specific fMRI response [17.020869686284165]
We propose a shared convolutional neural encoding method that accounts for individual-level differences.
Our method leverages multi-subject data to improve the prediction of subject-specific responses evoked by visual or auditory stimuli.
arXiv Detail & Related papers (2020-06-29T04:10:14Z)
- Object Relational Graph with Teacher-Recommended Learning for Video Captioning [92.48299156867664]
We propose a complete video captioning system including both a novel model and an effective training strategy.
Specifically, we propose an object relational graph (ORG) based encoder, which captures more detailed interaction features to enrich visual representation.
Meanwhile, we design a teacher-recommended learning (TRL) method to make full use of the successful external language model (ELM) to integrate the abundant linguistic knowledge into the caption model.
arXiv Detail & Related papers (2020-02-26T15:34:52Z)