Mind's Eye: Image Recognition by EEG via Multimodal Similarity-Keeping Contrastive Learning
- URL: http://arxiv.org/abs/2406.16910v1
- Date: Wed, 5 Jun 2024 16:42:23 GMT
- Title: Mind's Eye: Image Recognition by EEG via Multimodal Similarity-Keeping Contrastive Learning
- Authors: Chi-Sheng Chen, Chun-Shu Wei
- Abstract summary: This paper introduces a MUltimodal Similarity-keeping contrastivE learning framework for zero-shot EEG-based image classification.
We develop a series of multivariate time-series encoders tailored for EEG signals and assess the efficacy of regularized contrastive EEG-Image pretraining.
Our method achieves state-of-the-art performance, with a top-1 accuracy of 19.3% and a top-5 accuracy of 48.8% in 200-way zero-shot image classification.
- Score: 2.087148326341881
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decoding images from non-invasive electroencephalographic (EEG) signals has been a grand challenge in understanding how the human brain processes visual information in real-world scenarios. To cope with the issues of signal-to-noise ratio and nonstationarity, this paper introduces a MUltimodal Similarity-keeping contrastivE learning (MUSE) framework for zero-shot EEG-based image classification. We develop a series of multivariate time-series encoders tailored for EEG signals and assess the efficacy of regularized contrastive EEG-Image pretraining using an extensive visual EEG dataset. Our method achieves state-of-the-art performance, with a top-1 accuracy of 19.3% and a top-5 accuracy of 48.8% in 200-way zero-shot image classification. Furthermore, we visualize neural patterns via model interpretation, shedding light on the visual processing dynamics in the human brain. The code repository for this work is available at: https://github.com/ChiShengChen/MUSE_EEG.
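The top-1 and top-5 figures quoted in the abstract come from an N-way zero-shot evaluation. As a rough illustration only, the sketch below assumes that each EEG trial and each candidate class have already been mapped to embedding vectors, and that a trial is assigned to the classes with the highest cosine similarity. The function names, the toy 3-way data, and the use of cosine similarity are assumptions made for this example; they are not taken from the paper or the MUSE_EEG repository.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_k_accuracy(eeg_embeddings, class_embeddings, labels, k):
    """Fraction of trials whose true class is among the k most similar classes."""
    hits = 0
    for emb, label in zip(eeg_embeddings, labels):
        scores = [cosine(emb, c) for c in class_embeddings]
        # Rank class indices from most to least similar to this trial.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Hypothetical toy data: 3 candidate classes and 3 EEG trials (not real recordings).
class_embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
eeg_embeddings = [[0.9, 0.1, 0.0], [0.4, 0.5, 0.1], [0.6, 0.1, 0.5]]
labels = [0, 1, 2]

top1 = top_k_accuracy(eeg_embeddings, class_embeddings, labels, k=1)
top2 = top_k_accuracy(eeg_embeddings, class_embeddings, labels, k=2)
print(f"top-1: {top1:.3f}, top-2: {top2:.3f}")
```

In the setting described by the abstract, the candidate set would contain 200 classes and k would be 1 or 5; chance level is then 0.5% for top-1 and 2.5% for top-5.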
Related papers
- Autoregressive Sequence Modeling for 3D Medical Image Representation [48.706230961589924]
We introduce a pioneering method for learning 3D medical image representations through an autoregressive sequence pre-training framework.
Our approach represents various 3D medical images based on spatial, contrast, and semantic correlations, treating them as interconnected visual tokens within a token sequence.
arXiv Detail & Related papers (2024-09-13T10:19:10Z) - EEG-ImageNet: An Electroencephalogram Dataset and Benchmarks with Image Visual Stimuli of Multi-Granularity Labels [12.783945503890962]
We introduce EEG-ImageNet, a novel EEG dataset comprising recordings from 16 subjects exposed to 4000 images selected from the ImageNet dataset.
EEG-ImageNet contains 5 times as many EEG-image pairs as existing similar EEG benchmarks.
Based on it, we establish benchmarks for object classification and image reconstruction. Experiments with several commonly used models show that the best models can achieve object classification with accuracy around 60% and image reconstruction with two-way identification around 64%.
arXiv Detail & Related papers (2024-06-11T10:52:17Z) - Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models [49.3179290313959]
The proposed method, emotion-centered generative replay (ECgr), tackles this challenge by integrating synthetic images from generative adversarial networks.
ECgr incorporates a quality assurance algorithm to ensure the fidelity of generated images.
The experimental results on four diverse facial expression datasets demonstrate that incorporating images generated by our pseudo-rehearsal method enhances training on the targeted dataset and the source dataset.
arXiv Detail & Related papers (2024-04-18T15:28:34Z) - Learning Robust Deep Visual Representations from EEG Brain Recordings [13.768240137063428]
This study proposes a two-stage method where the first step is to obtain EEG-derived features for robust learning of deep representations.
We demonstrate the generalizability of our feature extraction pipeline across three different datasets using deep-learning architectures.
We propose a novel framework to transform unseen images into the EEG space and reconstruct them with approximation.
arXiv Detail & Related papers (2023-10-25T10:26:07Z) - A Knowledge-Driven Cross-view Contrastive Learning for EEG
Representation [48.85731427874065]
This paper proposes a knowledge-driven cross-view contrastive learning framework (KDC2) to extract effective representations from EEG with limited labels.
The KDC2 method creates scalp and neural views of EEG signals, simulating the internal and external representation of brain activity.
By modeling prior neural knowledge based on neural information consistency theory, the proposed method extracts invariant and complementary neural knowledge to generate combined representations.
arXiv Detail & Related papers (2023-09-21T08:53:51Z) - Decoding visual brain representations from electroencephalography
through Knowledge Distillation and latent diffusion models [0.12289361708127873]
We present an innovative method that employs to classify and reconstruct images from the ImageNet dataset using electroencephalography (EEG) data.
We analyzed EEG recordings from 6 participants, each exposed to 50 images spanning 40 unique semantic categories.
We incorporated an image reconstruction mechanism based on pre-trained latent diffusion models, which allowed us to generate an estimate of the images which had elicited EEG activity.
arXiv Detail & Related papers (2023-09-08T09:13:50Z) - Decoding Natural Images from EEG for Object Recognition [8.411976038504589]
This paper presents a self-supervised framework to demonstrate the feasibility of learning image representations from EEG signals.
We achieve a top-1 accuracy of 15.6% and a top-5 accuracy of 42.8% in challenging 200-way zero-shot tasks.
These findings yield valuable insights for neural decoding and brain-computer interfaces in real-world scenarios.
arXiv Detail & Related papers (2023-08-25T08:05:37Z) - Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as the Controllable Mind Visual Diffusion Model (CMVDM).
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z) - Multi-Domain Norm-referenced Encoding Enables Data Efficient Transfer Learning of Facial Expression Recognition [62.997667081978825]
We propose a biologically-inspired mechanism for transfer learning in facial expression recognition.
Our proposed architecture provides an explanation for how the human brain might innately recognize facial expressions on varying head shapes.
Our model achieves a classification accuracy of 92.15% on the FERG dataset with extreme data efficiency.
arXiv Detail & Related papers (2023-04-05T09:06:30Z) - EEG-based Image Feature Extraction for Visual Classification using Deep Learning [0.0]
We develop an efficient way of encoding EEG signals as images to facilitate a more subtle understanding of brain signals with deep learning models.
Our image classification approach with combined EEG features achieved an accuracy of 82% compared to the slightly better accuracy of a pure deep learning approach.
arXiv Detail & Related papers (2022-09-27T00:50:56Z) - Exploring CLIP for Assessing the Look and Feel of Images [87.97623543523858]
We introduce Contrastive Language-Image Pre-training (CLIP) models for assessing both the quality perception (look) and abstract perception (feel) of images in a zero-shot manner.
Our results show that CLIP captures meaningful priors that generalize well to different perceptual assessments.
arXiv Detail & Related papers (2022-07-25T17:58:16Z)