Category-aware EEG image generation based on wavelet transform and contrast semantic loss
- URL: http://arxiv.org/abs/2505.24301v1
- Date: Fri, 30 May 2025 07:24:58 GMT
- Title: Category-aware EEG image generation based on wavelet transform and contrast semantic loss
- Authors: Enshang Zhang, Zhicheng Zhang, Takashi Hanakawa
- Abstract summary: We propose a transformer-based EEG signal encoder integrating the Discrete Wavelet Transform (DWT) and the gating mechanism. Guided by the feature alignment and category-aware fusion losses, this encoder is used to extract features related to visual stimuli from EEG signals. With the aid of a pre-trained diffusion model, these features are reconstructed into visual stimuli.
- Score: 4.165508411354963
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reconstructing visual stimuli from EEG signals is a crucial step in realizing brain-computer interfaces. In this paper, we propose a transformer-based EEG signal encoder integrating the Discrete Wavelet Transform (DWT) and the gating mechanism. Guided by the feature alignment and category-aware fusion losses, this encoder is used to extract features related to visual stimuli from EEG signals. Subsequently, with the aid of a pre-trained diffusion model, these features are reconstructed into visual stimuli. To verify the effectiveness of the model, we conducted EEG-to-image generation and classification tasks using the THINGS-EEG dataset. To address the limitations of quantitative analysis at the semantic level, we combined WordNet-based classification and semantic similarity metrics to propose a novel semantic-based score, emphasizing the ability of our model to transfer neural activities into visual representations. Experimental results show that our model significantly improves semantic alignment and classification accuracy, achieving a maximum single-subject accuracy of 43% and outperforming other state-of-the-art methods. The source code and supplementary material are available at https://github.com/zes0v0inn/DWT_EEG_Reconstruction/tree/main.
Related papers
- BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals [50.76802709706976]
This paper proposes BrainOmni, the first brain foundation model that generalises across heterogeneous EEG and MEG recordings. To unify diverse data sources, we introduce BrainTokenizer, the first tokenizer that quantises neural brain activity into discrete representations. A total of 1,997 hours of EEG and 656 hours of MEG data are curated and standardised from publicly available sources for pretraining.
arXiv Detail & Related papers (2025-05-18T14:07:14Z) - CognitionCapturer: Decoding Visual Stimuli From Human EEG Signal With Multimodal Information [61.1904164368732]
We propose CognitionCapturer, a unified framework that fully leverages multimodal data to represent EEG signals. Specifically, CognitionCapturer trains Modality Experts for each modality to extract cross-modal information from the EEG modality. The framework does not require any fine-tuning of the generative models and can be extended to incorporate more modalities.
arXiv Detail & Related papers (2024-12-13T16:27:54Z) - NECOMIMI: Neural-Cognitive Multimodal EEG-informed Image Generation with Diffusion Models [0.0]
NECOMIMI introduces a novel framework for generating images directly from EEG signals using advanced diffusion models.
The proposed NERV EEG encoder demonstrates state-of-the-art (SoTA) performance across multiple zero-shot classification tasks.
We introduce the CAT Score as a new metric tailored for EEG-to-image evaluation and establish a benchmark on the ThingsEEG dataset.
arXiv Detail & Related papers (2024-10-01T14:05:30Z) - CSLP-AE: A Contrastive Split-Latent Permutation Autoencoder Framework for Zero-Shot Electroencephalography Signal Conversion [49.1574468325115]
A key aim in EEG analysis is to extract the underlying neural activation (content) as well as to account for the individual subject variability (style).
Inspired by recent advancements in voice conversion technologies, we propose a novel contrastive split-latent permutation autoencoder (CSLP-AE) framework that directly optimizes for EEG conversion.
arXiv Detail & Related papers (2023-11-13T22:46:43Z) - DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection [49.196182908826565]
Auditory Attention Detection (AAD) aims to detect the target speaker from brain signals in a multi-speaker environment.
Current approaches primarily rely on traditional convolutional neural networks designed for processing Euclidean data like images.
This paper proposes a dynamical graph self-distillation (DGSD) approach for AAD, which does not require speech stimuli as input.
arXiv Detail & Related papers (2023-09-07T13:43:46Z) - A Hybrid End-to-End Spatio-Temporal Attention Neural Network with Graph-Smooth Signals for EEG Emotion Recognition [1.6328866317851187]
We introduce a deep neural network that acquires interpretable representations by a hybrid structure of network-temporal encoding and recurrent attention blocks.
We demonstrate that our proposed architecture exceeds state-of-the-art results for emotion classification on the publicly available DEAP dataset.
arXiv Detail & Related papers (2023-07-06T15:35:14Z) - Semantic Image Synthesis via Diffusion Models [174.24523061460704]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks. Recent work on semantic image synthesis mainly follows the de facto GAN-based approaches. We propose a novel framework based on DDPM for semantic image synthesis.
arXiv Detail & Related papers (2022-06-30T18:31:51Z) - Hybrid Routing Transformer for Zero-Shot Learning [83.64532548391]
This paper presents a novel transformer encoder-decoder model, called hybrid routing transformer (HRT).
We embed an active attention, which is constructed by both the bottom-up and the top-down dynamic routing pathways to generate the attribute-aligned visual feature.
In the HRT decoder, we use static routing to calculate the correlation among the attribute-aligned visual features, the corresponding attribute semantics, and the class attribute vectors to generate the final class label predictions.
arXiv Detail & Related papers (2022-03-29T07:55:08Z) - GANSER: A Self-supervised Data Augmentation Framework for EEG-based Emotion Recognition [15.812231441367022]
We propose a novel data augmentation framework, namely Generative Adversarial Network-based Self-supervised Data Augmentation (GANSER).
As the first to combine adversarial training with self-supervised learning for EEG-based emotion recognition, the proposed framework can generate high-quality simulated EEG samples.
A transformation function is employed to mask parts of EEG signals and force the generator to synthesize potential EEG signals based on the remaining parts.
arXiv Detail & Related papers (2021-09-07T14:42:55Z) - EEG-ConvTransformer for Single-Trial EEG based Visual Stimuli Classification [5.076419064097734]
This work introduces an EEG-ConvTransformer network that is based on multi-headed self-attention.
It achieves improved classification accuracy over the state-of-the-art techniques across five different visual stimuli classification tasks.
arXiv Detail & Related papers (2021-07-08T17:22:04Z) - ScalingNet: extracting features from raw EEG data for emotion recognition [4.047737925426405]
We propose a novel convolutional layer that adaptively extracts effective data-driven spectrogram-like features from raw EEG signals.
The proposed neural network architecture based on the scaling layer, referred to as ScalingNet, achieves state-of-the-art results on the established DEAP benchmark dataset.
arXiv Detail & Related papers (2021-02-07T08:54:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.