Decoding visual brain representations from electroencephalography
through Knowledge Distillation and latent diffusion models
- URL: http://arxiv.org/abs/2309.07149v1
- Date: Fri, 8 Sep 2023 09:13:50 GMT
- Title: Decoding visual brain representations from electroencephalography
through Knowledge Distillation and latent diffusion models
- Authors: Matteo Ferrante, Tommaso Boccato, Stefano Bargione, Nicola Toschi
- Abstract summary: We present an innovative method that employs to classify and reconstruct images from the ImageNet dataset using electroencephalography (EEG) data.
We analyzed EEG recordings from 6 participants, each exposed to 50 images spanning 40 unique semantic categories.
We incorporated an image reconstruction mechanism based on pre-trained latent diffusion models, which allowed us to generate an estimate of the images which had elicited EEG activity.
- Score: 0.12289361708127873
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Decoding visual representations from human brain activity has emerged as a
thriving research domain, particularly in the context of brain-computer
interfaces. Our study presents an innovative method that employs to classify
and reconstruct images from the ImageNet dataset using electroencephalography
(EEG) data from subjects that had viewed the images themselves (i.e. "brain
decoding"). We analyzed EEG recordings from 6 participants, each exposed to 50
images spanning 40 unique semantic categories. These EEG readings were
converted into spectrograms, which were then used to train a convolutional
neural network (CNN), integrated with a knowledge distillation procedure based
on a pre-trained Contrastive Language-Image Pre-Training (CLIP)-based image
classification teacher network. This strategy allowed our model to attain a
top-5 accuracy of 80%, significantly outperforming a standard CNN and various
RNN-based benchmarks. Additionally, we incorporated an image reconstruction
mechanism based on pre-trained latent diffusion models, which allowed us to
generate an estimate of the images which had elicited EEG activity. Therefore,
our architecture not only decodes images from neural activity but also offers a
credible image reconstruction from EEG only, paving the way for e.g. swift,
individualized feedback experiments. Our research represents a significant step
forward in connecting neural signals with visual cognition.
Related papers
- Mind's Eye: Image Recognition by EEG via Multimodal Similarity-Keeping Contrastive Learning [2.087148326341881]
This paper introduces a MUltimodal Similarity-keeping contrastivE learning framework for zero-shot EEG-based image classification.
We develop a series of multivariate time-series encoders tailored for EEG signals and assess the efficacy of regularized contrastive EEG-Image pretraining.
Our method achieves state-of-the-art performance, with a top-1 accuracy of 19.3% and a top-5 accuracy of 48.8% in 200-way zero-shot image classification.
arXiv Detail & Related papers (2024-06-05T16:42:23Z) - Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models [49.3179290313959]
The proposed method, emotion-centered generative replay (ECgr), tackles this challenge by integrating synthetic images from generative adversarial networks.
ECgr incorporates a quality assurance algorithm to ensure the fidelity of generated images.
The experimental results on four diverse facial expression datasets demonstrate that incorporating images generated by our pseudo-rehearsal method enhances training on the targeted dataset and the source dataset.
arXiv Detail & Related papers (2024-04-18T15:28:34Z) - Learning Robust Deep Visual Representations from EEG Brain Recordings [13.768240137063428]
This study proposes a two-stage method where the first step is to obtain EEG-derived features for robust learning of deep representations.
We demonstrate the generalizability of our feature extraction pipeline across three different datasets using deep-learning architectures.
We propose a novel framework to transform unseen images into the EEG space and reconstruct them with approximation.
arXiv Detail & Related papers (2023-10-25T10:26:07Z) - Disruptive Autoencoders: Leveraging Low-level features for 3D Medical
Image Pre-training [51.16994853817024]
This work focuses on designing an effective pre-training framework for 3D radiology images.
We introduce Disruptive Autoencoders, a pre-training framework that attempts to reconstruct the original image from disruptions created by a combination of local masking and low-level perturbations.
The proposed pre-training framework is tested across multiple downstream tasks and achieves state-of-the-art performance.
arXiv Detail & Related papers (2023-07-31T17:59:42Z) - Seeing through the Brain: Image Reconstruction of Visual Perception from
Human Brain Signals [27.92796103924193]
We propose a comprehensive pipeline, named NeuroImagen, for reconstructing visual stimuli images from EEG signals.
We incorporate a novel multi-level perceptual information decoding to draw multi-grained outputs from the given EEG data.
arXiv Detail & Related papers (2023-07-27T12:54:16Z) - Controllable Mind Visual Diffusion Model [58.83896307930354]
Brain signal visualization has emerged as an active research area, serving as a critical interface between the human visual system and computer vision models.
We propose a novel approach, referred to as Controllable Mind Visual Model Diffusion (CMVDM)
CMVDM extracts semantic and silhouette information from fMRI data using attribute alignment and assistant networks.
We then leverage a control model to fully exploit the extracted information for image synthesis, resulting in generated images that closely resemble the visual stimuli in terms of semantics and silhouette.
arXiv Detail & Related papers (2023-05-17T11:36:40Z) - Brain informed transfer learning for categorizing construction hazards [0.0]
This work is a step toward improving machine learning algorithms by learning from human-brain signals recorded via a commercially available brain-computer interface.
More generalized visual recognition systems can be effectively developed based on this approach of "keep human in the loop"
arXiv Detail & Related papers (2022-11-17T19:41:04Z) - Adapting Brain-Like Neural Networks for Modeling Cortical Visual
Prostheses [68.96380145211093]
Cortical prostheses are devices implanted in the visual cortex that attempt to restore lost vision by electrically stimulating neurons.
Currently, the vision provided by these devices is limited, and accurately predicting the visual percepts resulting from stimulation is an open challenge.
We propose to address this challenge by utilizing 'brain-like' convolutional neural networks (CNNs), which have emerged as promising models of the visual system.
arXiv Detail & Related papers (2022-09-27T17:33:19Z) - NAS-DIP: Learning Deep Image Prior with Neural Architecture Search [65.79109790446257]
Recent work has shown that the structure of deep convolutional neural networks can be used as a structured image prior.
We propose to search for neural architectures that capture stronger image priors.
We search for an improved network by leveraging an existing neural architecture search algorithm.
arXiv Detail & Related papers (2020-08-26T17:59:36Z) - Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.