Deep Representation Learning for Open Vocabulary
Electroencephalography-to-Text Decoding
- URL: http://arxiv.org/abs/2312.09430v1
- Date: Wed, 15 Nov 2023 08:03:09 GMT
- Title: Deep Representation Learning for Open Vocabulary
Electroencephalography-to-Text Decoding
- Authors: Hamza Amrani, Daniela Micucci, Paolo Napoletano
- Abstract summary: We present an end-to-end deep learning framework for non-invasive brain recordings that brings modern representational learning approaches to neuroscience.
Our model achieves a BLEU-1 score of 42.75%, a ROUGE-1-F of 33.28%, and a BERTScore-F of 53.86%, outperforming the previous state-of-the-art methods by 3.38%, 8.43%, and 6.31%, respectively.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Previous research has demonstrated the potential of using pre-trained
language models for decoding open vocabulary Electroencephalography (EEG)
signals captured through a non-invasive Brain-Computer Interface (BCI).
However, the impact of embedding EEG signals in the context of language models
and the effect of subjectivity remain unexplored, leading to uncertainty about
the best approach to enhance decoding performance. Additionally, current
evaluation metrics used to assess decoding effectiveness are predominantly
syntactic and do not provide insights into the comprehensibility of the decoded
output for human understanding. We present an end-to-end deep learning
framework for non-invasive brain recordings that brings modern representational
learning approaches to neuroscience. Our proposal introduces the following
innovations: 1) an end-to-end deep learning architecture for open vocabulary
EEG decoding, incorporating a subject-dependent representation learning module
for raw EEG encoding, a BART language model, and a GPT-4 sentence refinement
module; 2) a more comprehensive sentence-level evaluation metric based on the
BERTScore; 3) an ablation study that analyses the contributions of each module
within our proposal, providing valuable insights for future research. We
evaluate our approach on two publicly available datasets, ZuCo v1.0 and v2.0,
comprising EEG recordings of 30 subjects engaged in natural reading tasks. Our
model achieves a BLEU-1 score of 42.75%, a ROUGE-1-F of 33.28%, and a
BERTScore-F of 53.86%, outperforming the previous state-of-the-art methods by
3.38%, 8.43%, and 6.31%, respectively.
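The three reported metrics are standard overlap scores. BERTScore requires a pretrained transformer, but the two n-gram metrics can be sketched in a few lines of plain Python. The whitespace tokenization below is a simplification, and the function names are illustrative rather than taken from the paper's code:

```python
from collections import Counter
import math

def bleu1(reference: str, candidate: str) -> float:
    """Unigram BLEU: clipped unigram precision times a brevity penalty."""
    ref, cand = reference.split(), candidate.split()
    ref_counts, cand_counts = Counter(ref), Counter(cand)
    # Clip each candidate unigram count by its count in the reference.
    clipped = sum(min(c, ref_counts[w]) for w, c in cand_counts.items())
    precision = clipped / max(len(cand), 1)
    # Brevity penalty discourages overly short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * precision

def rouge1_f(reference: str, candidate: str) -> float:
    """Unigram ROUGE F-measure: harmonic mean of overlap recall and precision."""
    ref, cand = Counter(reference.split()), Counter(candidate.split())
    overlap = sum(min(ref[w], cand[w]) for w in cand)
    if overlap == 0:
        return 0.0
    recall = overlap / sum(ref.values())
    precision = overlap / sum(cand.values())
    return 2 * recall * precision / (recall + precision)
```

In practice, published numbers come from reference implementations (e.g. NLTK for BLEU, the bert-score package for BERTScore), which add smoothing and proper tokenization on top of this idea.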
Related papers
- Thought2Text: Text Generation from EEG Signal using Large Language Models (LLMs) [4.720913027054481]
This paper presents Thought2Text, which fine-tunes instruction-tuned Large Language Models (LLMs) on EEG data to generate text directly from brain signals.
Experiments on a public EEG dataset collected for six subjects with image stimuli demonstrate the efficacy of multimodal LLMs.
This approach marks a significant advancement towards portable, low-cost "thoughts-to-text" technology with potential applications in both neuroscience and natural language processing (NLP).
arXiv Detail & Related papers (2024-10-10T00:47:59Z)
- Towards Linguistic Neural Representation Learning and Sentence Retrieval from Electroencephalogram Recordings [27.418738450536047]
We propose a two-step pipeline for converting EEG signals into sentences.
We first confirm that word-level semantic information can be learned from EEG data recorded during natural reading.
We employ a training-free retrieval method to retrieve sentences based on the predictions from the EEG encoder.
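The training-free retrieval step can be illustrated as a simple overlap ranking: score each candidate sentence by how many EEG-predicted words it contains and return the best match. This is a toy approximation, not the paper's actual method; the function and variable names are hypothetical:

```python
def retrieve_sentence(predicted_words, candidate_sentences):
    """Training-free retrieval: rank candidate sentences by their overlap
    with the set of words predicted from EEG, normalized by sentence length."""
    pred = set(predicted_words)

    def score(sentence):
        tokens = set(sentence.lower().split())
        return len(pred & tokens) / max(len(tokens), 1)

    return max(candidate_sentences, key=score)
```

A real system would compare dense word embeddings rather than exact string matches, but the ranking structure is the same.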
arXiv Detail & Related papers (2024-08-08T03:40:25Z)
- Review of Deep Representation Learning Techniques for Brain-Computer Interfaces and Recommendations [0.0]
This review synthesizes empirical findings from a collection of articles using deep representation learning techniques for BCI decoding.
Of the 81 articles reviewed in depth, autoencoders predominate, appearing in 31.
None of these have led to standard foundation models that are picked up by the BCI community.
arXiv Detail & Related papers (2024-05-17T14:00:11Z)
- Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder [69.7813498468116]
We propose Contrastive EEG-Text Masked Autoencoder (CET-MAE), a novel model that orchestrates compound self-supervised learning across and within EEG and text.
We also develop a framework called E2T-PTR (EEG-to-Text decoding using Pretrained Transferable Representations) to decode text from EEG sequences.
arXiv Detail & Related papers (2024-02-27T11:45:21Z)
- BELT: Bootstrapping Electroencephalography-to-Language Decoding and Zero-Shot Sentiment Classification by Natural Language Supervision [31.382825932199935]
The proposed BELT method is a generic and efficient framework that bootstraps EEG representation learning.
Leveraging large LMs' capacity for semantic understanding and zero-shot generalization, BELT utilizes models trained on Internet-scale datasets.
We achieve state-of-the-art results on two brain decoding tasks: brain-to-language translation and zero-shot sentiment classification.
arXiv Detail & Related papers (2023-09-21T13:24:01Z)
- Retrieval-based Disentangled Representation Learning with Natural Language Supervision [61.75109410513864]
We present Vocabulary Disentangled Retrieval (VDR), a retrieval-based framework that harnesses natural language as proxies of the underlying data variation to drive disentangled representation learning.
Our approach employs a bi-encoder model to represent both data and natural language in a vocabulary space, enabling the model to distinguish intrinsic dimensions that capture characteristics within data through their natural language counterparts, thus achieving disentanglement.
arXiv Detail & Related papers (2022-12-15T10:20:42Z)
- Learning to Decompose Visual Features with Latent Textual Prompts [140.2117637223449]
We propose Decomposed Feature Prompting (DeFo) to improve vision-language models.
Our empirical study shows DeFo's effectiveness in improving vision-language models.
arXiv Detail & Related papers (2022-10-09T15:40:13Z)
- Decoding speech perception from non-invasive brain recordings [48.46819575538446]
We introduce a model trained with contrastive-learning to decode self-supervised representations of perceived speech from non-invasive recordings.
Our model can identify, from 3 seconds of MEG signals, the corresponding speech segment with up to 41% accuracy out of more than 1,000 distinct possibilities.
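At inference time, the identification step of such a contrastive decoder reduces to nearest-neighbor retrieval: pick the candidate speech segment whose embedding is most similar (by cosine) to the brain-signal embedding. A minimal pure-Python sketch, with illustrative names not taken from the paper:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve(brain_emb, candidate_embs):
    """Return the index of the candidate segment embedding most
    similar to the brain-signal embedding (top-1 retrieval)."""
    return max(range(len(candidate_embs)),
               key=lambda i: cosine(brain_emb, candidate_embs[i]))
```

The reported 41% top-1 accuracy over 1,000+ candidates is exactly this retrieval evaluated against the true segment index, with the embeddings produced by the trained encoders.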
arXiv Detail & Related papers (2022-08-25T10:01:43Z)
- Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification [78.120927891455]
State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks.
In this paper, we extend the problem to open vocabulary Electroencephalography (EEG)-to-Text sequence-to-sequence decoding and zero-shot sentence sentiment classification on natural reading tasks.
Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines.
arXiv Detail & Related papers (2021-12-05T21:57:22Z)
- Model-based analysis of brain activity reveals the hierarchy of language in 305 subjects [82.81964713263483]
A popular approach to decomposing the neural bases of language consists of correlating, across individuals, the brain responses to different stimuli.
Here, we show that a model-based approach can reach equivalent results within subjects exposed to natural stimuli.
arXiv Detail & Related papers (2021-10-12T15:30:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.