MAD: Multi-Alignment MEG-to-Text Decoding
- URL: http://arxiv.org/abs/2406.01512v1
- Date: Mon, 3 Jun 2024 16:43:10 GMT
- Title: MAD: Multi-Alignment MEG-to-Text Decoding
- Authors: Yiqian Yang, Hyejeong Jo, Yiqun Duan, Qiang Zhang, Jinni Zhou, Won Hee Lee, Renjing Xu, Hui Xiong
- Abstract summary: We present a novel approach for translating MEG signals into text using a speech-decoding framework with multiple alignments.
We achieve a BLEU-1 score of 10.44 on the $\textit{GWilliams}$ dataset, significantly outperforming the baseline of 5.49.
- Score: 21.155031900491654
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deciphering language from brain activity is a crucial task in brain-computer interface (BCI) research. Non-invasive cerebral signaling techniques, including electroencephalography (EEG) and magnetoencephalography (MEG), are becoming increasingly popular due to their safety and practicality, avoiding invasive electrode implantation. However, current work leaves three points under-investigated: 1) a predominant focus on EEG with limited exploration of MEG, which provides superior signal quality; 2) poor performance on unseen text, indicating the need for models that generalize better to diverse linguistic contexts; 3) insufficient integration of information from other modalities, which could constrain our capacity to comprehensively understand the intricate dynamics of brain activity. This study presents a novel approach for translating MEG signals into text using a speech-decoding framework with multiple alignments. Our method is the first to introduce an end-to-end multi-alignment framework for generating totally unseen text directly from MEG signals. We achieve a BLEU-1 score of 10.44 on the $\textit{GWilliams}$ dataset, significantly outperforming the baseline of 5.49. This improvement demonstrates the model's progress towards real-world applications and underscores its potential in advancing BCI research. Code is available at $\href{https://github.com/NeuSpeech/MAD-MEG2text}{https://github.com/NeuSpeech/MAD-MEG2text}$.
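The abstract does not spell out the alignment losses. A common recipe for aligning one modality against several others is one contrastive (InfoNCE-style) term per pairing; the following is a minimal sketch under that assumption, with hypothetical encoder outputs and illustrative shapes rather than the authors' implementation:

```python
import torch
import torch.nn.functional as F

def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss: matched rows of a and b are positives."""
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature          # (B, B) cosine similarities
    targets = torch.arange(a.size(0))         # positives lie on the diagonal
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

# Toy batch: 8 trials with 128-dim embeddings from hypothetical encoders.
meg_emb    = torch.randn(8, 128, requires_grad=True)  # MEG encoder output
speech_emb = torch.randn(8, 128)                      # frozen speech-model features
text_emb   = torch.randn(8, 128)                      # frozen text-encoder features

# Multiple alignments summed into one objective.
loss = info_nce(meg_emb, speech_emb) + info_nce(meg_emb, text_emb)
loss.backward()
```

In practice the speech and text embeddings would come from frozen pretrained models while the MEG encoder trains jointly with a text decoder; the sketch only shows how several alignment terms can be combined.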
Related papers
- NeuGPT: Unified multi-modal Neural GPT [48.70587003475798]
NeuGPT is a groundbreaking multi-modal language generation model designed to harmonize the fragmented landscape of neural recording research.
Our model mainly focuses on brain-to-text decoding, improving the SOTA from 6.94 to 12.92 on BLEU-1 and from 6.93 to 13.06 on ROUGE-1F.
It can also simulate brain signals, thereby serving as a novel neural interface.
arXiv Detail & Related papers (2024-10-28T10:53:22Z)
- BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation [29.78480739360263]
We propose a new multi-stage strategy for semantic brain signal decoding via vEctor-quantized speCtrogram reconstruction.
BrainECHO successively conducts: 1) autoencoding of the audio spectrogram; 2) brain-audio latent space alignment; and 3) semantic text generation via Whisper finetuning.
BrainECHO outperforms state-of-the-art methods under the same data split settings on two widely accepted resources.
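A minimal sketch of how the three stages could compose, with illustrative channel counts and plain MSE losses standing in for the paper's vector quantization (none of this is the authors' code):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

spec  = torch.randn(4, 80, 100)    # toy mel spectrograms: (batch, mels, frames)
brain = torch.randn(4, 208, 100)   # toy MEG/EEG signals: (batch, channels, frames)

# Stage 1: autoencode the audio spectrogram (vector quantization omitted here).
audio_enc = nn.Conv1d(80, 64, kernel_size=3, padding=1)
audio_dec = nn.Conv1d(64, 80, kernel_size=3, padding=1)
z_audio = audio_enc(spec)
recon_loss = F.mse_loss(audio_dec(z_audio), spec)

# Stage 2: align brain latents to the frozen audio latents.
brain_enc = nn.Conv1d(208, 64, kernel_size=3, padding=1)
z_brain = brain_enc(brain)
align_loss = F.mse_loss(z_brain, z_audio.detach())

# Stage 3 (not shown): pass z_brain to a finetuned Whisper decoder in place of
# its audio features to generate text.
print(recon_loss.item(), align_loss.item())
```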
arXiv Detail & Related papers (2024-10-19T04:29:03Z)
- BELT-2: Bootstrapping EEG-to-Language representation alignment for multi-task brain decoding [24.54436986074267]
We introduce BELT-2, a pioneering multi-task model designed to enhance both encoding and decoding performance from EEG signals.
BELT-2 is the first work to 1) adopt byte-pair encoding (BPE)-level EEG-language alignment and 2) integrate multi-task training and decoding in the EEG domain.
These efforts make BELT-2 the first work capable of decoding coherent and readable sentences from non-invasive brain signals.
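The summary does not give the alignment loss itself. One plausible reading of BPE-level alignment is matching each EEG segment embedding against the embedding of its paired BPE token; a toy contrastive version with hypothetical shapes:

```python
import torch
import torch.nn.functional as F

vocab_size, dim = 50257, 256                 # GPT-2-sized BPE vocabulary
bpe_embed = torch.nn.Embedding(vocab_size, dim)

eeg_tokens = torch.randn(16, dim, requires_grad=True)  # one embedding per EEG segment
bpe_ids    = torch.randint(0, vocab_size, (16,))       # paired BPE token ids

text_tokens = bpe_embed(bpe_ids)
sim = F.normalize(eeg_tokens, dim=-1) @ F.normalize(text_tokens, dim=-1).t()
loss = F.cross_entropy(sim / 0.07, torch.arange(16))   # diagonal pairs are positives
loss.backward()
```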
arXiv Detail & Related papers (2024-08-28T12:30:22Z)
- NeuSpeech: Decode Neural signal as Speech [21.155031900491654]
We explore the brain-to-text translation of MEG signals in a speech-decoding formulation.
Our model achieves impressive BLEU-1 scores of 60.30 and 52.89 without pretraining and teacher-forcing.
arXiv Detail & Related papers (2024-03-04T05:55:01Z)
- Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder [69.7813498468116]
We propose Contrastive EEG-Text Masked Autoencoder (CET-MAE), a novel model that orchestrates compound self-supervised learning across and within EEG and text.
We also develop a framework called E2T-PTR (EEG-to-Text decoding using Pretrained Transferable Representations) to decode text from EEG sequences.
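A rough sketch of the CET-MAE idea as summarized above, combining masked reconstruction of EEG patches with a contrastive term against pooled text embeddings (the shapes and the 50% mask ratio are illustrative assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

eeg  = torch.randn(8, 50, 128)    # (batch, EEG patches, dim)
text = torch.randn(8, 128)        # pooled text embeddings, e.g. from a frozen LM

mask = torch.rand(8, 50) < 0.5    # randomly mask half of the EEG patches
encoder = nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True)
decoder = nn.Linear(128, 128)

visible = eeg * (~mask).unsqueeze(-1).float()   # zero out the masked patches
recon = decoder(encoder(visible))
mae_loss = F.mse_loss(recon[mask], eeg[mask])   # reconstruct masked patches only

pooled = recon.mean(dim=1)                      # pool EEG for the contrastive term
sim = F.normalize(pooled, dim=-1) @ F.normalize(text, dim=-1).t()
contrastive_loss = F.cross_entropy(sim / 0.07, torch.arange(8))
(mae_loss + contrastive_loss).backward()
```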
arXiv Detail & Related papers (2024-02-27T11:45:21Z)
- BELT: Bootstrapping Electroencephalography-to-Language Decoding and Zero-Shot Sentiment Classification by Natural Language Supervision [31.382825932199935]
The proposed BELT method is a generic and efficient framework that bootstraps EEG representation learning.
BELT leverages large LMs trained on Internet-scale datasets, exploiting their capacity for semantic understanding and zero-shot generalization.
We achieve state-of-the-art results on two featured brain decoding tasks: brain-to-language translation and zero-shot sentiment classification.
arXiv Detail & Related papers (2023-09-21T13:24:01Z)
- Probing Brain Context-Sensitivity with Masked-Attention Generation [87.31930367845125]
We use GPT-2 transformers to generate word embeddings that capture a fixed amount of contextual information.
We then test whether these embeddings can predict fMRI brain activity in humans listening to naturalistic text.
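The paper constrains context by masking attention; the sketch below approximates that with simple input truncation, which is not the same mechanism but conveys the idea of embeddings that see a fixed number of preceding words (the window size k is an arbitrary choice):

```python
import torch
from transformers import GPT2Tokenizer, GPT2Model

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2").eval()

words = "the quick brown fox jumps over the lazy dog".split()
k = 3  # each word sees at most k preceding words of context

embeddings = []
with torch.no_grad():
    for i in range(len(words)):
        window = " ".join(words[max(0, i - k):i + 1])   # word plus k predecessors
        ids = tok(window, return_tensors="pt").input_ids
        hidden = model(ids).last_hidden_state            # (1, seq_len, 768)
        embeddings.append(hidden[0, -1])                 # state of the final token

print(torch.stack(embeddings).shape)                     # (9, 768)
```

Each extracted embedding would then typically enter a linear encoding model as a regressor against the fMRI responses.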
arXiv Detail & Related papers (2023-05-23T09:36:21Z)
- Deep Bidirectional Language-Knowledge Graph Pretraining [159.9645181522436]
DRAGON is a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale.
Our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities.
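DRAGON's actual fusion layers are not detailed in the summary; generic bidirectional cross-attention between text token states and KG node states, sketched below, is only one plausible stand-in for "bidirectionally fuses information from both modalities" (all shapes are made up):

```python
import torch
import torch.nn as nn

text = torch.randn(2, 32, 256)   # (batch, text tokens, dim)
kg   = torch.randn(2, 10, 256)   # (batch, KG subgraph nodes, dim)

text_to_kg = nn.MultiheadAttention(256, num_heads=4, batch_first=True)
kg_to_text = nn.MultiheadAttention(256, num_heads=4, batch_first=True)

fused_text, _ = text_to_kg(text, kg, kg)   # text queries attend over KG nodes
fused_kg, _   = kg_to_text(kg, text, text) # KG node queries attend over text
print(fused_text.shape, fused_kg.shape)    # (2, 32, 256), (2, 10, 256)
```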
arXiv Detail & Related papers (2022-10-17T18:02:52Z)
- Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification [78.120927891455]
State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks.
In this paper, we extend the problem to open-vocabulary electroencephalography (EEG)-to-text sequence-to-sequence decoding and zero-shot sentence sentiment classification on natural reading tasks.
Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines.
arXiv Detail & Related papers (2021-12-05T21:57:22Z)
- Explicit Alignment Objectives for Multilingual Bidirectional Encoders [111.65322283420805]
We present a new method for learning multilingual encoders, AMBER (Aligned Multilingual Bi-directional EncodeR).
AMBER is trained on additional parallel data using two explicit alignment objectives that align the multilingual representations at different granularities.
Experimental results show that AMBER obtains gains of up to 1.1 average F1 score on sequence tagging and up to 27.3 average accuracy on retrieval over the XLMR-large model.
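A toy rendering of alignment at two granularities on parallel data: a sentence-level contrastive term over pooled pair embeddings, plus a word-level term encouraging forward and backward soft alignments to agree. Both losses are simplified assumptions, not AMBER's exact objectives:

```python
import torch
import torch.nn.functional as F

src = torch.randn(4, 12, 256, requires_grad=True)  # source-language token states
tgt = torch.randn(4, 12, 256, requires_grad=True)  # parallel target token states

# Sentence granularity: pooled embeddings of translation pairs should match.
s, t = src.mean(dim=1), tgt.mean(dim=1)
sim = F.normalize(s, dim=-1) @ F.normalize(t, dim=-1).t()
sentence_loss = F.cross_entropy(sim / 0.07, torch.arange(4))

# Word granularity: forward and backward soft alignments should agree.
att_fwd = torch.softmax(src @ tgt.transpose(1, 2), dim=-1)   # (batch, 12, 12)
att_bwd = torch.softmax(tgt @ src.transpose(1, 2), dim=-1)
word_loss = F.mse_loss(att_fwd, att_bwd.transpose(1, 2))

(sentence_loss + word_loss).backward()
```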
arXiv Detail & Related papers (2020-10-15T18:34:13Z)