Related papers: On Creating A Brain-To-Text Decoder

On Creating A Brain-To-Text Decoder

URL: http://arxiv.org/abs/2501.06326v2
Date: Sun, 02 Feb 2025 21:32:41 GMT
Title: On Creating A Brain-To-Text Decoder
Authors: Zenon Lamprou, Yashar Moshfeghi,
Abstract summary: This paper centers on the application of raw electroencephalogram (EEG) signals for decoding human brain activity.<n>The investigation specifically scrutinizes the efficacy of brain-computer interfaces (BCI) in deciphering neural signals associated with speech production.
Score: 6.084958172018792
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Brain decoding has emerged as a rapidly advancing and extensively utilized technique within neuroscience. This paper centers on the application of raw electroencephalogram (EEG) signals for decoding human brain activity, offering a more expedited and efficient methodology for enhancing our understanding of the human brain. The investigation specifically scrutinizes the efficacy of brain-computer interfaces (BCI) in deciphering neural signals associated with speech production, with particular emphasis on the impact of vocabulary size, electrode density, and training data on the framework's performance. The study reveals the competitive word error rates (WERs) achievable on the Librispeech benchmark through pre-training on unlabelled data for speech processing. Furthermore, the study evaluates the efficacy of voice recognition under configurations with limited labeled data, surpassing previous state-of-the-art techniques while utilizing significantly fewer labels. Additionally, the research provides a comprehensive analysis of error patterns in voice recognition and the influence of model size and unlabelled training data. It underscores the significance of factors such as vocabulary size and electrode density in enhancing BCI performance, advocating for an increase in microelectrodes and refinement of language models.

Related papers

BrainStratify: Coarse-to-Fine Disentanglement of Intracranial Neural Dynamics [8.36470471250669]
Decoding speech directly from neural activity is a central goal in brain-computer interface (BCI) research.<n>In recent years, exciting advances have been made through the growing use of intracranial field potential recordings, such as stereo-ElectroEncephaloGraphy (sEEG) and ElectroCorticoGraphy (ECoG)<n>These neural signals capture rich population-level activity but present key challenges: (i) task-relevant neural signals are sparsely distributed across sEEG electrodes, and (ii) they are often entangled with task-irrelevant neural signals in both sEEG and ECo
arXiv Detail & Related papers (2025-05-26T19:36:39Z)
Decoding Phone Pairs from MEG Signals Across Speech Modalities [0.4054486015338004]
We investigated magnetoencephalography signals to decode phones from brain activity during speech production and perception tasks.<n>Our results demonstrate significantly higher decoding accuracy during speech production compared to passive listening and playback modalities.
arXiv Detail & Related papers (2025-05-21T10:31:34Z)
Learning Interpretable Representations Leads to Semantically Faithful EEG-to-Text Generation [52.51005875755718]
We focus on EEG-to-text decoding and address its hallucination issue through the lens of posterior collapse.<n>Acknowledging the underlying mismatch in information capacity between EEG and text, we reframe the decoding task as semantic summarization of core meanings.<n>Experiments on the public ZuCo dataset demonstrate that GLIM consistently generates fluent, EEG-grounded sentences.
arXiv Detail & Related papers (2025-05-21T05:29:55Z)
sEEG-based Encoding for Sentence Retrieval: A Contrastive Learning Approach to Brain-Language Alignment [8.466223794246261]
We present SSENSE, a contrastive learning framework that projects single-subject stereo-electroencephalography (sEEG) signals into the sentence embedding space of a frozen CLIP model. We evaluate our method on time-aligned sEEG and spoken transcripts from a naturalistic movie-watching dataset.
arXiv Detail & Related papers (2025-04-20T03:01:42Z)
Bridging Brain Signals and Language: A Deep Learning Approach to EEG-to-Text Decoding [1.1655046053160683]
We introduce a special framework which changes conventional closed-vocabulary EEG-to-text decoding approaches. This research aims to create a connection between open-vocabulary Text generation systems and human brain signal interpretation.
arXiv Detail & Related papers (2025-02-11T14:43:14Z)
BrainECHO: Semantic Brain Signal Decoding through Vector-Quantized Spectrogram Reconstruction for Whisper-Enhanced Text Generation [29.78480739360263]
We propose a new multi-stage strategy for semantic brain signal decoding via vEctor-quantized speCtrogram reconstruction. BrainECHO successively conducts: 1) autoencoding of the audio spectrogram; 2) Brain-audio latent space alignment; and 3) Semantic text generation via Whisper finetuning. BrainECHO outperforms state-of-the-art methods under the same data split settings on two widely accepted resources.
arXiv Detail & Related papers (2024-10-19T04:29:03Z)
Towards Linguistic Neural Representation Learning and Sentence Retrieval from Electroencephalogram Recordings [27.418738450536047]
We propose a two-step pipeline for converting EEG signals into sentences. We first confirm that word-level semantic information can be learned from EEG data recorded during natural reading. We employ a training-free retrieval method to retrieve sentences based on the predictions from the EEG encoder.
arXiv Detail & Related papers (2024-08-08T03:40:25Z)
Cross-Subject Data Splitting for Brain-to-Text Decoding [36.30024741795527]
We propose a cross-subject data splitting criterion for brain-to-text decoding on various types of cognitive dataset (fMRI, EEG) We undertake a comprehensive analysis on existing cross-subject data splitting strategies and prove that all these methods suffer from data leakage. The proposed cross-subject splitting method successfully addresses the data leakage problem and we re-evaluate some SOTA brain-to-text decoding models as baselines for further research.
arXiv Detail & Related papers (2023-12-18T07:22:39Z)
Language Generation from Brain Recordings [68.97414452707103]
We propose a generative language BCI that utilizes the capacity of a large language model and a semantic brain decoder. The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli. Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
arXiv Detail & Related papers (2023-11-16T13:37:21Z)
CausalDialogue: Modeling Utterance-level Causality in Conversations [83.03604651485327]
We have compiled and expanded upon a new dataset called CausalDialogue through crowd-sourcing. This dataset includes multiple cause-effect pairs within a directed acyclic graph (DAG) structure. We propose a causality-enhanced method called Exponential Average Treatment Effect (ExMATE) to enhance the impact of causality at the utterance level in training neural conversation models.
arXiv Detail & Related papers (2022-12-20T18:31:50Z)
Decoding speech perception from non-invasive brain recordings [48.46819575538446]
We introduce a model trained with contrastive-learning to decode self-supervised representations of perceived speech from non-invasive recordings. Our model can identify, from 3 seconds of MEG signals, the corresponding speech segment with up to 41% accuracy out of more than 1,000 distinct possibilities.
arXiv Detail & Related papers (2022-08-25T10:01:43Z)
Neural Language Models are not Born Equal to Fit Brain Data, but Training Helps [75.84770193489639]
We examine the impact of test loss, training corpus and model architecture on the prediction of functional Magnetic Resonance Imaging timecourses of participants listening to an audiobook. We find that untrained versions of each model already explain significant amount of signal in the brain by capturing similarity in brain responses across identical words. We suggest good practices for future studies aiming at explaining the human language system using neural language models.
arXiv Detail & Related papers (2022-07-07T15:37:17Z)
Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification [78.120927891455]
State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks. In this paper, we extend the problem to open vocabulary Electroencephalography(EEG)-To-Text Sequence-To-Sequence decoding and zero-shot sentence sentiment classification on natural reading tasks. Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines.
arXiv Detail & Related papers (2021-12-05T21:57:22Z)
CogAlign: Learning to Align Textual Neural Representations to Cognitive Language Processing Signals [60.921888445317705]
We propose a CogAlign approach to integrate cognitive language processing signals into natural language processing models. We show that CogAlign achieves significant improvements with multiple cognitive features over state-of-the-art models on public datasets.
arXiv Detail & Related papers (2021-06-10T07:10:25Z)
Emotional EEG Classification using Connectivity Features and Convolutional Neural Networks [81.74442855155843]
We introduce a new classification system that utilizes brain connectivity with a CNN and validate its effectiveness via the emotional video classification. The level of concentration of the brain connectivity related to the emotional property of the target video is correlated with classification performance.
arXiv Detail & Related papers (2021-01-18T13:28:08Z)

This list is automatically generated from the titles and abstracts of the papers in this site.