Reconstructing Unseen Sentences from Speech-related Biosignals for Open-vocabulary Neural Communication
- URL: http://arxiv.org/abs/2510.27247v1
- Date: Fri, 31 Oct 2025 07:31:13 GMT
- Title: Reconstructing Unseen Sentences from Speech-related Biosignals for Open-vocabulary Neural Communication
- Authors: Deok-Seon Kim, Seo-Hyun Lee, Kang Yin, Seong-Whan Lee
- Abstract summary: This study investigates the potential of speech synthesis for previously unseen sentences across various speech modes. We leverage phoneme-level information extracted from high-density electroencephalography (EEG) signals, both independently and in conjunction with electromyography (EMG) signals. Our findings underscore the feasibility of biosignal-based sentence-level speech synthesis for reconstructing unseen sentences.
- Score: 45.424817836500175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Brain-to-speech (BTS) systems represent a groundbreaking approach to human communication by enabling the direct transformation of neural activity into linguistic expressions. While recent non-invasive BTS studies have largely focused on decoding predefined words or sentences, achieving open-vocabulary neural communication comparable to natural human interaction requires decoding unconstrained speech. Additionally, effectively integrating diverse signals derived from speech is crucial for developing personalized and adaptive neural communication and rehabilitation solutions for patients. This study investigates the potential of speech synthesis for previously unseen sentences across various speech modes by leveraging phoneme-level information extracted from high-density electroencephalography (EEG) signals, both independently and in conjunction with electromyography (EMG) signals. Furthermore, we examine the properties affecting phoneme decoding accuracy during sentence reconstruction and offer neurophysiological insights to further enhance EEG decoding for more effective neural communication solutions. Our findings underscore the feasibility of biosignal-based sentence-level speech synthesis for reconstructing unseen sentences, highlighting a significant step toward developing open-vocabulary neural communication systems adapted to diverse patient needs and conditions. Additionally, this study provides meaningful insights into the development of communication and rehabilitation solutions utilizing EEG-based decoding technologies.
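The abstract describes a two-stage idea: phoneme-level decoding from EEG/EMG features, followed by sentence-level synthesis. The paper's own decoder is not specified here, so the following is a minimal illustrative sketch assuming one common design, per-frame phoneme classification collapsed CTC-style into a phoneme sequence that a downstream synthesizer could voice; the function name and toy phoneme inventory are hypothetical.

```python
# Hypothetical sketch of the decoding stage: per-frame phoneme
# classification over EEG (or EEG+EMG) features, collapsed CTC-style
# into a phoneme sequence. Assumption: a CTC-like framewise decoder;
# the paper does not publish its architecture in this abstract.

def greedy_ctc_collapse(frame_ids, blank=0):
    """Collapse per-frame predictions: merge repeats, then drop blanks."""
    out, prev = [], None
    for fid in frame_ids:
        if fid != prev and fid != blank:
            out.append(fid)
        prev = fid
    return out

# Toy inventory: 0 = blank, 1 = /s/, 2 = /i/, 3 = /t/
frames = [0, 1, 1, 0, 2, 2, 2, 0, 3, 3]   # per-frame argmax predictions
phonemes = greedy_ctc_collapse(frames)     # → [1, 2, 3], i.e. /s i t/
```

The collapsed phoneme sequence would then drive a vocoder or concatenative synthesizer to produce audible speech for sentences never seen during training.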
Related papers
- Neural Decoding of Overt Speech from ECoG Using Vision Transformers and Contrastive Representation Learning [1.58476321728042]
Speech brain-computer interfaces offer promising solutions for people with severe paralysis who are unable to communicate. Recent studies have demonstrated convincing reconstruction of intelligible speech from surface electrocorticographic (ECoG) or intracortical recordings. We present an offline speech decoding pipeline based on an encoder-decoder deep neural architecture, integrating Vision Transformers and contrastive learning.
arXiv Detail & Related papers (2025-12-04T09:47:15Z)
- WaveMind: Towards a Conversational EEG Foundation Model Aligned to Textual and Visual Modalities [55.00677513249723]
EEG signals simultaneously encode both cognitive processes and intrinsic neural states. We map EEG signals and their corresponding modalities into a unified semantic space to achieve generalized interpretation. The resulting model demonstrates robust classification accuracy while supporting flexible, open-ended conversations.
arXiv Detail & Related papers (2025-09-26T06:21:51Z)
- Towards Inclusive Communication: A Unified Framework for Generating Spoken Language from Sign, Lip, and Audio [52.859261069569165]
We propose the first unified framework capable of handling diverse combinations of sign language, lip movements, and audio for spoken-language text generation. We focus on three main objectives: (i) designing a unified, modality-agnostic architecture capable of effectively processing heterogeneous inputs; (ii) exploring the underexamined synergy among modalities, particularly the role of lip movements as non-manual cues in sign language comprehension; and (iii) achieving performance on par with or better than state-of-the-art models specialized for individual tasks.
arXiv Detail & Related papers (2025-08-28T06:51:42Z)
- sEEG-based Encoding for Sentence Retrieval: A Contrastive Learning Approach to Brain-Language Alignment [8.466223794246261]
We present SSENSE, a contrastive learning framework that projects single-subject stereo-electroencephalography (sEEG) signals into the sentence embedding space of a frozen CLIP model. We evaluate our method on time-aligned sEEG and spoken transcripts from a naturalistic movie-watching dataset.
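The contrastive alignment such a framework typically relies on is a symmetric InfoNCE objective, as popularized by CLIP: paired signal and sentence embeddings are pulled together while mismatched pairs in the batch are pushed apart. The abstract does not state SSENSE's exact loss, so the NumPy sketch below is an assumption for illustration only.

```python
import numpy as np

def info_nce_loss(z_sig, z_txt, temperature=0.07):
    """Symmetric InfoNCE: row i of z_sig is the positive for row i of z_txt.
    A hedged sketch of a CLIP-style objective, not SSENSE's published loss."""
    z_sig = z_sig / np.linalg.norm(z_sig, axis=1, keepdims=True)
    z_txt = z_txt / np.linalg.norm(z_txt, axis=1, keepdims=True)
    logits = z_sig @ z_txt.T / temperature        # (B, B) cosine similarities
    idx = np.arange(len(z_sig))                   # positives on the diagonal

    def xent(l):
        l = l - l.max(axis=1, keepdims=True)      # stabilize the softmax
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[idx, idx].mean()

    return (xent(logits) + xent(logits.T)) / 2    # signal→text and text→signal

# Matched pairs should score much lower than mismatched ones:
z = np.eye(4)                                     # four toy unit embeddings
aligned = info_nce_loss(z, z)
mismatched = info_nce_loss(z, z[::-1])            # shuffled pairing
```

Because the text encoder is frozen, only the sEEG projection head would be trained against this objective, which keeps retrieval in the CLIP sentence space.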
arXiv Detail & Related papers (2025-04-20T03:01:42Z)
- Bridging Brain Signals and Language: A Deep Learning Approach to EEG-to-Text Decoding [1.1655046053160683]
We introduce a framework that moves beyond conventional closed-vocabulary EEG-to-text decoding approaches. This research aims to create a connection between open-vocabulary text generation systems and human brain signal interpretation.
arXiv Detail & Related papers (2025-02-11T14:43:14Z)
- Towards Dynamic Neural Communication and Speech Neuroprosthesis Based on Viseme Decoding [25.555303640695577]
Decoding text, speech, or images from human neural signals holds promising potential both as a neuroprosthesis for patients and as an innovative communication tool. We developed a diffusion model-based framework to decode visual speech intentions from speech-related non-invasive brain signals. We successfully reconstructed coherent lip movements, effectively bridging the gap between brain signals and dynamic visual interfaces.
arXiv Detail & Related papers (2025-01-09T04:47:27Z)
- Geometry of orofacial neuromuscular signals: speech articulation decoding using surface electromyography [0.0]
We present data and methods for decoding speech articulations using surface electromyogram (EMG) signals. EMG-based speech neuroprostheses offer a promising approach for restoring audible speech in individuals who have lost the ability to speak intelligibly.
arXiv Detail & Related papers (2024-11-04T20:31:22Z)
- Neural Speech Embeddings for Speech Synthesis Based on Deep Generative Networks [27.64740032872726]
We review current brain-to-speech technology and the possibility of speech synthesis from brain signals. We also perform a comprehensive analysis of the neural features and neural speech embeddings underlying neurophysiological activation during speech.
arXiv Detail & Related papers (2023-12-10T08:12:08Z)
- Language Generation from Brain Recordings [68.97414452707103]
We propose a generative language BCI that utilizes the capacity of a large language model and a semantic brain decoder.
The proposed model can generate coherent language sequences aligned with the semantic content of visual or auditory language stimuli.
Our findings demonstrate the potential and feasibility of employing BCIs in direct language generation.
arXiv Detail & Related papers (2023-11-16T13:37:21Z)
- Open Vocabulary Electroencephalography-To-Text Decoding and Zero-shot Sentiment Classification [78.120927891455]
State-of-the-art brain-to-text systems have achieved great success in decoding language directly from brain signals using neural networks.
In this paper, we extend the problem to open-vocabulary electroencephalography (EEG)-to-text sequence-to-sequence decoding and zero-shot sentence sentiment classification on natural reading tasks.
Our model achieves a 40.1% BLEU-1 score on EEG-To-Text decoding and a 55.6% F1 score on zero-shot EEG-based ternary sentiment classification, which significantly outperforms supervised baselines.
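For reference, BLEU-1 is clipped unigram precision scaled by a brevity penalty; the sketch below shows the sentence-level form of the metric only (the reported 40.1% is a corpus-level score under that paper's own evaluation setup).

```python
# Minimal sentence-level BLEU-1, for illustrating the metric cited above.
from collections import Counter
import math

def bleu1(candidate, reference):
    """Clipped unigram precision times the brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    ref_counts = Counter(ref)
    # Each candidate word is credited at most as often as it appears in the reference.
    clipped = sum(min(n, ref_counts[w]) for w, n in Counter(cand).items())
    precision = clipped / len(cand)
    # Brevity penalty discourages trivially short candidates.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

exact = bleu1("the cat sat on the mat", "the cat sat on the mat")  # 1.0
degenerate = bleu1("the the the", "the cat sat")                   # ≈ 0.333
```

Clipping is what keeps a degenerate repeated-word candidate from scoring highly, which matters when judging open-vocabulary decoders that can emit arbitrary text.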
arXiv Detail & Related papers (2021-12-05T21:57:22Z)
- Silent Speech Interfaces for Speech Restoration: A Review [59.68902463890532]
Silent speech interface (SSI) research aims to provide alternative and augmentative communication methods for persons with severe speech disorders.
SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication.
Most present-day SSIs have only been validated in laboratory settings for healthy users.
arXiv Detail & Related papers (2020-09-04T11:05:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.