The "Sound of Silence" in EEG -- Cognitive voice activity detection
- URL: http://arxiv.org/abs/2010.05497v1
- Date: Mon, 12 Oct 2020 07:47:36 GMT
- Title: The "Sound of Silence" in EEG -- Cognitive voice activity detection
- Authors: Rini A Sharon, Hema A Murthy
- Abstract summary: A "non-speech" (NS) state of brain activity corresponding to the silence regions of speech audio is studied.
Speech perception is studied to inspect the existence of such a state, followed by its identification in speech imagination.
The recognition performance and the visual distinction observed demonstrate the existence of silence signatures in EEG.
- Score: 22.196642357767338
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Speech cognition bears potential application as a brain-computer interface
that can improve the quality of life for otherwise communication-impaired
people. While speech and resting state EEG are popularly studied, here we
attempt to explore a "non-speech" (NS) state of brain activity corresponding to
the silence regions of speech audio. Firstly, speech perception is studied to
inspect the existence of such a state, followed by its identification in speech
imagination. Analogous to how voice activity detection is employed to enhance
the performance of speech recognition, the EEG state activity detection
protocol implemented here is applied to boost the confidence of imagined speech
EEG decoding. Classification of speech and NS states is performed on two
datasets collected with laboratory and commercial devices. The state-sequence
information thus obtained is further utilized to reduce the search
space of imagined EEG unit recognition. Temporal signal structures and
topographic maps of NS states are visualized across subjects and sessions. The
recognition performance and the visual distinction observed demonstrate the
existence of silence signatures in EEG.
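The detection-and-gating protocol described in the abstract can be made concrete with a short sketch. The snippet below is a minimal illustration, assuming a 128 Hz sampling rate, 0.5 s windows, per-channel log band-power features, and an off-the-shelf scikit-learn SVM; all names and parameter choices here are assumptions for illustration, not the authors' implementation.
```python
# Minimal sketch of EEG "cognitive voice activity detection": label
# fixed-length EEG windows as speech vs. non-speech (NS), train a
# classifier, then keep only speech-state windows so a downstream
# imagined-speech decoder searches a smaller space. All choices here
# (sampling rate, window length, band-power features, SVM) are
# illustrative assumptions, not the paper's setup.
import numpy as np
from sklearn.svm import SVC

FS = 128                 # assumed EEG sampling rate (Hz)
WIN = FS // 2            # 0.5 s non-overlapping analysis windows

def band_power_features(win_data, fs=FS):
    """Log power per channel in theta/alpha/beta/low-gamma bands."""
    spec = np.abs(np.fft.rfft(win_data, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(win_data.shape[-1], d=1.0 / fs)
    bands = [(4, 8), (8, 13), (13, 30), (30, 50)]
    return np.concatenate([
        np.log(spec[:, (freqs >= lo) & (freqs < hi)].mean(axis=-1) + 1e-12)
        for lo, hi in bands
    ])

def to_windows(eeg):
    """Split a (channels, samples) array into WIN-sample windows."""
    n = eeg.shape[1] // WIN
    return [eeg[:, i * WIN:(i + 1) * WIN] for i in range(n)]

def train_state_detector(eeg, state_labels):
    """state_labels: one 0 (NS) / 1 (speech) label per window, derived
    from the silence regions of the parallel speech audio."""
    X = np.stack([band_power_features(w) for w in to_windows(eeg)])
    return SVC(kernel="rbf").fit(X, state_labels)

def speech_windows(detector, eeg):
    """Keep only the windows the detector marks as speech-state."""
    wins = to_windows(eeg)
    X = np.stack([band_power_features(w) for w in wins])
    return [w for w, keep in zip(wins, detector.predict(X) == 1) if keep]

# Toy usage with synthetic data: 8 channels, 30 s train / 10 s test.
rng = np.random.default_rng(0)
train_eeg = rng.standard_normal((8, 30 * FS))
labels = rng.integers(0, 2, size=train_eeg.shape[1] // WIN)
detector = train_state_detector(train_eeg, labels)
test_eeg = rng.standard_normal((8, 10 * FS))
print(len(speech_windows(detector, test_eeg)), "speech-state windows kept")
```
In the same spirit as audio voice activity detection, the predicted state sequence simply masks out NS windows so that imagined-speech unit recognition only searches speech-state segments.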
Related papers
- Continuous Modeling of the Denoising Process for Speech Enhancement
Based on Deep Learning [61.787485727134424]
We use a state variable to indicate the stage of the denoising process.
A UNet-like neural network learns to estimate every state variable sampled from the continuous denoising process.
Experimental results indicate that preserving a small amount of noise in the clean target benefits speech enhancement.
arXiv Detail & Related papers (2023-09-17T13:27:11Z)
- Inner speech recognition through electroencephalographic signals [2.578242050187029]
This work focuses on inner speech recognition starting from EEG signals.
The decoding of the EEG into text should be understood as the classification of a limited number of words (commands).
Speech-related BCIs provide effective vocal communication strategies for controlling devices through speech commands interpreted from brain signals.
arXiv Detail & Related papers (2022-10-11T08:29:12Z)
- Direction-Aware Joint Adaptation of Neural Speech Enhancement and
Recognition in Real Multiparty Conversational Environments [21.493664174262737]
This paper describes noisy speech recognition for an augmented reality headset that helps verbal communication within real multiparty conversational environments.
We propose a semi-supervised adaptation method that jointly updates the mask estimator and the ASR model at run-time using clean speech signals with ground-truth transcriptions and noisy speech signals with highly-confident estimated transcriptions.
arXiv Detail & Related papers (2022-07-15T03:43:35Z)
- Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement
by Re-Synthesis [67.73554826428762]
We propose a novel audio-visual speech enhancement framework for high-fidelity telecommunications in AR/VR.
Our approach leverages audio-visual speech cues to generate the codes of a neural speech codec, enabling efficient synthesis of clean, realistic speech from noisy signals.
arXiv Detail & Related papers (2022-03-31T17:57:10Z)
- Learning Audio-Visual Dereverberation [87.52880019747435]
Reverberation from audio reflecting off surfaces and objects in the environment not only degrades the quality of speech for human perception, but also severely impacts the accuracy of automatic speech recognition.
Our idea is to learn to dereverberate speech from audio-visual observations.
We introduce Visually-Informed Dereverberation of Audio (VIDA), an end-to-end approach that learns to remove reverberation based on both the observed sounds and visual scene.
arXiv Detail & Related papers (2021-06-14T20:01:24Z)
- Brain Signals to Rescue Aphasia, Apraxia and Dysarthria Speech
Recognition [14.544989316741091]
We propose a deep learning-based algorithm to improve the performance of automatic speech recognition systems for aphasia, apraxia, and dysarthria speech.
We demonstrate a significant decoding performance improvement of more than 50% at test time for the isolated speech recognition task.
These results are a first step towards demonstrating that non-invasive neural signals can be used to design a real-time, robust speech prosthetic for stroke survivors recovering from aphasia, apraxia, and dysarthria.
arXiv Detail & Related papers (2021-02-28T03:27:02Z)
- Speech Enhancement for Wake-Up-Word detection in Voice Assistants [60.103753056973815]
Keyword spotting, and in particular Wake-Up-Word (WUW) detection, is a very important task for voice assistants.
This paper proposes a Speech Enhancement model adapted to the task of WUW detection.
It aims at increasing the recognition rate and reducing false alarms in noisy conditions.
arXiv Detail & Related papers (2021-01-29T18:44:05Z)
- Silent Speech Interfaces for Speech Restoration: A Review [59.68902463890532]
Silent speech interface (SSI) research aims to provide alternative and augmentative communication methods for persons with severe speech disorders.
SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication.
Most present-day SSIs have only been validated in laboratory settings for healthy users.
arXiv Detail & Related papers (2020-09-04T11:05:50Z)
- Understanding effect of speech perception in EEG based speech
recognition systems [3.5786621294068377]
Electroencephalography (EEG) signals recorded in parallel with speech are used to perform isolated and continuous speech recognition.
We investigate whether it is possible to separate out this speech perception component from EEG signals in order to design more robust EEG based speech recognition systems.
arXiv Detail & Related papers (2020-05-29T05:56:09Z)
- Continuous Silent Speech Recognition using EEG [3.5786621294068377]
We translate into text the EEG signals recorded in parallel while subjects read English sentences in their minds without producing any voice.
Our results demonstrate the feasibility of using EEG signals for performing continuous silent speech recognition.
arXiv Detail & Related papers (2020-02-06T18:28:45Z)
- Visually Guided Self Supervised Learning of Speech Representations [62.23736312957182]
We propose a framework for learning audio representations guided by the visual modality in the context of audiovisual speech.
We employ a generative audio-to-video training scheme in which we animate a still image corresponding to a given audio clip and optimize the generated video to be as close as possible to the real video of the speech segment.
We achieve state-of-the-art results for emotion recognition and competitive results for speech recognition.
arXiv Detail & Related papers (2020-01-13T14:53:22Z)
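The generative audio-to-video training scheme in the last entry above can be sketched compactly. The toy PyTorch snippet below animates a still image from audio features and matches the real video with an L1 reconstruction loss; the tiny stand-in modules, feature sizes, and loss choice are assumptions for illustration, not the paper's architecture. The useful by-product is the audio encoder, whose representations are what transfer to emotion and speech recognition.
```python
# Toy sketch of visually guided self-supervised audio representation
# learning: a generator animates a still face image conditioned on
# audio, and is trained to reconstruct the real video. Architectures,
# sizes, and the L1 loss are illustrative assumptions.
import torch
import torch.nn as nn

class AudioEncoder(nn.Module):
    def __init__(self, feat_dim=80, emb_dim=128):
        super().__init__()
        self.net = nn.GRU(feat_dim, emb_dim, batch_first=True)
    def forward(self, mel):                  # (B, T, 80) log-mel frames
        out, _ = self.net(mel)
        return out                           # (B, T, 128) per-frame embeddings

class FrameGenerator(nn.Module):
    """Predict each video frame from the still image plus audio embedding."""
    def __init__(self, emb_dim=128, img_hw=32):
        super().__init__()
        self.img_hw = img_hw
        self.fc = nn.Linear(emb_dim + img_hw * img_hw, img_hw * img_hw)
    def forward(self, still, audio_emb):     # still: (B, H, W)
        B, T, _ = audio_emb.shape
        flat = still.flatten(1).unsqueeze(1).expand(B, T, -1)
        frames = self.fc(torch.cat([audio_emb, flat], dim=-1))
        return frames.view(B, T, self.img_hw, self.img_hw)

enc, gen = AudioEncoder(), FrameGenerator()
opt = torch.optim.Adam(list(enc.parameters()) + list(gen.parameters()), lr=1e-4)

# One toy training step on random data: 2 clips, 25 frames, 32x32 video.
mel = torch.randn(2, 25, 80)
still = torch.randn(2, 32, 32)
real_video = torch.randn(2, 25, 32, 32)
loss = nn.functional.l1_loss(gen(still, enc(mel)), real_video)
opt.zero_grad(); loss.backward(); opt.step()
print(f"reconstruction loss: {loss.item():.3f}")
```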
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.