Understanding effect of speech perception in EEG based speech
recognition systems
- URL: http://arxiv.org/abs/2006.01261v1
- Date: Fri, 29 May 2020 05:56:09 GMT
- Title: Understanding effect of speech perception in EEG based speech
recognition systems
- Authors: Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
- Abstract summary: The electroencephalography (EEG) signals recorded in parallel with speech are used to perform isolated and continuous speech recognition.
We investigate whether it is possible to separate out this speech perception component from EEG signals in order to design more robust EEG based speech recognition systems.
- Score: 3.5786621294068377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The electroencephalography (EEG) signals recorded in parallel with speech are
used to perform isolated and continuous speech recognition. During the speaking
process, one also hears one's own speech, and this speech perception is
also reflected in the recorded EEG signals. In this paper we investigate
whether it is possible to separate out this speech perception component from
EEG signals in order to design more robust EEG based speech recognition
systems. We further demonstrate predicting EEG signals recorded in parallel
with speaking from EEG signals recorded in parallel with passive listening, and
vice versa, with very low normalized root mean squared error (RMSE). We finally
demonstrate both isolated and continuous speech recognition using EEG signals
recorded in parallel with listening and speaking, and we improve on the previous
connectionist temporal classification (CTC) model results demonstrated by the
authors in [1] using their data set.
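To make the cross-condition prediction experiment concrete, here is a minimal sketch of an EEG-to-EEG regression scored with normalized RMSE. This is not the authors' exact model: the GRU architecture, feature dimensions, and the choice to normalize RMSE by the target's dynamic range are all illustrative assumptions.

```python
# Minimal sketch (not the authors' exact model): predict "speaking" EEG
# features from "listening" EEG features with a GRU regressor, and score
# the prediction with normalized RMSE as the abstract describes.
import torch
import torch.nn as nn

class EEGRegressor(nn.Module):
    """GRU that maps one EEG feature sequence to another (assumed setup)."""
    def __init__(self, n_features=30, hidden=128):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):            # x: (batch, time, n_features)
        h, _ = self.gru(x)
        return self.out(h)           # same shape as x

def normalized_rmse(pred, target):
    # Normalizing by the target's range is one common convention; the
    # paper's exact normalization is not specified in this summary.
    rmse = torch.sqrt(torch.mean((pred - target) ** 2))
    return rmse / (target.max() - target.min())

model = EEGRegressor()
listen_eeg = torch.randn(8, 200, 30)   # EEG recorded during listening
speak_eeg = torch.randn(8, 200, 30)    # EEG recorded during speaking
pred = model(listen_eeg)
print(normalized_rmse(pred, speak_eeg).item())
```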
Related papers
- NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention [47.8479647938849]
We present a neuro-guided speaker extraction model, NeuroSpex, which uses the EEG response of the listener as the sole auxiliary reference cue.
We propose a novel EEG signal encoder that captures the attention information, and a cross-attention (CA) mechanism to enhance the speech feature representations.
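A rough sketch of such a cross-attention block is below. The direction of attention (speech features querying the EEG embedding), the shapes, and the layer sizes are assumptions for illustration, not the NeuroSpex architecture itself.

```python
# Hedged sketch of a cross-attention fusion block: speech features attend
# to an EEG-derived reference embedding (direction and sizes assumed).
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    def __init__(self, d_model=256, n_heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, speech, eeg):
        # speech: (batch, T_speech, d_model), eeg: (batch, T_eeg, d_model)
        fused, _ = self.attn(query=speech, key=eeg, value=eeg)
        return self.norm(speech + fused)   # residual connection

fusion = CrossAttentionFusion()
out = fusion(torch.randn(2, 100, 256), torch.randn(2, 50, 256))
print(out.shape)  # torch.Size([2, 100, 256])
```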
arXiv Detail & Related papers (2024-09-04T07:33:01Z)
- Audio-visual End-to-end Multi-channel Speech Separation, Dereverberation and Recognition [52.11964238935099]
An audio-visual multi-channel speech separation, dereverberation and recognition approach is proposed in this paper.
The advantage of the video input is consistently demonstrated in mask-based MVDR speech separation and in DNN-WPE or spectral mapping (SpecM) based speech dereverberation front-ends.
Experiments were conducted on overlapped and reverberant speech data constructed using simulation or replay of the Oxford LRS2 dataset.
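For reference, a standard Souden-style MVDR weight computation, of the kind used in mask-based front-ends like the one mentioned above, looks as follows; this is a textbook recipe, not necessarily the paper's exact implementation.

```python
# Illustrative Souden-style MVDR beamforming weights. psd_s and psd_n are
# speech and noise spatial covariance matrices per frequency bin.
import torch

def souden_mvdr_weights(psd_s, psd_n, ref_channel=0, eps=1e-8):
    # psd_s, psd_n: complex tensors of shape (freq, channels, channels)
    numerator = torch.linalg.solve(psd_n, psd_s)          # Phi_n^{-1} Phi_s
    trace = numerator.diagonal(dim1=-2, dim2=-1).sum(-1)  # per-bin trace
    weights = numerator / (trace[..., None, None] + eps)
    return weights[..., ref_channel]                      # (freq, channels)

F, C = 257, 4
psd_s = torch.randn(F, C, C, dtype=torch.complex64)
psd_n = torch.eye(C, dtype=torch.complex64).expand(F, C, C)
w = souden_mvdr_weights(psd_s, psd_n)
print(w.shape)  # torch.Size([257, 4])
```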
arXiv Detail & Related papers (2023-07-06T10:50:46Z)
- Inner speech recognition through electroencephalographic signals [2.578242050187029]
This work focuses on inner speech recognition starting from EEG signals.
The decoding of the EEG into text should be understood as the classification of a limited number of words (commands).
Speech-related BCIs provide effective vocal communication strategies for controlling devices through speech commands interpreted from brain signals.
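As a concrete illustration of this limited-vocabulary framing, here is a minimal EEG command classifier; the channel count, window length, and architecture are entirely assumed placeholders.

```python
# Minimal sketch: classify an EEG window into one of a few command words.
import torch
import torch.nn as nn

n_channels, n_samples, n_commands = 64, 512, 8   # assumed dimensions

classifier = nn.Sequential(
    nn.Conv1d(n_channels, 32, kernel_size=7, padding=3),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),   # pool over time
    nn.Flatten(),
    nn.Linear(32, n_commands), # logits over the command vocabulary
)

eeg_window = torch.randn(4, n_channels, n_samples)
logits = classifier(eeg_window)
print(logits.argmax(dim=1))    # predicted command index per trial
```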
arXiv Detail & Related papers (2022-10-11T08:29:12Z)
- Audio-visual multi-channel speech separation, dereverberation and recognition [70.34433820322323]
This paper proposes an audio-visual multi-channel speech separation, dereverberation and recognition approach.
The advantage of the additional visual modality over using audio only is demonstrated on two neural dereverberation approaches.
Experiments conducted on the LRS2 dataset suggest that the proposed audio-visual multi-channel speech separation, dereverberation and recognition system outperforms the baseline.
arXiv Detail & Related papers (2022-04-05T04:16:03Z)
- Streaming Multi-talker Speech Recognition with Joint Speaker Identification [77.46617674133556]
SURIT employs the recurrent neural network transducer (RNN-T) as the backbone for both speech recognition and speaker identification.
We validate our idea on a multi-talker dataset derived from Librispeech and present encouraging results.
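The RNN-T backbone referenced here has a characteristic three-part structure. The following is a generic sketch of that structure with assumed sizes, not SURIT itself.

```python
# Sketch of an RNN-T style backbone: an audio encoder, a label prediction
# network, and a joiner producing per-(frame, token) logits.
import torch
import torch.nn as nn

class TinyTransducer(nn.Module):
    def __init__(self, feat_dim=80, vocab=29, hidden=256):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.embed = nn.Embedding(vocab, hidden)
        self.pred_rnn = nn.LSTM(hidden, hidden, batch_first=True)
        self.joiner = nn.Linear(hidden, vocab)

    def forward(self, audio, labels):
        enc, _ = self.encoder(audio)                     # (B, T, H)
        pred, _ = self.pred_rnn(self.embed(labels))      # (B, U, H)
        # joint: broadcast-add every frame against every label position
        joint = enc.unsqueeze(2) + pred.unsqueeze(1)     # (B, T, U, H)
        return self.joiner(torch.tanh(joint))            # (B, T, U, vocab)

model = TinyTransducer()
logits = model(torch.randn(2, 50, 80), torch.randint(0, 29, (2, 10)))
print(logits.shape)  # torch.Size([2, 50, 10, 29])
```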
arXiv Detail & Related papers (2021-04-05T18:37:33Z)
- Continuous Speech Separation with Conformer [60.938212082732775]
We use transformer and conformer in lieu of recurrent neural networks in the separation system.
We believe capturing global information with the self-attention based method is crucial for speech separation.
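A minimal sketch of self-attention-based mask estimation for separation is shown below; the spectrogram dimensions, layer counts, and sigmoid masking are assumptions in the spirit of these transformer/conformer separators, not the paper's model.

```python
# Sketch: a transformer encoder estimates one mask per speaker over a
# magnitude spectrogram (sizes and masking scheme assumed).
import torch
import torch.nn as nn

class AttentionSeparator(nn.Module):
    def __init__(self, n_bins=257, d_model=256, n_speakers=2):
        super().__init__()
        self.proj = nn.Linear(n_bins, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=4)
        self.masks = nn.Linear(d_model, n_bins * n_speakers)
        self.n_speakers, self.n_bins = n_speakers, n_bins

    def forward(self, spec):  # spec: (batch, time, n_bins) magnitudes
        h = self.encoder(self.proj(spec))
        m = torch.sigmoid(self.masks(h))
        m = m.view(spec.size(0), -1, self.n_speakers, self.n_bins)
        return m * spec.unsqueeze(2)   # one masked spectrogram per speaker

sep = AttentionSeparator()
print(sep(torch.rand(2, 100, 257)).shape)  # torch.Size([2, 100, 2, 257])
```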
arXiv Detail & Related papers (2020-08-13T09:36:05Z)
- Constrained Variational Autoencoder for improving EEG based Speech Recognition Systems [3.5786621294068377]
We introduce a recurrent neural network (RNN) based variational autoencoder (VAE) model with a new constrained loss function.
We demonstrate that both continuous and isolated speech recognition systems, trained and tested using EEG features generated from raw EEG features by the proposed model, show improved performance.
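For orientation, here is a rough sketch of an RNN-based VAE over EEG feature sequences with the standard reconstruction and KL terms. The paper's "constrained" loss term is specific to the authors and is not reproduced here; all dimensions are assumed.

```python
# Rough sketch of an RNN-based VAE over EEG feature sequences.
import torch
import torch.nn as nn

class RNNVAE(nn.Module):
    def __init__(self, n_features=30, hidden=128, latent=32):
        super().__init__()
        self.enc = nn.GRU(n_features, hidden, batch_first=True)
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.GRU(latent, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_features)

    def forward(self, x):                      # x: (batch, time, features)
        h, _ = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        d, _ = self.dec(z)
        return self.out(d), mu, logvar

x = torch.randn(4, 100, 30)
recon, mu, logvar = RNNVAE()(x)
recon_loss = nn.functional.mse_loss(recon, x)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon_loss + kl    # the paper adds a further constraint term here
print(loss.item())
```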
arXiv Detail & Related papers (2020-06-01T06:03:50Z)
- Predicting Different Acoustic Features from EEG and towards direct synthesis of Audio Waveform from EEG [3.5786621294068377]
The authors provide preliminary results for synthesizing speech from electroencephalography (EEG) features.
A deep learning model takes raw EEG waveform signals as input and directly produces an audio waveform as output.
Results presented in this paper show how different acoustic features are related to non-invasive neural EEG signals recorded during speech perception and production.
arXiv Detail & Related papers (2020-05-29T05:50:03Z)
- Speech Synthesis using EEG [4.312746668772343]
We make use of a recurrent neural network (RNN) regression model to predict acoustic features directly from EEG features.
We provide EEG based speech synthesis results for four subjects in this paper.
arXiv Detail & Related papers (2020-02-22T03:53:45Z)
- Continuous Silent Speech Recognition using EEG [3.5786621294068377]
We translate to text the EEG signals recorded in parallel while subjects silently read English sentences in their mind, without producing any voice.
Our results demonstrate the feasibility of using EEG signals for performing continuous silent speech recognition.
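The continuous recognition setup here, like that of the main paper, is typically built on CTC. The following is a minimal, assumed sketch of such a pipeline: an encoder emits per-frame character logits and CTC aligns them to the transcript; the vocabulary, encoder, and feature sizes are placeholders.

```python
# Sketch of a CTC setup for continuous EEG-to-text recognition.
import torch
import torch.nn as nn

vocab = 28                                  # blank + 26 letters + space
encoder = nn.GRU(input_size=30, hidden_size=128, batch_first=True)
head = nn.Linear(128, vocab)
ctc = nn.CTCLoss(blank=0)

eeg = torch.randn(4, 200, 30)               # (batch, frames, EEG features)
h, _ = encoder(eeg)
log_probs = head(h).log_softmax(-1).transpose(0, 1)  # (frames, batch, vocab)

targets = torch.randint(1, vocab, (4, 20))  # character indices, no blanks
input_lengths = torch.full((4,), 200, dtype=torch.long)
target_lengths = torch.full((4,), 20, dtype=torch.long)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()
print(loss.item())
```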
arXiv Detail & Related papers (2020-02-06T18:28:45Z)
- Continuous speech separation: dataset and analysis [52.10378896407332]
In natural conversations, a speech signal is continuous, containing both overlapped and overlap-free components.
This paper describes a dataset and protocols for evaluating continuous speech separation algorithms.
arXiv Detail & Related papers (2020-01-30T18:01:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.