Improving EEG based continuous speech recognition using GAN
- URL: http://arxiv.org/abs/2006.01260v1
- Date: Fri, 29 May 2020 06:11:33 GMT
- Title: Improving EEG based continuous speech recognition using GAN
- Authors: Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik
- Abstract summary: We demonstrate that it is possible to generate more meaningful electroencephalography (EEG) features from raw EEG features using generative adversarial networks (GAN).
Our proposed approach can be implemented without using any additional sensor information, whereas in [1] the authors used additional features such as acoustic or articulatory information to improve the performance of EEG based continuous speech recognition systems.
- Score: 3.5786621294068377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we demonstrate that it is possible to generate more meaningful electroencephalography (EEG) features from raw EEG features using generative adversarial networks (GAN) to improve the performance of EEG based continuous speech recognition systems. We improve on the results demonstrated by the authors in [1], using their data sets, for some of the test time experiments, and for the other cases our results were comparable with theirs. Our proposed approach can be implemented without using any additional sensor information, whereas in [1] the authors used additional features such as acoustic or articulatory information to improve the performance of EEG based continuous speech recognition systems.
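Below is a minimal sketch of the kind of GAN setup the abstract describes: a generator maps raw EEG feature frames to enhanced features, and a discriminator learns to tell those apart from reference features. The GRU generator, MLP discriminator, feature dimensions, and training details are illustrative assumptions, not the architecture from the paper.

```python
# Hedged sketch only: the paper's exact generator/discriminator design,
# feature dimensions, and losses are not reproduced here.
import torch
import torch.nn as nn

class Generator(nn.Module):
    """Maps raw EEG feature frames to 'enhanced' EEG features."""
    def __init__(self, raw_dim=30, feat_dim=30, hidden=128):
        super().__init__()
        self.rnn = nn.GRU(raw_dim, hidden, batch_first=True)
        self.proj = nn.Linear(hidden, feat_dim)

    def forward(self, raw_eeg):               # (batch, time, raw_dim)
        h, _ = self.rnn(raw_eeg)
        return self.proj(h)                    # (batch, time, feat_dim)

class Discriminator(nn.Module):
    """Scores a feature sequence as reference (real) or generated."""
    def __init__(self, feat_dim=30, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.LeakyReLU(0.2),
            nn.Linear(hidden, 1))

    def forward(self, feats):                  # (batch, time, feat_dim)
        return self.net(feats).mean(dim=1)     # one logit per sequence

gen, disc = Generator(), Discriminator()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-4)
d_opt = torch.optim.Adam(disc.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

raw = torch.randn(8, 100, 30)   # stand-in raw EEG features
ref = torch.randn(8, 100, 30)   # stand-in reference EEG features

# Discriminator step: reference features -> 1, generated features -> 0.
fake = gen(raw).detach()
d_loss = bce(disc(ref), torch.ones(8, 1)) + bce(disc(fake), torch.zeros(8, 1))
d_opt.zero_grad(); d_loss.backward(); d_opt.step()

# Generator step: produce features the discriminator scores as real; the
# generator's output would then feed the downstream EEG-based ASR model.
g_loss = bce(disc(gen(raw)), torch.ones(8, 1))
g_opt.zero_grad(); g_loss.backward(); g_opt.step()
```

In this framing, once training converges, the generator's outputs would replace the raw EEG features as input to the continuous speech recognition model.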
Related papers
- NeuroSpex: Neuro-Guided Speaker Extraction with Cross-Modal Attention [47.8479647938849]
We present a neuro-guided speaker extraction model, i.e., NeuroSpex, using the EEG response of the listener as the sole auxiliary reference cue.
We propose a novel EEG signal encoder that captures the attention information. Additionally, we propose a cross-attention (CA) mechanism to enhance the speech feature representations (see the sketch after this list).
arXiv Detail & Related papers (2024-09-04T07:33:01Z)
- Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition [54.952250732643115]
We study Acoustic Word Embeddings (AWEs), a fixed-length feature derived from continuous representations, to explore their advantages in specific tasks.
AWEs have previously shown utility in capturing acoustic discriminability.
Our findings underscore the acoustic context conveyed by AWEs and showcase the highly competitive Speech Emotion Recognition accuracies.
arXiv Detail & Related papers (2024-02-04T21:24:54Z)
- DGSD: Dynamical Graph Self-Distillation for EEG-Based Auditory Spatial Attention Detection [49.196182908826565]
Auditory Attention Detection (AAD) aims to detect the target speaker from brain signals in a multi-speaker environment.
Current approaches primarily rely on traditional convolutional neural networks designed for processing Euclidean data such as images.
This paper proposes a dynamical graph self-distillation (DGSD) approach for AAD, which does not require speech stimuli as input.
arXiv Detail & Related papers (2023-09-07T13:43:46Z)
- Multimodal Emotion Recognition using Transfer Learning from Speaker Recognition and BERT-based models [53.31917090073727]
We propose a neural network-based emotion recognition framework that uses a late fusion of transfer-learned and fine-tuned models from speech and text modalities.
We evaluate the effectiveness of our proposed multimodal approach on the interactive emotional dyadic motion capture dataset.
arXiv Detail & Related papers (2022-02-16T00:23:42Z)
- An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition [98.70304981174748]
We focus on the general applications of pretrained speech representations to advanced end-to-end automatic speech recognition (E2E-ASR) models.
We select several pretrained speech representations and present the experimental results on various open-source and publicly available corpora for E2E-ASR.
arXiv Detail & Related papers (2021-10-09T15:06:09Z)
- Brain Signals to Rescue Aphasia, Apraxia and Dysarthria Speech Recognition [14.544989316741091]
We propose a deep learning-based algorithm to improve the performance of automatic speech recognition systems for aphasia, apraxia, and dysarthria speech.
We demonstrate a significant decoding performance improvement of more than 50% during test time for the isolated speech recognition task.
These results are a first step towards demonstrating the possibility of utilizing non-invasive neural signals to design a real-time, robust speech prosthetic for stroke survivors recovering from aphasia, apraxia, and dysarthria.
arXiv Detail & Related papers (2021-02-28T03:27:02Z)
- Constrained Variational Autoencoder for improving EEG based Speech Recognition Systems [3.5786621294068377]
We introduce a recurrent neural network (RNN) based variational autoencoder (VAE) model with a new constrained loss function.
We demonstrate that both continuous and isolated speech recognition systems can be trained and tested using EEG features generated from raw EEG features.
arXiv Detail & Related papers (2020-06-01T06:03:50Z)
- Understanding effect of speech perception in EEG based speech recognition systems [3.5786621294068377]
The electroencephalography (EEG) signals recorded in parallel with speech are used to perform isolated and continuous speech recognition.
We investigate whether it is possible to separate out this speech perception component from EEG signals in order to design more robust EEG based speech recognition systems.
arXiv Detail & Related papers (2020-05-29T05:56:09Z)
- Generating EEG features from Acoustic features [13.089515271477824]
We use recurrent neural network (RNN) based regression model and generative adversarial network (GAN) to predict EEG features from acoustic features.
We compare our results with the previously studied problem on speech synthesis using EEG.
arXiv Detail & Related papers (2020-02-29T16:44:08Z)
- Speech Synthesis using EEG [4.312746668772343]
We make use of a recurrent neural network (RNN) regression model to predict acoustic features directly from EEG features.
We provide EEG based speech synthesis results for four subjects in this paper.
arXiv Detail & Related papers (2020-02-22T03:53:45Z)
- Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention [70.82604384963679]
This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features.
We extract a speaker representation used for adaptation directly from the test utterance.
arXiv Detail & Related papers (2020-02-14T05:05:36Z)
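As referenced in the NeuroSpex entry above, the sketch below illustrates one plausible form of its cross-attention mechanism, assuming EEG-encoder outputs act as queries over speech-encoder outputs; the embedding size, head count, and residual fusion are assumptions for illustration, not the paper's implementation.

```python
# Hedged sketch: EEG embeddings attend over speech embeddings so that the
# speech features are conditioned on the listener's neural response.
import torch
import torch.nn as nn

class EEGSpeechCrossAttention(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, eeg_emb, speech_emb):
        # eeg_emb:    (batch, t_eeg, dim) queries from the EEG encoder
        # speech_emb: (batch, t_speech, dim) keys/values from the speech encoder
        attended, _ = self.attn(eeg_emb, speech_emb, speech_emb)
        return self.norm(eeg_emb + attended)  # residual + layer norm

xattn = EEGSpeechCrossAttention()
out = xattn(torch.randn(2, 50, 256), torch.randn(2, 200, 256))
print(out.shape)  # torch.Size([2, 50, 256])
```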