Brain Signals to Rescue Aphasia, Apraxia and Dysarthria Speech Recognition
- URL: http://arxiv.org/abs/2103.00383v1
- Date: Sun, 28 Feb 2021 03:27:02 GMT
- Title: Brain Signals to Rescue Aphasia, Apraxia and Dysarthria Speech Recognition
- Authors: Gautam Krishna, Mason Carnahan, Shilpa Shamapant, Yashitha
Surendranath, Saumya Jain, Arundhati Ghosh, Co Tran, Jose del R Millan and
Ahmed H Tewfik
- Abstract summary: We propose a deep learning-based algorithm to improve the performance of automatic speech recognition systems for aphasia, apraxia, and dysarthria speech.
We demonstrate a significant decoding performance improvement of more than 50% during test time for the isolated speech recognition task.
Results show the first step towards demonstrating the possibility of utilizing non-invasive neural signals to design a real-time robust speech prosthetic for stroke survivors recovering from aphasia, apraxia, and dysarthria.
- Score: 14.544989316741091
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we propose a deep learning-based algorithm to improve the
performance of automatic speech recognition (ASR) systems for aphasia, apraxia,
and dysarthria speech by utilizing electroencephalography (EEG) features
recorded synchronously with aphasia, apraxia, and dysarthria speech. We
demonstrate a significant decoding performance improvement of more than 50%
during test time for the isolated speech recognition task, and we also provide
preliminary results indicating performance improvement for the more challenging
continuous speech recognition task by utilizing EEG features. The results
presented in this paper show the first step towards demonstrating the
possibility of utilizing non-invasive neural signals to design a real-time
robust speech prosthetic for stroke survivors recovering from aphasia, apraxia,
and dysarthria. Our aphasia, apraxia, and dysarthria speech-EEG data set will
be released to the public to help further advance this interesting and crucial
research.
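The core idea of the abstract, combining synchronously recorded EEG features with acoustic features before feeding an ASR model, can be sketched as frame-wise concatenation. All names and dimensions below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fuse_features(acoustic, eeg):
    """Frame-wise concatenation of synchronously recorded acoustic and
    EEG feature matrices (frames x dims). Illustrative sketch only;
    the paper's actual fusion architecture may differ."""
    n = min(len(acoustic), len(eeg))  # align frame counts
    return np.concatenate([acoustic[:n], eeg[:n]], axis=1)

# toy example: 100 frames of 13-dim MFCCs and 30-dim EEG features
mfcc = np.random.randn(100, 13)
eeg = np.random.randn(100, 30)
fused = fuse_features(mfcc, eeg)
print(fused.shape)  # (100, 43)
```

The fused frames would then be fed to the recognizer in place of acoustic features alone.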
Related papers
- Exploring Speech Pattern Disorders in Autism using Machine Learning [12.469348589699766]
This study presents a comprehensive approach to identify distinctive speech patterns through the analysis of examiner-patient dialogues.
We extracted 40 speech-related features, categorized into frequency, zero-crossing rate, energy, spectral characteristics, Mel Frequency Cepstral Coefficients (MFCCs) and balance.
The classification model aimed to differentiate between ASD and non-ASD cases, achieving an accuracy of 87.75%.
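Two of the feature families this entry mentions, short-time energy and zero-crossing rate, can be computed directly from the waveform. The frame and hop sizes below (25 ms / 10 ms at 16 kHz) are common defaults, assumed here for illustration.

```python
import numpy as np

def frame_features(signal, frame_len=400, hop=160):
    """Per-frame short-time energy and zero-crossing rate for a 1-D
    waveform. Frame/hop sizes are illustrative assumptions."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = float(np.sum(frame ** 2))
        # fraction of adjacent sample pairs whose sign changes
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
        feats.append((energy, zcr))
    return np.array(feats)

tone = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)  # 1 s of 440 Hz
print(frame_features(tone).shape)  # (98, 2)
```

MFCC and spectral features would typically be extracted with a dedicated audio library and stacked alongside these.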
arXiv Detail & Related papers (2024-05-03T02:59:15Z)
- A New Benchmark of Aphasia Speech Recognition and Detection Based on E-Branchformer and Multi-task Learning [29.916793641951507]
This paper presents a new benchmark for aphasia speech recognition using state-of-the-art speech recognition techniques.
We introduce two multi-task learning methods based on the CTC/Attention architecture to perform both tasks simultaneously.
Our system achieves state-of-the-art speaker-level detection accuracy (97.3%) and a relative WER reduction of 11% for moderate aphasia patients.
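Hybrid CTC/attention multi-task training of the kind this entry describes usually interpolates the two losses with a weight. A minimal sketch, where the weight value and loss magnitudes are illustrative assumptions only:

```python
def multitask_loss(ctc_loss, att_loss, lam=0.3):
    """Interpolated multi-task objective: lam * CTC + (1 - lam) * attention.
    lam is a tunable hyperparameter; 0.3 is a common default in hybrid
    CTC/attention recipes, used here purely as an illustration."""
    return lam * ctc_loss + (1.0 - lam) * att_loss

print(multitask_loss(2.0, 1.0))  # 1.3
```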
arXiv Detail & Related papers (2023-05-19T15:10:36Z)
- Decoding speech perception from non-invasive brain recordings [48.46819575538446]
We introduce a model trained with contrastive-learning to decode self-supervised representations of perceived speech from non-invasive recordings.
Our model can identify, from 3 seconds of MEG signals, the corresponding speech segment with up to 41% accuracy out of more than 1,000 distinct possibilities.
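The retrieval step this entry describes, picking the matching speech segment out of 1,000+ candidates, amounts to ranking candidate embeddings by similarity to the brain-signal embedding. A sketch with random vectors (the encoders themselves, and all dimensions, are assumed, not taken from the paper):

```python
import numpy as np

def retrieve(meg_embedding, candidate_embeddings):
    """Return the index of the candidate speech embedding most similar
    (cosine similarity) to the MEG-derived embedding. Embeddings are
    assumed to come from already-trained encoders."""
    q = meg_embedding / np.linalg.norm(meg_embedding)
    c = candidate_embeddings / np.linalg.norm(
        candidate_embeddings, axis=1, keepdims=True)
    return int(np.argmax(c @ q))

rng = np.random.default_rng(0)
candidates = rng.standard_normal((1000, 64))  # 1,000 candidate segments
query = candidates[417] + 0.1 * rng.standard_normal(64)  # noisy view of 417
print(retrieve(query, candidates))  # 417
```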
arXiv Detail & Related papers (2022-08-25T10:01:43Z)
- Recent Progress in the CUHK Dysarthric Speech Recognition System [66.69024814159447]
Disordered speech presents a wide spectrum of challenges to current data intensive deep neural networks (DNNs) based automatic speech recognition technologies.
This paper presents recent research efforts at the Chinese University of Hong Kong to improve the performance of disordered speech recognition systems.
arXiv Detail & Related papers (2022-01-15T13:02:40Z)
- Investigation of Data Augmentation Techniques for Disordered Speech Recognition [69.50670302435174]
This paper investigates a set of data augmentation techniques for disordered speech recognition.
Both normal and disordered speech were exploited in the augmentation process.
The final speaker-adapted system, constructed using the UASpeech corpus and the best augmentation approach based on speed perturbation, produced up to a 2.92% absolute word error rate (WER) reduction.
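Speed perturbation, the augmentation this entry highlights, resamples the waveform by a factor. A minimal sketch using linear interpolation (production recipes use a proper resampler; the factors shown are only illustrative):

```python
import numpy as np

def speed_perturb(signal, factor):
    """Resample a 1-D waveform to change its speed by `factor`
    (>1 = faster, hence shorter). Simple linear interpolation;
    treat this as a sketch, not a production resampler."""
    n_out = int(round(len(signal) / factor))
    old_idx = np.linspace(0, len(signal) - 1, num=n_out)
    return np.interp(old_idx, np.arange(len(signal)), signal)

x = np.sin(np.linspace(0, 10, 1600))
print(len(speed_perturb(x, 1.1)), len(speed_perturb(x, 0.9)))  # 1455 1778
```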
arXiv Detail & Related papers (2022-01-14T17:09:22Z)
- The Effectiveness of Time Stretching for Enhancing Dysarthric Speech for Improved Dysarthric Speech Recognition [24.07996218669781]
We investigate existing and a new state-of-the-art generative adversarial network-based (GAN) voice conversion method for enhancing dysarthric speech for improved dysarthric speech recognition.
We find that straightforward signal processing methods such as stationary noise removal and vocoder-based time stretching lead to dysarthric speech recognition results comparable to those obtained when using state-of-the-art GAN-based voice conversion methods.
arXiv Detail & Related papers (2022-01-13T11:56:13Z)
- A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion [50.040466658605524]
We propose a new paradigm for maintaining speaker identity in dysarthric voice conversion (DVC).
The poor quality of dysarthric speech can be greatly improved by statistical VC.
But as the normal speech utterances of a dysarthria patient are nearly impossible to collect, previous work failed to recover the individuality of the patient.
arXiv Detail & Related papers (2021-06-02T18:41:03Z)
- NUVA: A Naming Utterance Verifier for Aphasia Treatment [49.114436579008476]
Assessment of speech performance using picture-naming tasks is a key method for both diagnosis and monitoring of responses to treatment interventions by people with aphasia (PWA).
Here we present NUVA, an utterance verification system incorporating a deep learning element that classifies 'correct' versus 'incorrect' naming attempts from aphasic stroke patients.
When tested on eight native British-English-speaking PWA, the system's accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%.
arXiv Detail & Related papers (2021-02-10T13:00:29Z)
- Constrained Variational Autoencoder for improving EEG based Speech Recognition Systems [3.5786621294068377]
We introduce a recurrent neural network (RNN) based variational autoencoder (VAE) model with a new constrained loss function.
We demonstrate that both continuous and isolated speech recognition systems can be trained and tested using EEG features generated from raw EEG features.
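The paper's specific constrained loss is not reproduced here, but a constrained VAE objective generally takes the shape of a standard VAE loss (reconstruction plus KL divergence) with an added penalty term. A generic sketch, with every term and weight an assumption for illustration:

```python
import numpy as np

def vae_loss(x, x_recon, mu, logvar, constraint=0.0, beta=1.0):
    """Generic VAE objective: reconstruction + beta * KL divergence,
    plus an extra `constraint` penalty. The paper's actual constrained
    loss is not reproduced; this only shows the overall shape."""
    recon = np.mean((x - x_recon) ** 2)          # reconstruction error
    kl = -0.5 * np.mean(1 + logvar - mu ** 2 - np.exp(logvar))  # KL term
    return recon + beta * kl + constraint

x = np.zeros(8)
print(vae_loss(x, x, np.zeros(4), np.zeros(4)))  # 0.0
```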
arXiv Detail & Related papers (2020-06-01T06:03:50Z)
- Understanding effect of speech perception in EEG based speech recognition systems [3.5786621294068377]
The electroencephalography (EEG) signals recorded in parallel with speech are used to perform isolated and continuous speech recognition.
We investigate whether it is possible to separate out this speech perception component from EEG signals in order to design more robust EEG based speech recognition systems.
arXiv Detail & Related papers (2020-05-29T05:56:09Z)
- Speech Enhancement using Self-Adaptation and Multi-Head Self-Attention [70.82604384963679]
This paper investigates a self-adaptation method for speech enhancement using auxiliary speaker-aware features.
We extract a speaker representation used for adaptation directly from the test utterance.
arXiv Detail & Related papers (2020-02-14T05:05:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.