NeuroVoz: a Castillian Spanish corpus of parkinsonian speech
- URL: http://arxiv.org/abs/2403.02371v3
- Date: Wed, 26 Feb 2025 15:42:41 GMT
- Title: NeuroVoz: a Castillian Spanish corpus of parkinsonian speech
- Authors: Janaína Mendes-Laureano, Jorge A. Gómez-García, Alejandro Guerrero-López, Elisa Luque-Buzo, Julián D. Arias-Londoño, Francisco J. Grandas-Pérez, Juan I. Godino-Llorente
- Abstract summary: This manuscript presents the NeuroVoz corpus consisting of 112 native Castilian-Spanish speakers, including 58 healthy controls and 54 individuals with PD. The dataset is also complemented with subjective assessments of voice quality performed by an expert according to the GRBAS scale. This data set has already supported several studies, achieving a benchmark accuracy of 89% for the screening of PD.
- Score: 34.916222066004465
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The screening of Parkinson's Disease (PD) through speech is hindered by a notable lack of publicly available datasets in different languages. This fact limits the reproducibility and further exploration of existing research. To address this gap, this manuscript presents the NeuroVoz corpus consisting of 112 native Castilian-Spanish speakers, including 58 healthy controls and 54 individuals with PD, all recorded in ON state. The corpus showcases a diverse array of speech tasks: sustained vowels; diadochokinetic tests; 16 Listen-and-Repeat utterances; and spontaneous monologues. The dataset is also complemented with subjective assessments of voice quality performed by an expert according to the GRBAS scale (Grade/Roughness/Breathiness/Asthenia/Strain), as well as annotations with a thorough examination of phonation quality, intensity, speed, resonance, intelligibility, and prosody. The corpus offers a substantial resource for the exploration of the impact of PD on speech. This data set has already supported several studies, achieving a benchmark accuracy of 89% for the screening of PD. Despite these advances, the broader challenge of conducting a language-agnostic, cross-corpora analysis of Parkinsonian speech patterns remains open.
Related papers
- Does Language Matter for Early Detection of Parkinson's Disease from Speech? [9.968776083852813]
Using speech samples as a biomarker is a promising avenue for detecting and monitoring the progression of Parkinson's disease (PD). To assess the role of language in PD detection, we tested pretrained models with varying data types and pretraining objectives.
arXiv Detail & Related papers (2025-07-14T19:23:09Z)
- Interpretable Early Detection of Parkinson's Disease through Speech Analysis [0.24466725954625887]
We propose a deep learning approach for early Parkinson's disease detection from speech recordings.
This approach seeks to associate predictive speech patterns with articulatory features.
We evaluated our approach using the Italian Parkinson's Voice and Speech Database, containing 831 audio recordings from 65 participants.
arXiv Detail & Related papers (2025-04-24T16:50:52Z)
- Language-Agnostic Analysis of Speech Depression Detection [2.5764071253486636]
This work analyzes automatic speech-based depression detection across two languages, English and Malayalam.
A CNN model is trained to identify acoustic features associated with depression in speech, focusing on both languages.
Our findings and collected data could contribute to the development of language-agnostic speech-based depression detection systems.
arXiv Detail & Related papers (2024-09-23T07:35:56Z)
- Self-supervised Speech Models for Word-Level Stuttered Speech Detection [66.46810024006712]
We introduce a word-level stuttering speech detection model leveraging self-supervised speech models.
Our evaluation demonstrates that our model surpasses previous approaches in word-level stuttering speech detection.
arXiv Detail & Related papers (2024-09-16T20:18:20Z)
- Enhancing Voice Wake-Up for Dysarthria: Mandarin Dysarthria Speech Corpus Release and Customized System Design [58.50329724298128]
This paper addresses the wake-up word spotting (WWS) task for dysarthric individuals, aiming to integrate them into real-world applications.
We release the open-source Mandarin Dysarthria Speech Corpus (MDSC), a dataset designed for dysarthric individuals in home environments.
We also develop a customized dysarthria WWS system that showcases robustness in handling intelligibility and achieving exceptional performance.
arXiv Detail & Related papers (2024-06-14T03:06:55Z)
- EARS: An Anechoic Fullband Speech Dataset Benchmarked for Speech Enhancement and Dereverberation [83.29199726650899]
The EARS dataset comprises 107 speakers from diverse backgrounds, totaling 100 hours of clean, anechoic speech data.
The dataset covers a large range of different speaking styles, including emotional speech, different reading styles, non-verbal sounds, and conversational freeform speech.
We benchmark various methods for speech enhancement and dereverberation on the dataset and evaluate their performance through a set of instrumental metrics.
arXiv Detail & Related papers (2024-06-10T11:28:29Z)
- Exploring Speech Pattern Disorders in Autism using Machine Learning [12.469348589699766]
This study presents a comprehensive approach to identify distinctive speech patterns through the analysis of examiner-patient dialogues.
We extracted 40 speech-related features, categorized into frequency, zero-crossing rate, energy, spectral characteristics, Mel Frequency Cepstral Coefficients (MFCCs) and balance.
The classification model aimed to differentiate between ASD and non-ASD cases, achieving an accuracy of 87.75%.
arXiv Detail & Related papers (2024-05-03T02:59:15Z)
- Decoding speech perception from non-invasive brain recordings [48.46819575538446]
We introduce a model trained with contrastive-learning to decode self-supervised representations of perceived speech from non-invasive recordings.
Our model can identify, from 3 seconds of MEG signals, the corresponding speech segment with up to 41% accuracy out of more than 1,000 distinct possibilities.
arXiv Detail & Related papers (2022-08-25T10:01:43Z)
- Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition [55.25565305101314]
Articulatory features are invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition systems.
This paper presents a cross-domain and cross-lingual acoustic-to-articulatory (A2A) inversion approach that utilizes the parallel audio and ultrasound tongue imaging (UTI) data of the 24-hour TaL corpus in A2A model pre-training.
Experiments conducted on three tasks suggested that incorporating the generated articulatory features consistently outperformed the baseline TDNN and Conformer ASR systems.
arXiv Detail & Related papers (2022-06-15T07:20:28Z)
- Parkinson's disease diagnostics using AI and natural language knowledge transfer [0.0]
A deep learning approach for the classification of raw speech recordings from patients with diagnosed PD was proposed.
The method was tested on a group of 38 PD patients and 10 healthy persons above the age of 50.
arXiv Detail & Related papers (2022-04-26T19:39:29Z)
- Cross-lingual Self-Supervised Speech Representations for Improved Dysarthric Speech Recognition [15.136348385992047]
This study explores the usefulness of Wav2Vec self-supervised speech representations as features for training an ASR system for dysarthric speech.
We train an acoustic model with features extracted from Wav2Vec, HuBERT, and the cross-lingual XLSR model.
Results suggest that speech representations pretrained on large unlabelled data can improve word error rate (WER) performance.
arXiv Detail & Related papers (2022-04-04T17:36:01Z)
- Investigation of Data Augmentation Techniques for Disordered Speech Recognition [69.50670302435174]
This paper investigates a set of data augmentation techniques for disordered speech recognition.
Both normal and disordered speech were exploited in the augmentation process.
The final speaker-adapted system, constructed using the UASpeech corpus and the best augmentation approach based on speed perturbation, produced up to a 2.92% absolute word error rate (WER) reduction.
arXiv Detail & Related papers (2022-01-14T17:09:22Z)
- The Phonetic Footprint of Parkinson's Disease [16.64383793837174]
Parkinson's disease (PD) has a significant impact on the fine motor skills of patients.
Characteristic patterns such as vowel instability, slurred pronunciation and slow speech can often be observed in the affected individuals.
We used a phonetic recognizer trained exclusively on healthy speech data to investigate how PD affected the phonetic footprint of patients.
arXiv Detail & Related papers (2021-12-21T20:44:21Z)
- Silent Speech Interfaces for Speech Restoration: A Review [59.68902463890532]
Silent speech interface (SSI) research aims to provide alternative and augmentative communication methods for persons with severe speech disorders.
SSIs rely on non-acoustic biosignals generated by the human body during speech production to enable communication.
Most present-day SSIs have only been validated in laboratory settings for healthy users.
arXiv Detail & Related papers (2020-09-04T11:05:50Z)
- Detecting Parkinson's Disease From an Online Speech-task [4.968576908394359]
In this paper, we envision a web-based framework that can help anyone, anywhere around the world, record a short speech task and analyze the recorded data to screen for Parkinson's disease (PD).
We collected data from 726 unique participants (262 PD, 38% female; 464 non-PD, 65% female; average age: 61) from all over the US and beyond.
We extracted both standard acoustic features (MFCC), jitter and shimmer variants, and deep learning based features from the speech data.
Our model performed equally well on data collected in a controlled lab environment and 'in the wild'.
arXiv Detail & Related papers (2020-09-02T21:16:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.