Convolutional Neural Networks and a Transfer Learning Strategy to
Classify Parkinson's Disease from Speech in Three Different Languages
- URL: http://arxiv.org/abs/2002.04374v1
- Date: Tue, 11 Feb 2020 13:48:38 GMT
- Authors: J. C. Vásquez-Correa, T. Arias-Vergara, C. D. Rios-Urrego, M.
Schuster, J. Rusz, J. R. Orozco-Arroyave, E. Nöth
- Abstract summary: This paper introduces a methodology to classify Parkinson's disease from speech in three different languages: Spanish, German, and Czech.
The proposed approach considers convolutional neural networks trained with time frequency representations and a transfer learning strategy among the three languages.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Parkinson's disease patients develop different speech impairments that affect
their communication capabilities. The automatic assessment of the patients'
speech allows the development of computer-aided tools to support the diagnosis
and the evaluation of disease severity. This paper introduces a methodology to
classify Parkinson's disease from speech in three different languages: Spanish,
German, and Czech. The proposed approach considers convolutional neural
networks trained with time-frequency representations and a transfer learning
strategy among the three languages. The transfer learning scheme aims to
improve the accuracy of the models when the weights of the neural network are
initialized with utterances from a language different from the one used for the
test set. The results suggest that the proposed strategy improves the accuracy
of the models by up to 8% when the base model used to initialize the weights of
the classifier is robust enough. In addition, the results obtained after
transfer learning are in most cases more balanced in terms of specificity and
sensitivity than those obtained without the transfer learning strategy.
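A minimal sketch may help make the transfer learning scheme described above concrete: a small CNN over time-frequency representations (e.g., mel-spectrograms) whose weights are initialized from a base model trained on a different language and then fine-tuned on the target language. The architecture, layer sizes, checkpoint name, and fine-tuning settings below are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal, hypothetical sketch (PyTorch) of the cross-lingual transfer scheme described
# in the abstract: a CNN over time-frequency representations whose weights are
# initialized from a base model trained on another language and then fine-tuned.
import torch
import torch.nn as nn


class SpectrogramCNN(nn.Module):
    """Small CNN classifier over (batch, 1, n_mels, n_frames) spectrogram inputs."""

    def __init__(self, n_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))


def transfer_from_base(target: SpectrogramCNN, base_ckpt: str) -> SpectrogramCNN:
    """Initialize the target-language model with weights learned on a base language."""
    base_state = torch.load(base_ckpt, map_location="cpu")
    target.load_state_dict(base_state, strict=False)  # copy matching layers only
    return target


# Stand-in for a base model trained on, e.g., Spanish (here just randomly initialized).
torch.save(SpectrogramCNN().state_dict(), "spanish_base_checkpoint.pt")

# Initialize a model for a different target language from the base weights, then
# fine-tune it on the target-language utterances with a small learning rate.
model = transfer_from_base(SpectrogramCNN(), "spanish_base_checkpoint.pt")
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
```

In this sketch, `load_state_dict(..., strict=False)` simply copies whichever base-model weights match the target model, which is one common way to carry a convolutional feature extractor across languages; the paper's exact weight-initialization procedure may differ.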
Related papers
- A Lesion-aware Edge-based Graph Neural Network for Predicting Language Ability in Patients with Post-stroke Aphasia [12.129896943547912]
We propose a lesion-aware graph neural network (LEGNet) to predict language ability from resting-state fMRI (rs-fMRI) connectivity in patients with post-stroke aphasia.
Our model integrates three components: an edge-based learning module that encodes functional connectivity between brain regions, a lesion encoding module, and a subgraph learning module.
arXiv Detail & Related papers (2024-09-03T21:28:48Z) - Adapting Mental Health Prediction Tasks for Cross-lingual Learning via Meta-Training and In-context Learning with Large Language Model [3.3590922002216193]
We use model-agnostic meta-learning and leverage large language models (LLMs) to address this gap.
We first apply a meta-learning model with self-supervision, which results in improved model initialisation for rapid adaptation and cross-lingual transfer.
In parallel, we use LLMs' in-context learning capabilities to assess their performance accuracy across the Swahili mental health prediction tasks.
arXiv Detail & Related papers (2024-04-13T17:11:35Z) - Neural Sign Actors: A diffusion model for 3D sign language production from text [51.81647203840081]
Sign Languages (SL) serve as the primary mode of communication for the Deaf and Hard of Hearing communities.
This work makes an important step towards realistic neural sign avatars, bridging the communication gap between Deaf and hearing communities.
arXiv Detail & Related papers (2023-12-05T12:04:34Z) - Automatically measuring speech fluency in people with aphasia: first
achievements using read-speech data [55.84746218227712]
This study aims at assessing the relevance of a signal processing algorithm, initially developed in the field of language acquisition, for the automatic measurement of speech fluency.
arXiv Detail & Related papers (2023-08-09T07:51:40Z) - Assessing Language Disorders using Artificial Intelligence: a Paradigm
Shift [0.13393465195776774]
Speech, language, and communication deficits are present in most neurodegenerative syndromes.
We argue that using machine learning methodologies, natural language processing, and modern artificial intelligence (AI) for Language Assessment is an improvement over conventional manual assessment.
arXiv Detail & Related papers (2023-05-31T17:20:45Z) - Federated learning for secure development of AI models for Parkinson's
disease detection using speech from different languages [10.04992537510352]
In this paper, we employ federated learning (FL) for PD detection using speech signals from 3 real-world language corpora of German, Spanish, and Czech.
Our results indicate that the FL model outperforms all the local models in terms of diagnostic accuracy, while not performing very differently from the model based on centrally combined training sets.
arXiv Detail & Related papers (2023-05-18T20:04:55Z) - Neural Language Models are not Born Equal to Fit Brain Data, but
Training Helps [75.84770193489639]
We examine the impact of test loss, training corpus and model architecture on the prediction of functional Magnetic Resonance Imaging timecourses of participants listening to an audiobook.
We find that untrained versions of each model already explain a significant amount of signal in the brain by capturing similarity in brain responses across identical words.
We suggest good practices for future studies aiming at explaining the human language system using neural language models.
arXiv Detail & Related papers (2022-07-07T15:37:17Z) - Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging
Features For Elderly And Dysarthric Speech Recognition [55.25565305101314]
Articulatory features are invariant to acoustic signal distortion and have been successfully incorporated into automatic speech recognition systems.
This paper presents a cross-domain and cross-lingual A2A inversion approach that utilizes the parallel audio and ultrasound tongue imaging (UTI) data of the 24-hour TaL corpus in A2A model pre-training.
Experiments conducted on three tasks suggested that systems incorporating the generated articulatory features consistently outperformed the baseline TDNN and Conformer ASR systems.
arXiv Detail & Related papers (2022-06-15T07:20:28Z) - Model-based analysis of brain activity reveals the hierarchy of language
in 305 subjects [82.81964713263483]
A popular approach to decompose the neural bases of language consists in correlating, across individuals, the brain responses to different stimuli.
Here, we show that a model-based approach can reach equivalent results within subjects exposed to natural stimuli.
arXiv Detail & Related papers (2021-10-12T15:30:21Z) - Factorized Neural Transducer for Efficient Language Model Adaptation [51.81097243306204]
We propose a novel model, factorized neural Transducer, by factorizing the blank and vocabulary prediction.
It is expected that this factorization can transfer the improvement of the standalone language model to the Transducer for speech recognition.
We demonstrate that the proposed factorized neural Transducer yields 15% to 20% WER improvements when out-of-domain text data is used for language model adaptation.
arXiv Detail & Related papers (2021-09-27T15:04:00Z) - Multi-Modal Detection of Alzheimer's Disease from Speech and Text [3.702631194466718]
We propose a deep learning method that utilizes speech and the corresponding transcript simultaneously to detect Alzheimer's disease (AD).
The proposed method achieves 85.3% 10-fold cross-validation accuracy when trained and evaluated on the Dementiabank Pitt corpus.
arXiv Detail & Related papers (2020-11-30T21:18:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.