A Few-Shot Approach to Dysarthric Speech Intelligibility Level
Classification Using Transformers
- URL: http://arxiv.org/abs/2309.09329v1
- Date: Sun, 17 Sep 2023 17:23:41 GMT
- Authors: Paleti Nikhil Chowdary, Vadlapudi Sai Aravind, Gorantla V N S L Vishnu
Vardhan, Menta Sai Akshay, Menta Sai Aashish, Jyothish Lal. G
- Abstract summary: Dysarthria is a speech disorder that hinders communication due to difficulties in articulating words.
Much of the literature has focused on improving ASR systems for dysarthric speech.
This work aims to develop models that can accurately classify the presence of dysarthria.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Dysarthria is a speech disorder that hinders communication due to
difficulties in articulating words. Detection of dysarthria is important for
several reasons as it can be used to develop a treatment plan and help improve
a person's quality of life and ability to communicate effectively. Much of the
literature has focused on improving ASR systems for dysarthric speech. The
objective of the current work is to develop models that can accurately classify
the presence of dysarthria and also give information about the intelligibility
level using limited data by employing a few-shot approach using a transformer
model. This work also aims to tackle the data leakage that is present in
previous studies. Our whisper-large-v2 transformer model trained on a subset of
the UASpeech dataset containing medium intelligibility level patients achieved
an accuracy of 85%, precision of 0.92, recall of 0.8, F1-score of 0.85, and
specificity of 0.91. Experimental results also demonstrate that the model
trained using the 'words' dataset performed better than those
trained on the 'letters' and 'digits' datasets. Moreover, the multiclass model
achieved an accuracy of 67%.
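The abstract reports accuracy, precision, recall, F1-score, and specificity for the binary (dysarthric vs. non-dysarthric) model. As a reminder of how these five figures relate to one another, here is a minimal sketch that derives them from confusion-matrix counts. The function name `binary_metrics` and the counts in the usage line are hypothetical illustrations; they are not taken from the paper and do not reproduce its results.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float
    specificity: float


def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> BinaryMetrics:
    """Derive standard binary-classification metrics from confusion-matrix counts.

    tp/fn count positive (e.g. dysarthric) samples classified correctly/incorrectly;
    tn/fp count negative (e.g. control) samples classified correctly/incorrectly.
    Any ratio with a zero denominator is reported as 0.0.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0          # also called sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0     # true-negative rate
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0  # harmonic mean of P and R
    return BinaryMetrics(accuracy, precision, recall, f1, specificity)


# Hypothetical counts for illustration only (NOT the paper's confusion matrix).
m = binary_metrics(tp=40, fp=10, tn=45, fn=5)
print(m)
```

Specificity is reported alongside recall because, in a screening setting, the two describe different errors: recall tracks missed dysarthric speakers, while specificity tracks healthy speakers flagged incorrectly.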
Related papers
- Leveraging Pre-trained Models for Robust Federated Learning for Kidney Stone Type Recognition [1.7243216387069678]
Using pre-trained models, this research suggests a strong FL framework to improve kidney stone diagnosis.
We achieved a peak accuracy of 84.1% with seven epochs and 10 rounds during LPO stage, and 77.2% during FRV stage, showing enhanced diagnostic accuracy and robustness against image corruption.
arXiv Detail & Related papers (2024-09-30T04:23:47Z)
- Automatic diagnosis of knee osteoarthritis severity using Swin transformer [55.01037422579516]
Knee osteoarthritis (KOA) is a widespread condition that can cause chronic pain and stiffness in the knee joint.
We propose an automated approach that employs the Swin Transformer to predict the severity of KOA.
arXiv Detail & Related papers (2023-07-10T09:49:30Z)
- Learning to diagnose cirrhosis from radiological and histological labels with joint self and weakly-supervised pretraining strategies [62.840338941861134]
We propose to leverage transfer learning from large datasets annotated by radiologists, to predict the histological score available on a small annex dataset.
We compare different pretraining methods, namely weakly-supervised and self-supervised ones, to improve the prediction of the cirrhosis.
This method outperforms the baseline classification of the METAVIR score, reaching an AUC of 0.84 and a balanced accuracy of 0.75.
arXiv Detail & Related papers (2023-02-16T17:06:23Z)
- Automatic Severity Classification of Dysarthric speech by using Self-supervised Model with Multi-task Learning [4.947423926765435]
We propose a novel automatic severity assessment method for dysarthric speech using the self-supervised model in conjunction with multi-task learning.
Wav2vec 2.0 XLS-R is trained for two different tasks: severity classification and auxiliary automatic speech recognition (ASR).
Our model outperforms the traditional baseline methods, with a relative percentage increase of 1.25% for F1-score.
arXiv Detail & Related papers (2022-10-27T12:48:10Z)
- ADT-SSL: Adaptive Dual-Threshold for Semi-Supervised Learning [68.53717108812297]
Semi-Supervised Learning (SSL) has advanced classification tasks by inputting both labeled and unlabeled data to train a model jointly.
This paper proposes an Adaptive Dual-Threshold method for Semi-Supervised Learning (ADT-SSL).
Experimental results show that the proposed ADT-SSL achieves state-of-the-art classification accuracy.
arXiv Detail & Related papers (2022-05-21T11:52:08Z)
- On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition [53.17176024917725]
Scarcity of speaker-level data limits the practical use of data-intensive model based speaker adaptation methods.
This paper proposes two novel forms of data-efficient, feature-based on-the-fly speaker adaptation methods.
arXiv Detail & Related papers (2022-03-28T09:12:24Z)
- Deep Learning-Based Detection of the Acute Respiratory Distress Syndrome: What Are the Models Learning? [5.827840113217155]
Acute respiratory distress syndrome (ARDS) is a severe form of hypoxemic respiratory failure with in-hospital mortality of 35-46%.
High mortality is thought to be related in part to challenges in making a prompt diagnosis, which may in turn delay implementation of evidence-based therapies.
A deep neural network (DNN) algorithm utilizing unbiased ventilator waveform data (VWD) may help to improve screening for ARDS.
arXiv Detail & Related papers (2021-09-25T09:10:10Z)
- Assessing clinical utility of Machine Learning and Artificial Intelligence approaches to analyze speech recordings in Multiple Sclerosis: A Pilot Study [1.6582693134062305]
The aim of this study was to determine the potential clinical utility of machine learning and deep learning/AI approaches for the aiding of diagnosis, biomarker extraction and progression monitoring of multiple sclerosis using speech recordings.
The Random Forest model performed best, achieving an Accuracy of 0.82 on the validation dataset and an area-under-curve of 0.76 across 5 k-fold cycles on the training dataset.
arXiv Detail & Related papers (2021-09-20T21:02:37Z)
- Bootstrapping Your Own Positive Sample: Contrastive Learning With Electronic Health Record Data [62.29031007761901]
This paper proposes a novel contrastive regularized clinical classification model.
We introduce two unique positive sampling strategies specifically tailored for EHR data.
Our framework yields highly competitive experimental results in predicting the mortality risk on real-world COVID-19 EHR data.
arXiv Detail & Related papers (2021-04-07T06:02:04Z)
- Self-Training with Improved Regularization for Sample-Efficient Chest X-Ray Classification [80.00316465793702]
We present a deep learning framework that enables robust modeling in challenging scenarios.
Our results show that using 85% less labeled data, we can build predictive models that match the performance of classifiers trained in a large-scale data setting.
arXiv Detail & Related papers (2020-05-03T02:36:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.