Bridging the Perceptual - Statistical Gap in Dysarthria Assessment: Why Machine Learning Still Falls Short
- URL: http://arxiv.org/abs/2510.22237v1
- Date: Sat, 25 Oct 2025 09:44:31 GMT
- Title: Bridging the Perceptual - Statistical Gap in Dysarthria Assessment: Why Machine Learning Still Falls Short
- Authors: Krishna Gurugubelli
- Abstract summary: Automated dysarthria detection and severity assessment from speech have attracted significant research attention due to their potential clinical impact. Despite rapid progress in acoustic modeling and deep learning, models still fall short of human expert performance. This manuscript provides a comprehensive analysis of the reasons behind this gap, emphasizing a conceptual divergence we term the "perceptual-statistical gap".
- Score: 3.4181221698258066
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated dysarthria detection and severity assessment from speech have attracted significant research attention due to their potential clinical impact. Despite rapid progress in acoustic modeling and deep learning, models still fall short of human expert performance. This manuscript provides a comprehensive analysis of the reasons behind this gap, emphasizing a conceptual divergence we term the "perceptual-statistical gap". We detail human expert perceptual processes, survey machine learning representations and methods, review existing literature on feature sets and modeling strategies, and present a theoretical analysis of limits imposed by label noise and inter-rater variability. We further outline practical strategies to narrow the gap: perceptually motivated features, self-supervised pretraining, ASR-informed objectives, multimodal fusion, human-in-the-loop training, and explainability methods. Finally, we propose experimental protocols and evaluation metrics aligned with clinical goals to guide future research toward clinically reliable and interpretable dysarthria assessment tools.
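The abstract's point about label noise and inter-rater variability implies a ceiling on achievable model performance: a model scored against a single rater cannot reliably agree with that rater more often than a second expert does. A minimal, self-contained sketch of measuring that agreement with Cohen's kappa (the rater names and severity ratings below are illustrative, not from the paper):

```python
from collections import Counter

def cohen_kappa(a, b):
    """Cohen's kappa: chance-corrected agreement between two label sequences."""
    assert len(a) == len(b)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    expected = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical severity ratings (0 = healthy .. 3 = severe) from two clinicians.
rater1 = [0, 1, 1, 2, 2, 3, 0, 1, 2, 3]
rater2 = [0, 1, 2, 2, 1, 3, 0, 1, 2, 2]

raw_agreement = sum(x == y for x, y in zip(rater1, rater2)) / len(rater1)
kappa = cohen_kappa(rater1, rater2)
print(f"raw agreement = {raw_agreement:.2f}, kappa = {kappa:.2f}")
# → raw agreement = 0.70, kappa = 0.59
```

Raw agreement of 0.70 between the two experts suggests that a model evaluated against either rater alone should not be expected to score much above that level, which is the sense in which label noise bounds measurable accuracy.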
Related papers
- Understanding Catastrophic Interference: On the Identifiability of Latent Representations [67.05452287233122]
Catastrophic interference, also known as catastrophic forgetting, is a fundamental challenge in machine learning. We propose a novel theoretical framework that formulates catastrophic interference as an identification problem. Our approach provides both theoretical guarantees and practical performance improvements across synthetic and benchmark datasets.
arXiv Detail & Related papers (2025-09-27T00:53:32Z) - Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning. This paper provides the first systematic review of this emerging field. We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z) - Predicting Treatment Response in Body Dysmorphic Disorder with Interpretable Machine Learning [0.4604003661048266]
Body Dysmorphic Disorder (BDD) is a highly prevalent and frequently underdiagnosed condition. We employ multiple machine learning approaches to predict treatment outcomes. Treatment credibility emerged as the most potent predictor.
arXiv Detail & Related papers (2025-03-13T17:39:10Z) - Uncertainty-aware abstention in medical diagnosis based on medical texts [87.88110503208016]
This study addresses the critical issue of reliability for AI-assisted medical diagnosis. We focus on the selective prediction approach, which allows the diagnosis system to abstain from providing a decision if it is not confident in the diagnosis. We introduce HUQ-2, a new state-of-the-art method for enhancing reliability in selective prediction tasks.
arXiv Detail & Related papers (2025-02-25T10:15:21Z) - A Review of Deep Learning Approaches for Non-Invasive Cognitive Impairment Detection [35.31259047578382]
This review paper explores recent advances in deep learning approaches for non-invasive cognitive impairment detection.
We examine various non-invasive indicators of cognitive decline, including speech and language, facial, and motoric mobility.
Despite significant progress, several challenges remain, including data standardization and accessibility, model explainability, longitudinal analysis limitations, and clinical adaptation.
arXiv Detail & Related papers (2024-10-25T17:44:59Z) - A Survey of Automatic Hallucination Evaluation on Natural Language Generation [21.37538215193138]
The rapid advancement of Large Language Models (LLMs) has brought a pressing challenge: how to reliably assess hallucinations to guarantee model trustworthiness. This survey addresses this critical gap through a systematic analysis of 105 evaluation methods, revealing that 77.1% specifically target LLMs. We formulate a structured framework to organize the field, based on a survey of foundational datasets and benchmarks and a taxonomy of evaluation methodologies.
arXiv Detail & Related papers (2024-04-18T09:52:18Z) - Standardizing Your Training Process for Human Activity Recognition Models: A Comprehensive Review in the Tunable Factors [4.199844472131922]
We provide an exhaustive review of contemporary deep learning research in the field of wearable human activity recognition (WHAR).
Our findings suggest that a major trend is the lack of detail provided by model training protocols.
With insights from the analyses, we define a novel integrated training procedure tailored to the WHAR model.
arXiv Detail & Related papers (2024-01-10T17:45:28Z) - A study on the impact of Self-Supervised Learning on automatic dysarthric speech assessment [6.284142286798582]
We show that HuBERT is the most versatile feature extractor across dysarthria classification, word recognition, and intelligibility classification, achieving respectively +24.7%, +61%, and +7.2% accuracy compared to classical acoustic features.
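The usual recipe behind such results is to freeze the self-supervised model, pool its frame-level features into a single utterance vector, and train a light classifier on top. The sketch below substitutes random vectors for real HuBERT features (extracting those would require a library such as torchaudio or transformers) and a nearest-centroid classifier for the paper's actual setup, so it illustrates only the pooling-plus-classifier pattern:

```python
import random

def mean_pool(frames):
    """Collapse variable-length frame features (T x D) into one D-dim vector."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]

def fit_centroids(utterances, labels):
    """Nearest-centroid classifier over pooled utterance embeddings."""
    by_class = {}
    for x, y in zip(utterances, labels):
        by_class.setdefault(y, []).append(x)
    return {y: mean_pool(xs) for y, xs in by_class.items()}

def predict(centroids, x):
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(centroids, key=lambda y: sq_dist(centroids[y], x))

random.seed(0)

def fake_ssl_frames(shift, n_frames=20, dim=4):
    """Stand-in for frozen SSL features: real code would run the model here."""
    return [[random.gauss(shift, 0.3) for _ in range(dim)] for _ in range(n_frames)]

train_x = [mean_pool(fake_ssl_frames(s)) for s in (0, 0, 1, 1)]
train_y = ["healthy", "healthy", "dysarthric", "dysarthric"]
centroids = fit_centroids(train_x, train_y)
test_vec = mean_pool(fake_ssl_frames(1))
print(predict(centroids, test_vec))  # prints "dysarthric"
```

Freezing the extractor and training only the small head is what makes this recipe practical on the small clinical datasets typical of dysarthria research.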
arXiv Detail & Related papers (2023-06-07T11:04:02Z) - Ontology-aware Learning and Evaluation for Audio Tagging [56.59107110017436]
The mean average precision (mAP) metric treats different kinds of sound as independent classes without considering their relations.
Ontology-aware mean average precision (OmAP) addresses the weaknesses of mAP by utilizing the AudioSet ontology information during the evaluation.
We conduct human evaluations and demonstrate that OmAP is more consistent with human perception than mAP.
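The intuition behind ontology-aware evaluation can be illustrated by scoring errors according to their distance in a class tree, so that confusing a dog bark with a cat meow costs less than confusing it with a siren. This is a simplified stand-in (a toy ontology rather than the AudioSet ontology, and a distance-based score rather than OmAP's exact definition):

```python
# Toy class tree: each node maps to its parent.
parent = {"dog_bark": "animal", "cat_meow": "animal",
          "siren": "vehicle", "car_horn": "vehicle",
          "animal": "root", "vehicle": "root", "root": None}

def ontology_distance(a, b):
    """Number of tree edges between two classes via their lowest common ancestor."""
    def ancestors(n):
        path = []
        while n is not None:
            path.append(n)
            n = parent[n]
        return path
    pa, pb = ancestors(a), ancestors(b)
    common = next(x for x in pa if x in pb)
    return pa.index(common) + pb.index(common)

def ontology_score(y_true, y_pred):
    """1 minus normalized ontology distance, averaged: near-misses score
    higher than distant confusions, unlike plain 0/1 accuracy."""
    max_d = 4  # leaf -> root -> leaf in this toy tree
    return sum(1 - ontology_distance(t, p) / max_d
               for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = ["dog_bark", "siren"]
y_pred = ["cat_meow", "siren"]  # one near-miss within "animal"
acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(acc, ontology_score(y_true, y_pred))  # → 0.5 0.75
```

Plain accuracy scores the near-miss as a full error (0.5 overall), while the ontology-aware score (0.75) credits it partially, which is the kind of behavior the human evaluations found more perceptually consistent.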
arXiv Detail & Related papers (2022-11-22T11:35:14Z) - Quantifying Explainability in NLP and Analyzing Algorithms for Performance-Explainability Tradeoff [0.0]
We explore the current art of explainability and interpretability within a case study in clinical text classification.
We demonstrate various visualization techniques for fully interpretable methods as well as model-agnostic post hoc attributions.
We introduce a framework through which practitioners and researchers can assess the frontier between a model's predictive performance and the quality of its available explanations.
arXiv Detail & Related papers (2021-07-12T19:07:24Z) - Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration [55.88616573143478]
Outcome prediction from clinical text can prevent doctors from overlooking possible risks.
Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay prediction are four common outcome prediction targets.
We propose clinical outcome pre-training to integrate knowledge about patient outcomes from multiple public sources.
arXiv Detail & Related papers (2021-02-08T10:26:44Z) - Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions [48.91284724066349]
Off-policy evaluation in reinforcement learning offers the chance of using observational data to improve future outcomes in domains such as healthcare and education.
Traditional measures such as confidence intervals may be insufficient due to noise, limited data and confounding.
We develop a method that could serve as a hybrid human-AI system, to enable human experts to analyze the validity of policy evaluation estimates.
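The core quantity such a hybrid human-AI system would surface is the importance weight each trajectory contributes to the off-policy estimate: trajectories with extreme weights dominate the result and are natural candidates for expert review. A minimal ordinary-importance-sampling sketch (the policies and trajectories below are invented, and this is a generic estimator rather than the paper's method):

```python
def is_estimate(trajectories, pi_e, pi_b):
    """Ordinary importance-sampling estimate of the evaluation policy's return,
    returning each trajectory's weight so outliers can be flagged for review."""
    weights, returns = [], []
    for traj in trajectories:
        w, g = 1.0, 0.0
        for (s, a, r) in traj:
            w *= pi_e[(s, a)] / pi_b[(s, a)]  # likelihood ratio per step
            g += r
        weights.append(w)
        returns.append(g)
    estimate = sum(w * g for w, g in zip(weights, returns)) / len(trajectories)
    return estimate, weights

# Toy one-state problem; action probabilities are illustrative only.
pi_b = {("s0", "a0"): 0.5, ("s0", "a1"): 0.5}   # behavior policy (logged data)
pi_e = {("s0", "a0"): 0.9, ("s0", "a1"): 0.1}   # evaluation policy
trajs = [[("s0", "a0", 1.0)], [("s0", "a0", 1.0)], [("s0", "a1", 0.0)]]

est, ws = is_estimate(trajs, pi_e, pi_b)
influential = max(range(len(ws)), key=lambda i: ws[i])
print(f"OPE estimate = {est:.2f}; most influential trajectory = {influential}")
# → OPE estimate = 1.20; most influential trajectory = 0
```

Showing a clinician the handful of highest-weight transitions, rather than only a confidence interval, is the kind of human-in-the-loop validity check the paper argues for.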
arXiv Detail & Related papers (2020-02-10T00:26:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.