Related papers: Revisiting Rule-Based Stuttering Detection: A Comprehensive Analysis of Interpretable Models for Clinical Applications

Revisiting Rule-Based Stuttering Detection: A Comprehensive Analysis of Interpretable Models for Clinical Applications

URL: http://arxiv.org/abs/2508.16681v1
Date: Thu, 21 Aug 2025 15:01:05 GMT
Title: Revisiting Rule-Based Stuttering Detection: A Comprehensive Analysis of Interpretable Models for Clinical Applications
Authors: Eric Zhang,
Abstract summary: This paper presents a comprehensive analysis of rule-based stuttering detection systems.<n>We propose an enhanced rule-based framework that incorporates speaking-rate normalization, multi-level acoustic feature analysis, and hierarchical decision structures.<n>We demonstrate that rule-based systems excel particularly in prolongation detection (97-99% accuracy) and provide stable performance across varying speaking rates.
Score: 5.692357910541593
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Stuttering affects approximately 1% of the global population, impacting communication and quality of life. While recent advances in deep learning have pushed the boundaries of automatic speech dysfluency detection, rule-based approaches remain crucial for clinical applications where interpretability and transparency are paramount. This paper presents a comprehensive analysis of rule-based stuttering detection systems, synthesizing insights from multiple corpora including UCLASS, FluencyBank, and SEP-28k. We propose an enhanced rule-based framework that incorporates speaking-rate normalization, multi-level acoustic feature analysis, and hierarchical decision structures. Our approach achieves competitive performance while maintaining complete interpretability-critical for clinical adoption. We demonstrate that rule-based systems excel particularly in prolongation detection (97-99% accuracy) and provide stable performance across varying speaking rates. Furthermore, we show how these interpretable models can be integrated with modern machine learning pipelines as proposal generators or constraint modules, bridging the gap between traditional speech pathology practices and contemporary AI systems. Our analysis reveals that while neural approaches may achieve marginally higher accuracy in unconstrained settings, rule-based methods offer unique advantages in clinical contexts where decision auditability, patient-specific tuning, and real-time feedback are essential.

Related papers

ClinDEF: A Dynamic Evaluation Framework for Large Language Models in Clinical Reasoning [58.01333341218153]
We propose ClinDEF, a dynamic framework for assessing clinical reasoning in LLMs through simulated diagnostic dialogues.<n>Our method generates patient cases and facilitates multi-turn interactions between an LLM-based doctor and an automated patient agent.<n>Experiments show that ClinDEF effectively exposes critical clinical reasoning gaps in state-of-the-art LLMs.
arXiv Detail & Related papers (2025-12-29T12:58:58Z)
Adaptive Diagnostic Reasoning Framework for Pathology with Multimodal Large Language Models [34.28963665009494]
We present RECAP-PATH, an interpretable framework that establishes a self-learning paradigm.<n>It shifts off-the-shelf multimodal large language models from passive pattern recognition to evidence-linked diagnostic reasoning.<n>This self-learning approach requires only small labeled sets and no white-box access or weight updates to generate cancer diagnoses.
arXiv Detail & Related papers (2025-11-15T03:06:59Z)
Timely Clinical Diagnosis through Active Test Selection [49.091903570068155]
We propose ACTMED (Adaptive Clinical Test selection via Model-based Experimental Design) to better emulate real-world diagnostic reasoning.<n>LLMs act as flexible simulators, generating plausible patient state distributions and supporting belief updates without requiring structured, task-specific training data.<n>We evaluate ACTMED on real-world datasets and show it can optimize test selection to improve diagnostic accuracy, interpretability, and resource use.
arXiv Detail & Related papers (2025-10-21T18:10:45Z)
Ensemble Deep Learning and LLM-Assisted Reporting for Automated Skin Lesion Diagnosis [2.9307254086347427]
We introduce a unified framework that reimagines AI integration for dermatological diagnostics.<n>First, a purposefully heterogeneous ensemble of architecturally diverse convolutional neural networks provides complementary diagnostic perspectives.<n>Second, we embed large language model capabilities directly into the diagnostic workflow, transforming classification outputs into clinically meaningful assessments.
arXiv Detail & Related papers (2025-10-05T08:07:33Z)
Multimodal Attention-Aware Fusion for Diagnosing Distal Myopathy: Evaluating Model Interpretability and Clinician Trust [19.107204920543676]
Distal myopathy represents a heterogeneous group of skeletal muscle disorders with broad clinical manifestations.<n>We propose a novel multimodal attention-aware fusion architecture that combines features extracted from two distinct deep learning models.<n>Our approach integrates these features through an attention gate mechanism, enhancing both predictive performance and interpretability.
arXiv Detail & Related papers (2025-08-02T11:08:55Z)
Medical Reasoning in the Era of LLMs: A Systematic Review of Enhancement Techniques and Applications [59.721265428780946]
Large Language Models (LLMs) in medicine have enabled impressive capabilities, yet a critical gap remains in their ability to perform systematic, transparent, and verifiable reasoning.<n>This paper provides the first systematic review of this emerging field.<n>We propose a taxonomy of reasoning enhancement techniques, categorized into training-time strategies and test-time mechanisms.
arXiv Detail & Related papers (2025-08-01T14:41:31Z)
Leveraging Embedding Techniques in Multimodal Machine Learning for Mental Illness Assessment [0.8458496687170665]
The increasing global prevalence of mental disorders, such as depression and PTSD, requires objective and scalable diagnostic tools.<n>This paper investigates the potential of multimodal machine learning to address these challenges, leveraging the complementary information available in text, audio, and video data.<n>We explore data-level, feature-level, and decision-level fusion techniques, including a novel integration of Large Language Model predictions.
arXiv Detail & Related papers (2025-04-02T14:19:06Z)
Systematic Literature Review on Clinical Trial Eligibility Matching [0.24554686192257422]
Review highlights how explainable AI and standardized ontology can bolster clinician trust and broaden adoption.<n>Further research into advanced semantic and temporal representations, expanded data integration, and rigorous prospective evaluations is necessary to fully realize the transformative potential of NLP in clinical trial recruitment.
arXiv Detail & Related papers (2025-03-02T11:45:50Z)
Doctor-in-the-Loop: An Explainable, Multi-View Deep Learning Framework for Predicting Pathological Response in Non-Small Cell Lung Cancer [0.6800826356148091]
Non-small cell lung cancer (NSCLC) remains a major global health challenge.<n>We propose Doctor-in-the-Loop, a novel framework that integrates expert-driven domain knowledge with explainable artificial intelligence techniques.<n>Our approach employs a gradual multi-view strategy, progressively refining the model's focus from broad contextual features to finer, lesion-specific details.
arXiv Detail & Related papers (2025-02-21T16:35:30Z)
Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategical Conversations [74.83732294523402]
We introduce a novel benchmark that simulates real-world diagnostic scenarios, integrating noise and difficulty levels aligned with USMLE standards.<n>We also explore dialogue-based fine-tuning, which transforms static datasets into conversational formats to better capture iterative reasoning processes.<n>Experiments show that dialogue-tuned models outperform traditional methods, with improvements of $9.64%$ in multi-round reasoning scenarios and $6.18%$ in accuracy in a noisy environment.
arXiv Detail & Related papers (2025-01-29T18:58:48Z)
SemioLLM: Evaluating Large Language Models for Diagnostic Reasoning from Unstructured Clinical Narratives in Epilepsy [45.2233252981348]
Large Language Models (LLMs) have been shown to encode clinical knowledge.<n>We present SemioLLM, an evaluation framework that benchmarks 6 state-of-the-art models.<n>We show that most LLMs are able to accurately and confidently generate probabilistic predictions of seizure onset zones in the brain.
arXiv Detail & Related papers (2024-07-03T11:02:12Z)
Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges [58.32937972322058]
"Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image (MedAI 2021)" competitions. We present a comprehensive summary and analyze each contribution, highlight the strength of the best-performing methods, and discuss the possibility of clinical translations of such methods into the clinic.
arXiv Detail & Related papers (2023-07-30T16:08:45Z)
The Medkit-Learn(ing) Environment: Medical Decision Modelling through Simulation [81.72197368690031]
We present a new benchmarking suite designed specifically for medical sequential decision making. The Medkit-Learn(ing) Environment is a publicly available Python package providing simple and easy access to high-fidelity synthetic medical data.
arXiv Detail & Related papers (2021-06-08T10:38:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.