Applications of Artificial Intelligence for Cross-language Intelligibility Assessment of Dysarthric Speech
- URL: http://arxiv.org/abs/2501.15858v4
- Date: Thu, 08 May 2025 13:22:39 GMT
- Title: Applications of Artificial Intelligence for Cross-language Intelligibility Assessment of Dysarthric Speech
- Authors: Eunjung Yeo, Julie Liss, Visar Berisha, David Mortensen
- Abstract summary: This commentary introduces a conceptual framework to advance cross-language intelligibility assessment of dysarthric speech. We propose a universal speech model that encodes dysarthric speech into acoustic-phonetic representations, followed by a language-specific intelligibility assessment model.
- Score: 13.475654818182988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Purpose: Speech intelligibility is a critical outcome in the assessment and management of dysarthria, yet most research and clinical practices have focused on English, limiting their applicability across languages. This commentary introduces a conceptual framework--and a demonstration of how it can be implemented--leveraging artificial intelligence (AI) to advance cross-language intelligibility assessment of dysarthric speech. Method: We propose a two-tiered conceptual framework consisting of a universal speech model that encodes dysarthric speech into acoustic-phonetic representations, followed by a language-specific intelligibility assessment model that interprets these representations within the phonological or prosodic structures of the target language. We further identify barriers to cross-language intelligibility assessment of dysarthric speech, including data scarcity, annotation complexity, and limited linguistic insights into dysarthric speech, and outline potential AI-driven solutions to overcome these challenges. Conclusion: Advancing cross-language intelligibility assessment of dysarthric speech necessitates models that are both efficient and scalable, yet constrained by linguistic rules to ensure accurate and language-sensitive assessment. Recent advances in AI provide the foundational tools to support this integration, shaping future directions toward generalizable and linguistically informed assessment frameworks.
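Below is a minimal sketch of how the two-tiered framework described in the abstract could be organized in code, assuming a PyTorch-style implementation: a shared, language-universal encoder produces acoustic-phonetic representations, and a per-language head maps them to an intelligibility score. All class names, layer choices, and parameters (e.g., UniversalSpeechEncoder, LanguageSpecificAssessor, n_mels, d_model) are illustrative assumptions, not the authors' actual models.

```python
# Hedged sketch of the proposed two-tiered framework (not the authors' implementation).
import torch
import torch.nn as nn


class UniversalSpeechEncoder(nn.Module):
    """Tier 1 (language-universal): maps mel-spectrogram frames to acoustic-phonetic features."""

    def __init__(self, n_mels: int = 80, d_model: int = 256):
        super().__init__()
        # Stand-in for a large pretrained speech model; a small conv front-end for illustration.
        self.frontend = nn.Sequential(
            nn.Conv1d(n_mels, d_model, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(d_model, d_model, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
        )

    def forward(self, mel: torch.Tensor) -> torch.Tensor:
        # mel: (batch, n_mels, frames) -> (batch, frames', d_model)
        return self.frontend(mel).transpose(1, 2)


class LanguageSpecificAssessor(nn.Module):
    """Tier 2 (language-specific): interprets the representations as an intelligibility score."""

    def __init__(self, d_model: int = 256):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool1d(1)
        self.head = nn.Sequential(nn.Linear(d_model, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        pooled = self.pool(feats.transpose(1, 2)).squeeze(-1)  # (batch, d_model)
        return self.head(pooled).squeeze(-1)                   # one score per utterance


class CrossLanguageIntelligibilityModel(nn.Module):
    """Shared universal encoder plus one assessment head per target language."""

    def __init__(self, languages: list[str], d_model: int = 256):
        super().__init__()
        self.encoder = UniversalSpeechEncoder(d_model=d_model)
        self.assessors = nn.ModuleDict(
            {lang: LanguageSpecificAssessor(d_model) for lang in languages}
        )

    def forward(self, mel: torch.Tensor, language: str) -> torch.Tensor:
        feats = self.encoder(mel)               # tier 1: acoustic-phonetic representations
        return self.assessors[language](feats)  # tier 2: language-specific scoring


if __name__ == "__main__":
    model = CrossLanguageIntelligibilityModel(["en", "ko"])
    dummy_mel = torch.randn(2, 80, 300)   # two utterances, 80 mel bins, 300 frames
    print(model(dummy_mel, "ko").shape)   # torch.Size([2])
```

The key design point the sketch illustrates is that only the assessment heads are language-specific; the encoder is shared, so data-rich languages can support assessment in data-scarce ones.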
Related papers
- CogBench: A Large Language Model Benchmark for Multilingual Speech-Based Cognitive Impairment Assessment [13.74065648648307]
We propose CogBench, the first benchmark designed to evaluate the cross-lingual and cross-site generalizability of large language models for speech-based cognitive impairment assessment. Our results show that conventional deep learning models degrade substantially when transferred across domains. Our findings offer a critical step toward building clinically useful and linguistically robust speech-based cognitive assessment tools.
arXiv Detail & Related papers (2025-08-05T12:06:16Z) - Machine-Facing English: Defining a Hybrid Register Shaped by Human-AI Discourse [3.665768771606006]
Machine-Facing English (MFE) is an emergent register shaped by the adaptation of everyday language to the expanding presence of AI interlocutors. This study traces how sustained human-AI interaction normalizes syntactic rigidity, pragmatic simplification, and hyper-explicit phrasing.
arXiv Detail & Related papers (2025-05-29T03:22:39Z) - Exploring Generative Error Correction for Dysarthric Speech Recognition [12.584296717901116]
We propose a two-stage framework for the Speech Accessibility Project Challenge at INTERSPEECH 2025. We assess different configurations of model scales and training strategies, incorporating specific hypothesis selection to improve transcription accuracy. We provide insights into the complementary roles of acoustic and linguistic modeling in dysarthric speech recognition.
arXiv Detail & Related papers (2025-05-26T16:06:31Z) - Speech-IFEval: Evaluating Instruction-Following and Quantifying Catastrophic Forgetting in Speech-Aware Language Models [49.1574468325115]
Recent SLMs integrate speech perception with large language models (LLMs), often degrading textual capabilities due to speech-centric training. We introduce Speech-IFEval, an evaluation framework designed to assess the instruction-following capabilities of these models. Our findings show that most SLMs struggle with even basic instructions, performing far worse than text-based LLMs.
arXiv Detail & Related papers (2025-05-25T08:37:55Z) - Inclusivity of AI Speech in Healthcare: A Decade Look Back [0.0]
The integration of AI speech recognition technologies into healthcare has the potential to revolutionize clinical and patient-provider communication. However, this study reveals significant gaps in inclusivity, with datasets and research disproportionately favouring high-resource languages, standardized accents, and narrow demographic groups. This paper highlights the urgent need for inclusive dataset design, bias mitigation research, and policy frameworks to ensure equitable access to AI speech technologies in healthcare.
arXiv Detail & Related papers (2025-05-15T10:03:05Z) - Building A Unified AI-centric Language System: analysis, framework and future work [0.0]
This paper explores the design of a unified AI-centric language system.
We propose a framework that translates diverse natural language inputs into a streamlined AI-friendly language.
arXiv Detail & Related papers (2025-02-06T20:32:57Z) - IOLBENCH: Benchmarking LLMs on Linguistic Reasoning [8.20398036986024]
We introduce IOLBENCH, a novel benchmark derived from International Linguistics Olympiad (IOL) problems. This dataset encompasses diverse problems testing syntax, morphology, phonology, and semantics. We find that even the most advanced models struggle to handle the intricacies of linguistic complexity.
arXiv Detail & Related papers (2025-01-08T03:15:10Z) - Devising a Set of Compact and Explainable Spoken Language Feature for Screening Alzheimer's Disease [52.46922921214341]
Alzheimer's disease (AD) has become one of the most significant health challenges in an aging society.
We devised an explainable and effective feature set that leverages the visual capabilities of a large language model (LLM) and the Term Frequency-Inverse Document Frequency (TF-IDF) model.
Our new features can be explained and interpreted step by step, which enhances the interpretability of automatic AD screening.
arXiv Detail & Related papers (2024-11-28T05:23:22Z) - A Tutorial on Clinical Speech AI Development: From Data Collection to Model Validation [19.367198670893778]
This tutorial paper provides an overview of the key components required for robust development of clinical speech AI.
The goal is to provide comprehensive guidance on building models whose inputs and outputs link to the more interpretable and clinically meaningful aspects of speech.
arXiv Detail & Related papers (2024-10-29T00:58:15Z) - A Survey on Lexical Ambiguity Detection and Word Sense Disambiguation [0.0]
This paper explores techniques that focus on understanding and resolving ambiguity in language within the field of natural language processing (NLP).
It outlines diverse approaches ranging from deep learning techniques to leveraging lexical resources and knowledge graphs like WordNet.
The research identifies persistent challenges in the field, such as the scarcity of sense annotated corpora and the complexity of informal clinical texts.
arXiv Detail & Related papers (2024-03-24T12:58:48Z) - Igniting Language Intelligence: The Hitchhiker's Guide From Chain-of-Thought Reasoning to Language Agents [80.5213198675411]
Large language models (LLMs) have dramatically enhanced the field of language intelligence.
LLMs leverage the intriguing chain-of-thought (CoT) reasoning techniques, obliging them to formulate intermediate steps en route to deriving an answer.
Recent research endeavors have extended CoT reasoning methodologies to nurture the development of autonomous language agents.
arXiv Detail & Related papers (2023-11-20T14:30:55Z) - Are Structural Concepts Universal in Transformer Language Models? Towards Interpretable Cross-Lingual Generalization [27.368684663279463]
We investigate the potential for explicitly aligning conceptual correspondence between languages to enhance cross-lingual generalization.
Using the syntactic aspect of language as a testbed, our analyses of 43 languages reveal a high degree of alignability.
We propose a meta-learning-based method to learn to align conceptual spaces of different languages.
arXiv Detail & Related papers (2023-10-19T14:50:51Z) - Rethinking the Evaluating Framework for Natural Language Understanding in AI Systems: Language Acquisition as a Core for Future Metrics [0.0]
In the burgeoning field of artificial intelligence (AI), the unprecedented progress of large language models (LLMs) in natural language processing (NLP) offers an opportunity to revisit the entire approach of traditional metrics of machine intelligence.
Our paper proposes a paradigm shift from the established Turing Test towards an all-embracing framework that hinges on language acquisition.
arXiv Detail & Related papers (2023-09-21T11:34:52Z) - Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling [70.23876429382969]
We propose a benchmark that can evaluate intra-sentence discourse properties across a diverse set of NLP tasks.
Disco-Bench consists of 9 document-level testsets in the literature domain, which contain rich discourse phenomena.
For linguistic analysis, we also design a diagnostic test suite that can examine whether the target models learn discourse knowledge.
arXiv Detail & Related papers (2023-07-16T15:18:25Z) - BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models [56.93604813379634]
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
We propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels.
We highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.
arXiv Detail & Related papers (2023-06-02T12:54:38Z) - SHINE: Syntax-augmented Hierarchical Interactive Encoder for Zero-shot Cross-lingual Information Extraction [47.88887327545667]
In this study, a syntax-augmented hierarchical interactive encoder (SHINE) is proposed to transfer cross-lingual IE knowledge.
SHINE is capable of interactively capturing complementary information between features and contextual information.
Experiments across seven languages on three IE tasks and four benchmarks verify the effectiveness and generalization ability of the proposed method.
arXiv Detail & Related papers (2023-05-21T08:02:06Z) - AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples [51.048234591165155]
We present AM2iCo, Adversarial and Multilingual Meaning in Context.
It aims to faithfully assess the ability of state-of-the-art (SotA) representation models to understand the identity of word meaning in cross-lingual contexts.
Results reveal that current SotA pretrained encoders substantially lag behind human performance.
arXiv Detail & Related papers (2021-04-17T20:23:45Z) - Benchmarking Automated Clinical Language Simplification: Dataset, Algorithm, and Evaluation [48.87254340298189]
We construct a new dataset named MedLane to support the development and evaluation of automated clinical language simplification approaches.
We propose a new model called DECLARE that follows the human annotation procedure and achieves state-of-the-art performance.
arXiv Detail & Related papers (2020-12-04T06:09:02Z) - Distributed Linguistic Representations in Decision Making: Taxonomy, Key Elements and Applications, and Challenges in Data Science and Explainable Artificial Intelligence [26.908909011805502]
We present the taxonomy of existing distributed linguistic representations.
We review the key elements of distributed linguistic information processing in decision making.
Next, we provide a discussion on ongoing challenges and future research directions from the perspective of data science and explainable artificial intelligence.
arXiv Detail & Related papers (2020-08-04T13:13:59Z) - Semantics-Aware Inferential Network for Natural Language Understanding [79.70497178043368]
We propose a Semantics-Aware Inferential Network (SAIN) to meet such a motivation.
Taking explicit contextualized semantics as a complementary input, the inferential module of SAIN enables a series of reasoning steps over semantic clues.
Our model achieves significant improvement on 11 tasks including machine reading comprehension and natural language inference.
arXiv Detail & Related papers (2020-04-28T07:24:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.