Understanding Spoken Language Development of Children with ASD Using
Pre-trained Speech Embeddings
- URL: http://arxiv.org/abs/2305.14117v2
- Date: Wed, 31 May 2023 22:32:33 GMT
- Title: Understanding Spoken Language Development of Children with ASD Using
Pre-trained Speech Embeddings
- Authors: Anfeng Xu, Rajat Hebbar, Rimita Lahiri, Tiantian Feng, Lindsay Butler,
Lue Shen, Helen Tager-Flusberg, Shrikanth Narayanan
- Abstract summary: Natural Language Sample (NLS) analysis has gained attention as a promising complement to traditional methods.
This paper proposes applications of speech processing technologies in support of automated assessment of children's spoken language development.
- Score: 26.703275678213135
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Speech processing techniques are useful for analyzing speech and language
development in children with Autism Spectrum Disorder (ASD), who are often
varied and delayed in acquiring these skills. Early identification and
intervention are crucial, but traditional assessment methodologies such as
caregiver reports are not adequate for the requisite behavioral phenotyping.
Natural Language Sample (NLS) analysis has gained attention as a promising
complement. Researchers have developed benchmarks for spoken language
capabilities in children with ASD, obtainable through the analysis of NLS. This
paper proposes applications of speech processing technologies in support of
automated assessment of children's spoken language development by
classification between child and adult speech and between speech and nonverbal
vocalization in NLS, with respective F1 macro scores of 82.6% and 67.8%,
underscoring the potential for accurate and scalable tools for ASD research and
clinical use.
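For readers unfamiliar with the metric, the macro F1 score quoted in the abstract is the unweighted mean of the per-class F1 scores, so each class counts equally regardless of how many samples it has. The following is a minimal Python sketch of that computation. The function name `macro_f1` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical toy labels, for illustration only
reference = ["child", "child", "child", "adult"]
predicted = ["child", "child", "adult", "adult"]
score = macro_f1(reference, predicted)
print(f"macro F1 = {score:.3f}")
```

Because every class contributes equally, a classifier that does poorly on a rare class (for example, nonverbal vocalizations) is penalized more heavily than it would be under plain accuracy.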
Related papers
- Automatic Screening for Children with Speech Disorder using Automatic Speech Recognition: Opportunities and Challenges [15.727507607538874]
Speech and language assessments (SLA) have been conducted by skilled speech-language pathologists (SLPs).
There is a growing need for efficient and scalable SLA methods powered by artificial intelligence.
arXiv Detail & Related papers (2024-10-07T20:14:37Z)
- Personalized Speech Recognition for Children with Test-Time Adaptation [21.882608966462932]
Off-the-shelf automatic speech recognition (ASR) models primarily pre-trained on adult data tend to generalize poorly to children's speech.
We devised a novel ASR pipeline to apply unsupervised test-time adaptation (TTA) methods for child speech recognition.
Our results show that ASR models adapted with TTA methods significantly outperform the unadapted off-the-shelf ASR baselines both on average and statistically across individual child speakers.
arXiv Detail & Related papers (2024-09-19T21:40:07Z)
- Developing an End-to-End Framework for Predicting the Social Communication Severity Scores of Children with Autism Spectrum Disorder [6.197934754799159]
This paper proposes an end-to-end framework for automatically predicting the social communication severity of children with ASD from raw speech data.
Achieving a Pearson Correlation Coefficient of 0.6566 with human-rated scores, the proposed method showcases its potential as an accessible and objective tool for the assessment of ASD.
arXiv Detail & Related papers (2024-08-30T14:43:58Z)
- Age-Dependent Analysis and Stochastic Generation of Child-Directed Speech [10.369750912567714]
We present an approach to model age-dependent linguistic properties of child-directed speech (CDS) using a language model (LM) trained on CDS transcripts and ages of the recipient children.
We compare characteristics of the generated CDS against the real speech addressed at children of different ages, showing that the LM manages to capture age-dependent changes in CDS.
arXiv Detail & Related papers (2024-05-13T12:35:10Z)
- Speech Corpus for Korean Children with Autism Spectrum Disorder: Towards Automatic Assessment Systems [7.153773998764661]
This paper introduces a speech corpus specifically designed for Korean children with ASD.
Three speech and language pathologists rated recordings for social communication severity (SCS) and pronunciation proficiency (PP) using a 3-point Likert scale.
The paper also analyzes acoustic and linguistic features extracted from speech data that were collected and fully annotated for 73 children with ASD and 9 typically developing (TD) children.
arXiv Detail & Related papers (2024-02-23T07:32:54Z)
- BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models [56.93604813379634]
Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels.
We propose a language-acquisition-friendly benchmark to probe spoken language models at the lexical and syntactic levels.
We highlight two exciting challenges that need to be addressed for further progress: bridging the gap between text and speech and between clean speech and in-the-wild speech.
arXiv Detail & Related papers (2023-06-02T12:54:38Z)
- Analysing the Impact of Audio Quality on the Use of Naturalistic Long-Form Recordings for Infant-Directed Speech Research [62.997667081978825]
Modelling of early language acquisition aims to understand how infants bootstrap their language skills.
Recent developments have enabled the use of more naturalistic training data for computational models.
It is currently unclear how the sound quality could affect analyses and modelling experiments conducted on such data.
arXiv Detail & Related papers (2023-05-03T08:25:37Z)
- Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z)
- HASA-net: A non-intrusive hearing-aid speech assessment network [52.83357278948373]
We propose a DNN-based hearing aid speech assessment network (HASA-Net) to predict speech quality and intelligibility scores simultaneously.
To the best of our knowledge, HASA-Net is the first work to incorporate quality and intelligibility assessments utilizing a unified DNN-based non-intrusive model for hearing aids.
Experimental results show that the predicted speech quality and intelligibility scores of HASA-Net are highly correlated to two well-known intrusive hearing-aid evaluation metrics.
arXiv Detail & Related papers (2021-11-10T14:10:13Z)
- Leveraging Pre-trained Language Model for Speech Sentiment Analysis [58.78839114092951]
We explore the use of pre-trained language models to learn sentiment information of written texts for speech sentiment analysis.
We propose a pseudo label-based semi-supervised training strategy using a language model on an end-to-end speech sentiment approach.
arXiv Detail & Related papers (2021-06-11T20:15:21Z)
- NUVA: A Naming Utterance Verifier for Aphasia Treatment [49.114436579008476]
Assessment of speech performance using picture naming tasks is a key method for both diagnosis and monitoring of responses to treatment interventions by people with aphasia (PWA).
Here we present NUVA, an utterance verification system incorporating a deep learning element that classifies 'correct' versus 'incorrect' naming attempts from aphasic stroke patients.
When tested on eight native British-English speaking PWA, the system's performance accuracy ranged from 83.6% to 93.6%, with a 10-fold cross-validation mean of 89.5%.
arXiv Detail & Related papers (2021-02-10T13:00:29Z)