Deep Learning for Assessment of Oral Reading Fluency
- URL: http://arxiv.org/abs/2405.19426v2
- Date: Sat, 1 Jun 2024 14:16:42 GMT
- Title: Deep Learning for Assessment of Oral Reading Fluency
- Authors: Mithilesh Vaidya, Binaya Kumar Sahoo, Preeti Rao,
- Abstract summary: This work investigates end-to-end modeling on a training dataset of children's audio recordings of story texts labeled by human experts.
We report the performance of a number of system variations on the relevant measures, and probe the learned embeddings for lexical and acoustic-prosodic features known to be important to the perception of reading fluency.
- Score: 5.707725771108279
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reading fluency assessment is a critical component of literacy programmes, serving to guide and monitor early education interventions. Given the resource intensive nature of the exercise when conducted by teachers, the development of automatic tools that can operate on audio recordings of oral reading is attractive as an objective and highly scalable solution. Multiple complex aspects such as accuracy, rate and expressiveness underlie human judgements of reading fluency. In this work, we investigate end-to-end modeling on a training dataset of children's audio recordings of story texts labeled by human experts. The pre-trained wav2vec2.0 model is adopted due its potential to alleviate the challenges from the limited amount of labeled data. We report the performance of a number of system variations on the relevant measures, and also probe the learned embeddings for lexical and acoustic-prosodic features known to be important to the perception of reading fluency.
Related papers
- Multi-Modal Self-Supervised Learning for Surgical Feedback Effectiveness Assessment [66.6041949490137]
We propose a method that integrates information from transcribed verbal feedback and corresponding surgical video to predict feedback effectiveness.
Our findings show that both transcribed feedback and surgical video are individually predictive of trainee behavior changes.
Our results demonstrate the potential of multi-modal learning to advance the automated assessment of surgical feedback.
arXiv Detail & Related papers (2024-11-17T00:13:00Z) - The Promises and Pitfalls of Using Language Models to Measure Instruction Quality in Education [3.967610895056427]
This paper presents the first study that leverages Natural Language Processing (NLP) techniques to assess multiple high-inference instructional practices.
We confront two challenges inherent in NLP-based instructional analysis, including noisy and long input data and highly skewed distributions of human ratings.
arXiv Detail & Related papers (2024-04-03T04:15:29Z) - Blending Reward Functions via Few Expert Demonstrations for Faithful and
Accurate Knowledge-Grounded Dialogue Generation [22.38338205905379]
We leverage reinforcement learning algorithms to overcome the above challenges by introducing a novel reward function.
Our reward function combines an accuracy metric and a faithfulness metric to provide a balanced quality judgment of generated responses.
arXiv Detail & Related papers (2023-11-02T02:42:41Z) - A Survey of the Impact of Self-Supervised Pretraining for Diagnostic
Tasks with Radiological Images [71.26717896083433]
Self-supervised pretraining has been observed to be effective at improving feature representations for transfer learning.
This review summarizes recent research into its usage in X-ray, computed tomography, magnetic resonance, and ultrasound imaging.
arXiv Detail & Related papers (2023-09-05T19:45:09Z) - Phonetic and Prosody-aware Self-supervised Learning Approach for
Non-native Fluency Scoring [13.817385516193445]
Speech fluency/disfluency can be evaluated by analyzing a range of phonetic and prosodic features.
Deep neural networks are commonly trained to map fluency-related features into the human scores.
We introduce a self-supervised learning (SSL) approach that takes into account phonetic and prosody awareness for fluency scoring.
arXiv Detail & Related papers (2023-05-19T05:39:41Z) - A Matter of Annotation: An Empirical Study on In Situ and Self-Recall Activity Annotations from Wearable Sensors [56.554277096170246]
We present an empirical study that evaluates and contrasts four commonly employed annotation methods in user studies focused on in-the-wild data collection.
For both the user-driven, in situ annotations, where participants annotate their activities during the actual recording process, and the recall methods, where participants retrospectively annotate their data at the end of each day, the participants had the flexibility to select their own set of activity classes and corresponding labels.
arXiv Detail & Related papers (2023-05-15T16:02:56Z) - Leveraging Pretrained Representations with Task-related Keywords for
Alzheimer's Disease Detection [69.53626024091076]
Alzheimer's disease (AD) is particularly prominent in older adults.
Recent advances in pre-trained models motivate AD detection modeling to shift from low-level features to high-level representations.
This paper presents several efficient methods to extract better AD-related cues from high-level acoustic and linguistic features.
arXiv Detail & Related papers (2023-03-14T16:03:28Z) - Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats in two categories, i.e. healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z) - Deep Learning for Prominence Detection in Children's Read Speech [13.041607703862724]
We consider a labeled dataset of children's reading recordings for the speaker-independent detection of prominent words.
A previous well-tuned random forest ensemble predictor is replaced by an RNN sequence to exploit potential context dependency.
Deep learning is applied to obtain word-level features from low-level acoustic contours of fundamental frequency, intensity and spectral shape.
arXiv Detail & Related papers (2021-04-12T14:15:08Z) - Confident Coreset for Active Learning in Medical Image Analysis [57.436224561482966]
We propose a novel active learning method, confident coreset, which considers both uncertainty and distribution for effectively selecting informative samples.
By comparative experiments on two medical image analysis tasks, we show that our method outperforms other active learning methods.
arXiv Detail & Related papers (2020-04-05T13:46:16Z) - A Neural Approach to Ordinal Regression for the Preventive Assessment of
Developmental Dyslexia [0.0]
Developmental Dyslexia (DD) is a learning disability related to the acquisition of reading skills that affects about 5% of the population.
We propose a new methodology to assess the risk of DD before students learn to read.
arXiv Detail & Related papers (2020-02-06T10:08:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.