Automatically assessing oral narratives of Afrikaans and isiXhosa children
- URL: http://arxiv.org/abs/2507.13205v2
- Date: Fri, 18 Jul 2025 07:49:41 GMT
- Title: Automatically assessing oral narratives of Afrikaans and isiXhosa children
- Authors: Retief Louw, Emma Sharratt, Febe de Wet, Christiaan Jacobs, Annelien Smith, Herman Kamper
- Abstract summary: We present a system for automatically assessing oral narratives of preschool children in Afrikaans and isiXhosa. The system uses automatic speech recognition followed by a machine learning scoring model to predict narrative and comprehension scores.
- Score: 15.669164862460342
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Developing narrative and comprehension skills in early childhood is critical for later literacy. However, teachers in large preschool classrooms struggle to accurately identify students who require intervention. We present a system for automatically assessing oral narratives of preschool children in Afrikaans and isiXhosa. The system uses automatic speech recognition followed by a machine learning scoring model to predict narrative and comprehension scores. For scoring predicted transcripts, we compare a linear model to a large language model (LLM). The LLM-based system outperforms the linear model in most cases, but the linear system is competitive despite its simplicity. The LLM-based system is comparable to a human expert in flagging children who require intervention. We lay the foundation for automatic oral assessments in classrooms, giving teachers extra capacity to focus on personalised support for children's learning.
Related papers
- Feature-based analysis of oral narratives from Afrikaans and isiXhosa children [14.74555743937968]
We analyse recorded stories from four- and five-year-old Afrikaans- and isiXhosa-speaking children. We identify lexical diversity (unique words) and length-based features (mean utterance length) as indicators of typical development. The use of specific verbs and auxiliaries associated with goal-directed storytelling is correlated with a reduced likelihood of requiring intervention.
arXiv Detail & Related papers (2025-07-17T14:31:32Z)
- Our Coding Adventure: Using LLMs to Personalise the Narrative of a Tangible Programming Robot for Preschoolers [0.5099081649205313]
We develop an early version of a formalised process to rapidly prototype game stories for Cubetto. We document on one hand the process, the used materials and prompts, and on the other the learning experience and outcomes. We believe our method is adequate for preschool classes and we are planning to further experiment in real-world educational settings.
arXiv Detail & Related papers (2025-06-26T03:54:25Z)
- An End-to-End Approach for Child Reading Assessment in the Xhosa Language [0.3579433677269426]
This study focuses on Xhosa, a language spoken in South Africa, to advance child speech recognition capabilities. We present a novel dataset composed of child speech samples in Xhosa. The results indicate that the performance of these models can be significantly influenced by the amount and balancing of the available training data.
arXiv Detail & Related papers (2025-05-23T00:59:58Z)
- SCOPE: A Self-supervised Framework for Improving Faithfulness in Conditional Text Generation [55.61004653386632]
Large Language Models (LLMs) often produce hallucinations, i.e., information that is unfaithful or not grounded in the input context. This paper introduces a novel self-supervised method for generating a training set of unfaithful samples. We then refine the model using a training process that encourages the generation of grounded outputs over unfaithful ones.
arXiv Detail & Related papers (2025-02-19T12:31:58Z)
- Speech Recognition for Automatically Assessing Afrikaans and isiXhosa Preschool Oral Narratives [15.669164862460342]
We develop automatic speech recognition systems for stories told by Afrikaans and isiXhosa preschool children. We consider a range of prior child-speech ASR strategies to determine which is best suited to this unique setting.
arXiv Detail & Related papers (2025-01-11T08:11:09Z)
- Developmental Predictive Coding Model for Early Infancy Mono and Bilingual Vocal Continual Learning [69.8008228833895]
We propose a small-sized generative neural network equipped with a continual learning mechanism. Our model prioritizes interpretability and demonstrates the advantages of online learning.
arXiv Detail & Related papers (2024-12-23T10:23:47Z)
- Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks [68.33068005789116]
We introduce ReDial, a benchmark containing 1.2K+ parallel query pairs in Standardized English and AAVE. We evaluate widely used models, including GPT, Claude, Llama, Mistral, and the Phi model families. Our work establishes a systematic and objective framework for analyzing LLM bias in dialectal queries.
arXiv Detail & Related papers (2024-10-14T18:44:23Z)
- CourseAssist: Pedagogically Appropriate AI Tutor for Computer Science Education [1.052788652996288]
This poster introduces CourseAssist, a novel LLM-based tutoring system tailored for computer science education.
Unlike generic LLM systems, CourseAssist uses retrieval-augmented generation, user intent classification, and question decomposition to align AI responses with specific course materials and learning objectives.
arXiv Detail & Related papers (2024-05-01T20:43:06Z)
- Scaffolding Language Learning via Multi-modal Tutoring Systems with Pedagogical Instructions [34.760230622675365]
Intelligent tutoring systems (ITSs) imitate human tutors and aim to provide customized instructions or feedback to learners.
With the emergence of generative artificial intelligence, large language models (LLMs) entitle the systems to complex and coherent conversational interactions.
We investigate how pedagogical instructions facilitate the scaffolding in ITSs, by conducting a case study on guiding children to describe images for language learning.
arXiv Detail & Related papers (2024-04-04T13:22:28Z)
- User Adaptive Language Learning Chatbots with a Curriculum [55.63893493019025]
We adapt lexically constrained decoding to a dialog system, which urges the dialog system to include curriculum-aligned words and phrases in its generated utterances.
The evaluation result demonstrates that the dialog system with curriculum infusion improves students' understanding of target words and increases their interest in practicing English.
arXiv Detail & Related papers (2023-04-11T20:41:41Z)
- Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension [136.82507046638784]
We introduce FairytaleQA, a dataset focusing on narrative comprehension of kindergarten to eighth-grade students.
FairytaleQA consists of 10,580 explicit and implicit questions derived from 278 children-friendly stories.
arXiv Detail & Related papers (2022-03-26T00:20:05Z)
- My Teacher Thinks The World Is Flat! Interpreting Automatic Essay Scoring Mechanism [71.34160809068996]
Recent work shows that automated scoring systems are prone to even common-sense adversarial samples.
We utilize recent advances in interpretability to find the extent to which features such as coherence, content and relevance are important for automated scoring mechanisms.
We also find that since the models are not semantically grounded with world-knowledge and common sense, adding false facts such as "the world is flat" actually increases the score instead of decreasing it.
arXiv Detail & Related papers (2020-12-27T06:19:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.