Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home
- URL: http://arxiv.org/abs/2306.00482v1
- Date: Thu, 1 Jun 2023 09:31:57 GMT
- Title: Inspecting Spoken Language Understanding from Kids for Basic Math Learning at Home
- Authors: Eda Okur, Roddy Fuentes Alba, Saurav Sahay, Lama Nachman
- Abstract summary: This work explores the Spoken Language Understanding (SLU) pipeline within a task-oriented dialogue system developed for Kid Space. Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU) components are evaluated on our home deployment data.
- Score: 8.819665252533104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Enriching the quality of early childhood education with interactive at-home math learning systems, empowered by recent advances in conversational AI technologies, is slowly becoming a reality. With this motivation, we implement a multimodal dialogue system to support play-based learning experiences at home, guiding kids to master basic math concepts. This work explores the Spoken Language Understanding (SLU) pipeline within a task-oriented dialogue system developed for Kid Space, with cascading Automatic Speech Recognition (ASR) and
Natural Language Understanding (NLU) components evaluated on our home
deployment data with kids going through gamified math learning activities. We
validate the advantages of a multi-task architecture for NLU and experiment
with a diverse set of pretrained language representations for Intent
Recognition and Entity Extraction tasks in the math learning domain. To
recognize kids' speech in realistic home environments, we investigate several
ASR systems, including the commercial Google Cloud and the latest open-source
Whisper solutions with varying model sizes. We evaluate the SLU pipeline by
testing our best-performing NLU models on noisy ASR output to inspect the
challenges of understanding children for math learning in authentic homes.
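The listing carries no implementation, but the cascaded design the abstract describes (ASR output fed into NLU) is easy to sketch. Below is a minimal, hypothetical Python example that uses the open-source Whisper library for the ASR stage and a Hugging Face text-classification pipeline as a stand-in for the paper's multi-task NLU models; the checkpoint path and the sample file name are placeholders, not the authors' released artifacts.

```python
# Minimal cascaded SLU sketch: speech -> (noisy) transcript -> intent.
# Assumes the openai-whisper and transformers packages are installed; the
# intent-model path is a placeholder (the paper's NLU models are not public).
import whisper
from transformers import pipeline

asr_model = whisper.load_model("base")  # the paper compares several Whisper sizes

intent_classifier = pipeline(
    "text-classification",
    model="path/to/finetuned-intent-model",  # hypothetical fine-tuned checkpoint
)

def understand(audio_path: str) -> dict:
    """Run the cascade: transcribe the child's audio, then classify intent."""
    transcript = asr_model.transcribe(audio_path)["text"]
    prediction = intent_classifier(transcript)[0]
    return {
        "transcript": transcript,
        "intent": prediction["label"],
        "confidence": prediction["score"],
    }

print(understand("kid_utterance.wav"))
```

Evaluating the pipeline end to end, as the paper does, amounts to running the NLU stage on ASR transcripts rather than on clean reference text, so recognition errors on children's speech propagate into intent and entity predictions.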
Related papers
- Learning to Ground VLMs without Forgetting [54.033346088090674]
We introduce LynX, a framework that equips pretrained Visual Language Models with visual grounding ability without forgetting their existing image and language understanding skills.
To train the model effectively, we generate a high-quality synthetic dataset we call SCouT, which mimics human reasoning in visual grounding.
We evaluate LynX on several object detection and visual grounding datasets, demonstrating strong performance in object detection, zero-shot localization and grounded reasoning.
arXiv Detail & Related papers (2024-10-14T13:35:47Z)
- Scaffolding Language Learning via Multi-modal Tutoring Systems with Pedagogical Instructions [34.760230622675365]
Intelligent tutoring systems (ITSs) imitate human tutors and aim to provide customized instructions or feedback to learners.
With the emergence of generative artificial intelligence, large language models (LLMs) enable such systems to carry out complex and coherent conversational interactions.
We investigate how pedagogical instructions facilitate the scaffolding in ITSs, by conducting a case study on guiding children to describe images for language learning.
arXiv Detail & Related papers (2024-04-04T13:22:28Z)
- Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models [56.93074140619464]
We propose RiC (Reasoning in Conversation), a method that focuses on solving subjective tasks through dialogue simulation.
The motivation of RiC is to mine useful contextual information by simulating dialogues instead of supplying chain-of-thought style rationales.
We evaluate both API-based and open-source LLMs, including GPT-4, ChatGPT, and OpenChat, across twelve tasks.
arXiv Detail & Related papers (2024-02-27T05:37:10Z)
- Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks [68.79880423713597]
We introduce a method that utilizes the ASR system's lattice output instead of relying solely on the top hypothesis.
Our in-context learning experiments, covering spoken question answering and intent classification, underline the LLM's resilience to noisy speech transcripts (a minimal prompting sketch in this spirit appears after this list).
arXiv Detail & Related papers (2024-01-05T17:58:10Z)
- Spoken Language Intelligence of Large Language Models for Language Learning [3.5924382852350902]
We focus on evaluating the efficacy of large language models (LLMs) in the realm of education.
We introduce a new multiple-choice question dataset to evaluate the effectiveness of LLMs in these language-learning scenarios.
We also investigate the influence of various prompting techniques, such as zero-shot and few-shot methods.
We find that models of different sizes have a good understanding of concepts in phonetics, phonology, and second language acquisition, but show limitations in reasoning about real-world problems.
arXiv Detail & Related papers (2023-08-28T12:47:41Z)
- Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models [83.63242931107638]
We propose four characteristics of generally intelligent agents.
We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations.
We conclude by outlining promising future research directions in the field of artificial general intelligence.
arXiv Detail & Related papers (2023-07-07T13:58:16Z)
- Computational Language Acquisition with Theory of Mind [84.2267302901888]
We build language-learning agents equipped with Theory of Mind (ToM) and measure its effects on the learning process.
We find that training speakers with a highly weighted ToM listener component leads to performance gains in our image referential game setting.
arXiv Detail & Related papers (2023-03-02T18:59:46Z)
- End-to-End Evaluation of a Spoken Dialogue System for Learning Basic Mathematics [8.819665252533104]
This work presents a task-oriented Spoken Dialogue System (SDS) built to support play-based learning of basic math concepts for early childhood education.
The system has been evaluated via real-world school deployments, where students practice early math concepts through multimodal interactions.
arXiv Detail & Related papers (2022-11-07T12:58:24Z)
- Data Augmentation with Paraphrase Generation and Entity Extraction for Multimodal Dialogue System [9.912419882236918]
We are working towards a multimodal dialogue system for younger kids learning basic math concepts.
This work explores the potential benefits of data augmentation with paraphrase generation for the Natural Language Understanding module of the Spoken Dialogue Systems pipeline.
We show that paraphrasing with model-in-the-loop (MITL) strategies using small seed data is a promising approach, yielding improved performance on the Intent Recognition task.
arXiv Detail & Related papers (2022-05-09T02:21:20Z)
- Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL).
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
arXiv Detail & Related papers (2021-10-15T14:36:45Z)
- Zero-Shot Compositional Policy Learning via Language Grounding [13.45138913186308]
Humans can adapt to new tasks quickly by leveraging prior knowledge about the world such as language descriptions.
We introduce a new research platform BabyAI++ in which the dynamics of environments are disentangled from visual appearance.
We find that current language-guided RL/IL techniques overfit to the training environments and suffer from a huge performance drop when facing unseen combinations.
arXiv Detail & Related papers (2020-04-15T16:58:19Z)
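As a loose illustration of the in-context SLU idea from the word-confusion-network entry above (referenced there as the sketch after this list), here is a minimal, hypothetical example that folds several ASR hypotheses into a few-shot prompt, in the same prompt-based spirit as Few-Shot Bot. The intent labels, example utterances, and n-best transcripts are invented for illustration, and no particular LLM API is assumed.

```python
# Hypothetical sketch (not any paper's released code): build an in-context
# intent-classification prompt that shows the LLM several ASR hypotheses
# instead of only the top one, so it can recover from transcription errors.

FEW_SHOT_EXAMPLES = [
    ("i want to play the game", "start_game"),  # invented examples
    ("there are five flowers", "answer_count"),
]

def build_prompt(asr_nbest: list[str]) -> str:
    """Assemble a few-shot prompt from labeled examples plus the n-best
    ASR hypotheses for the current utterance."""
    lines = ["Classify the child's intent. Examples:"]
    for text, intent in FEW_SHOT_EXAMPLES:
        lines.append(f"Utterance: {text} -> Intent: {intent}")
    lines.append("Possible transcripts of the new utterance (most likely first):")
    for rank, hypothesis in enumerate(asr_nbest, start=1):
        lines.append(f"{rank}. {hypothesis}")
    lines.append("Intent:")
    return "\n".join(lines)

# Usage: pass the prompt to any chat/completions API of your choice.
print(build_prompt(["there are side flowers", "there are five flowers"]))
```

Showing the model the full n-best list lets it exploit agreement across hypotheses (here, recovering "five" from the competing "side"), which is the intuition behind prompting with lattices or word confusion networks rather than a single best transcript.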