StorySparkQA: Expert-Annotated QA Pairs with Real-World Knowledge for Children's Story-Based Learning
- URL: http://arxiv.org/abs/2311.09756v3
- Date: Fri, 04 Oct 2024 05:39:32 GMT
- Title: StorySparkQA: Expert-Annotated QA Pairs with Real-World Knowledge for Children's Story-Based Learning
- Authors: Jiaju Chen, Yuxuan Lu, Shao Zhang, Bingsheng Yao, Yuanzhe Dong, Ying Xu, Yunyao Li, Qianwen Wang, Dakuo Wang, Yuling Sun
- Abstract summary: We design an annotation framework, empowered by an existing knowledge graph, to capture experts' annotations and thinking process.
The StorySparkQA dataset comprises 5,868 expert-annotated QA pairs with real-world knowledge.
- Score: 36.16783204588302
- Abstract: Interactive story reading is a common parent-child activity, where parents expect to teach both language skills and real-world knowledge beyond the story. While a growing number of storytelling and reading systems have been developed for this activity, they often fail to infuse real-world knowledge into the conversation. This limitation can be attributed to the existing question-answering (QA) datasets for children's education, on which these systems are built: they fail to capture the nuances of how education experts think when conducting interactive story reading activities. To bridge this gap, we design an annotation framework, empowered by an existing knowledge graph, to capture experts' annotations and thinking process, and we use this framework to construct the StorySparkQA dataset, which comprises 5,868 expert-annotated QA pairs with real-world knowledge. We conduct automated and human expert evaluations across various QA pair generation settings to demonstrate that StorySparkQA can effectively support models in generating QA pairs that target real-world knowledge beyond story content. StorySparkQA is available at https://huggingface.co/datasets/NEU-HAI/StorySparkQA.
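The abstract describes QA pairs that connect story content to real-world knowledge through a knowledge graph. As a rough sketch only, one such record might be modeled as below. The class names, field names, the triple layout, and the `targets_external_knowledge` check are all hypothetical illustrations and are not taken from the dataset; the story sentence and QA pair are made up for the example. The published dataset's actual columns may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnowledgeTriple:
    """Hypothetical knowledge-graph edge: (head, relation, tail)."""
    head: str
    relation: str
    tail: str


@dataclass(frozen=True)
class ExpertQAPair:
    """Hypothetical record layout; not the dataset's real schema."""
    story_section: str
    triple: KnowledgeTriple
    question: str
    answer: str


def targets_external_knowledge(pair: ExpertQAPair) -> bool:
    """Return True when the triple's head occurs in the story text
    but its tail does not, i.e. the pair reaches beyond the story."""
    text = pair.story_section.lower()
    head_in_story = pair.triple.head.lower() in text
    tail_in_story = pair.triple.tail.lower() in text
    return head_in_story and not tail_in_story


# Made-up example, for illustration only.
example = ExpertQAPair(
    story_section="The old fisherman rowed his boat across the lake at dawn.",
    triple=KnowledgeTriple("boat", "is used for", "traveling on water"),
    question="What is a boat used for?",
    answer="A boat is used for traveling on water.",
)
```

Under these assumptions, `targets_external_knowledge(example)` flags the pair as going beyond the story, because "boat" appears in the passage while "traveling on water" does not. A substring check like this is a crude heuristic chosen for brevity, not a description of how the authors evaluate their data.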
Related papers
- FairytaleQA Translated: Enabling Educational Question and Answer Generation in Less-Resourced Languages [0.0]
This paper introduces machine-translated versions of FairytaleQA, a renowned QA dataset designed to assess and enhance narrative comprehension skills in young children.
We employ fine-tuned, modest-scale models to establish benchmarks for both Question Generation (QG) and QA tasks within the translated datasets.
We present a case study proposing a model for generating question-answer pairs, with an evaluation incorporating quality metrics such as question well-formedness, answerability, relevance, and children suitability.
arXiv Detail & Related papers (2024-06-06T16:31:47Z)
- Exploring Parent's Needs for Children-Centered AI to Support Preschoolers' Interactive Storytelling and Reading Activities [52.828843153565984]
AI-based storytelling and reading technologies are becoming increasingly ubiquitous in preschoolers' lives.
This paper investigates how they function in practical storytelling and reading scenarios and how parents, the most critical stakeholders, experience and perceive them.
Our findings suggest that even though AI-based storytelling and reading technologies provide more immersive and engaging interaction, they still cannot meet parents' expectations due to a series of interactive and algorithmic challenges.
arXiv Detail & Related papers (2024-01-24T20:55:40Z)
- OPERA: Harmonizing Task-Oriented Dialogs and Information Seeking Experience [87.0233567695073]
Existing studies in conversational AI mostly treat task-oriented dialog (TOD) and question answering (QA) as separate tasks.
We propose a new task, Open-Book TOD (OB-TOD), which combines TOD with the QA task and expands the external knowledge sources.
We propose a unified model OPERA which can appropriately access explicit and implicit external knowledge to tackle the defined task.
arXiv Detail & Related papers (2022-06-24T18:21:26Z)
- Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language [121.56329458876655]
We introduce two new environments: the grid-world-based Q-BabyAI and the text-based Q-TextWorld.
We propose the "Asking for Knowledge" (AFK) agent, which learns to generate language commands to query for meaningful knowledge.
arXiv Detail & Related papers (2022-05-12T14:20:31Z)
- Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension [136.82507046638784]
We introduce FairytaleQA, a dataset focusing on narrative comprehension of kindergarten to eighth-grade students.
FairytaleQA consists of 10,580 explicit and implicit questions derived from 278 children-friendly stories.
arXiv Detail & Related papers (2022-03-26T00:20:05Z)
- TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge [83.55215993730326]
We propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages respectively.
arXiv Detail & Related papers (2022-03-16T10:37:59Z)
- It is AI's Turn to Ask Human a Question: Question and Answer Pair Generation for Children Storybooks in FairytaleQA Dataset [30.557699346777582]
In educational applications, teachers and parents sometimes may not know what questions they should ask a child that can maximize their language learning results.
With a newly released book QA dataset (FairytaleQA), we developed an automated QA generation model architecture for this novel application.
arXiv Detail & Related papers (2021-09-08T04:11:54Z)