FairytaleCQA: Integrating a Commonsense Knowledge Graph into Children's
Storybook Narratives
- URL: http://arxiv.org/abs/2311.09756v1
- Date: Thu, 16 Nov 2023 10:30:26 GMT
- Title: FairytaleCQA: Integrating a Commonsense Knowledge Graph into Children's
Storybook Narratives
- Authors: Jiaju Chen, Yuxuan Lu, Shao Zhang, Bingsheng Yao, Yuanzhe Dong, Ying
Xu, Yunyao Li, Qianwen Wang, Dakuo Wang, Yuling Sun
- Abstract summary: We introduce the FairytaleCQA dataset to supplement 278 storybook narratives with educationally appropriate commonsense knowledge.
The dataset has 5,868 QA pairs that not only originate from the storybook narratives but also contain commonsense knowledge grounded in an external knowledge graph.
- Score: 37.37125094937394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: AI models (including LLMs) often rely on narrative question-answering (QA)
datasets to provide customized QA functionalities for downstream
children's education applications; however, existing datasets only include QA
pairs grounded in the given storybook content, whereas children can
learn more when teachers relate the storybook content to real-world knowledge
(e.g., commonsense knowledge). We introduce the FairytaleCQA dataset, which is
annotated by children's education experts, to supplement 278 storybook narratives
with educationally appropriate commonsense knowledge. The dataset has 5,868 QA
pairs that not only originate from the storybook narratives but also contain
commonsense knowledge grounded in an external knowledge graph (i.e.,
ConceptNet). A follow-up experiment shows that a smaller model (T5-large)
fine-tuned with FairytaleCQA reliably outperforms much larger prompt-engineered
LLMs (e.g., GPT-4) on this new QA-pair generation (QAG) task. This result
suggests that: 1) our dataset poses novel challenges to existing LLMs, and 2)
human experts' data annotation is still critical, as they hold nuanced
knowledge that LLMs lack in the children's education domain.
Related papers
- UDKAG: Augmenting Large Vision-Language Models with Up-to-Date Knowledge [56.772051051558215]
Large vision-language models (LVLMs), such as the LLaVA series, are unaware of up-to-date knowledge because they cannot be updated frequently.
A promising solution is to provide LVLMs with up-to-date knowledge via internet search during inference, i.e., internet-augmented generation (IAG).
We propose UDKAG, a plug-and-play framework for augmenting existing LVLMs to handle visual question answering (VQA) about up-to-date knowledge.
arXiv Detail & Related papers (2024-05-23T13:32:07Z)
- Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts [50.06633829833144]
Large Language Models (LLMs) are effective at various NLP tasks but struggle with tasks that require extensive, real-world knowledge.
We propose a benchmark that requires knowledge of long-tail facts to answer the involved questions.
Our experiments show that LLMs alone struggle to answer these questions, especially when the long-tail level is high or rich knowledge is required.
arXiv Detail & Related papers (2024-05-10T15:10:20Z)
- DIVKNOWQA: Assessing the Reasoning Ability of LLMs via Open-Domain Question Answering over Knowledge Base and Text [73.68051228972024]
Large Language Models (LLMs) have exhibited impressive generation capabilities, but they suffer from hallucinations when relying on their internal knowledge.
Retrieval-augmented LLMs have emerged as a potential solution to ground LLMs in external knowledge.
arXiv Detail & Related papers (2023-10-31T04:37:57Z)
- Knowledge Solver: Teaching LLMs to Search for Domain Knowledge from Knowledge Graphs [19.0797968186656]
Large language models (LLMs) are versatile and can solve different tasks thanks to their emergent abilities and generalizability.
In some previous works, additional modules such as graph neural networks (GNNs) are trained on knowledge retrieved from external knowledge bases.
arXiv Detail & Related papers (2023-09-06T15:55:01Z)
- Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering [7.888547093390469]
Large Language Models (LLMs) are capable of performing zero-shot closed-book question answering tasks.
We propose to augment the knowledge directly in the input of LLMs.
Our framework, Knowledge-Augmented language model PromptING (KAPING), requires no model training and is thus completely zero-shot.
arXiv Detail & Related papers (2023-06-07T04:15:21Z)
- Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering [60.93164850492871]
Causal Video Question Answering (CVidQA) queries not only association or temporal relations but also causal relations in a video.
We propose a novel framework, Causal Knowledge Extraction from Language Models (CaKE-LM), leveraging causal commonsense knowledge from language models to tackle CVidQA.
CaKE-LM significantly outperforms conventional methods by 4% to 6% in zero-shot CVidQA accuracy on the NExT-QA and Causal-VidQA datasets.
arXiv Detail & Related papers (2023-04-07T17:45:49Z)
- Fantastic Questions and Where to Find Them: FairytaleQA -- An Authentic Dataset for Narrative Comprehension [136.82507046638784]
We introduce FairytaleQA, a dataset focusing on narrative comprehension for kindergarten to eighth-grade students.
FairytaleQA consists of 10,580 explicit and implicit questions derived from 278 children-friendly stories.
arXiv Detail & Related papers (2022-03-26T00:20:05Z)
- It is AI's Turn to Ask Human a Question: Question and Answer Pair Generation for Children Storybooks in FairytaleQA Dataset [30.557699346777582]
In educational applications, teachers and parents may not know which questions to ask a child to maximize their language learning results.
With a newly released book QA dataset (FairytaleQA), we developed an automated QA generation model architecture for this novel application.
arXiv Detail & Related papers (2021-09-08T04:11:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.