Book2Dial: Generating Teacher-Student Interactions from Textbooks for
Cost-Effective Development of Educational Chatbots
- URL: http://arxiv.org/abs/2403.03307v1
- Date: Tue, 5 Mar 2024 20:12:05 GMT
- Title: Book2Dial: Generating Teacher-Student Interactions from Textbooks for
Cost-Effective Development of Educational Chatbots
- Authors: Junling Wang, Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury,
Mrinmaya Sachan
- Abstract summary: We propose a framework for generating synthetic teacher-student interactions grounded in a set of textbooks.
We highlight various quality criteria that such dialogues should fulfill and compare several approaches relying on either prompting or fine-tuning large language models.
Our findings offer insights for future efforts in synthesizing conversational data that strikes a balance between size and quality.
- Score: 37.304476231479725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Educational chatbots are a promising tool for assisting student learning.
However, the development of effective chatbots in education has been
challenging, as high-quality data is seldom available in this domain. In this
paper, we propose a framework for generating synthetic teacher-student
interactions grounded in a set of textbooks. Our approaches capture one aspect
of learning interactions where curious students with partial knowledge
interactively ask a teacher questions about the material in the textbook. We
highlight various quality criteria that such dialogues should fulfill and
compare several approaches relying on either prompting or fine-tuning large
language models. We use synthetic dialogues to train educational chatbots and
show benefits of further fine-tuning in different educational domains. However,
human evaluation shows that our best data synthesis method still suffers from
hallucinations and tends to reiterate information from previous conversations.
Our findings offer insights for future efforts in synthesizing conversational
data that strikes a balance between size and quality. We will open-source our
data and code.
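To make the described setup concrete, below is a minimal sketch of how prompting-based teacher-student synthesis could look: two role-prompted LLM calls alternate, where the student only sees a partial view of a textbook section and the teacher answers grounded in the full text. This is a sketch under assumptions, not the paper's pipeline: it presumes an OpenAI-style chat API (openai>=1.0), and the prompts, model name, helper functions, section fields, and turn limit are all illustrative.

```python
# Minimal sketch of prompting-based teacher-student dialogue synthesis.
# Assumes an OpenAI-style chat API (openai>=1.0); the prompts, model name,
# turn limit, and section fields are illustrative, not the paper's pipeline.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # placeholder model name

STUDENT_SYSTEM = (
    "You are a curious student. You have only seen the section title and "
    "summary below, not the full text. Ask one short question per turn "
    "about material you do not yet understand.\n\n{partial_context}"
)
TEACHER_SYSTEM = (
    "You are a teacher. Answer the student's question using only the "
    "textbook section below. If the section does not contain the answer, "
    "say so instead of guessing.\n\n{full_context}"
)


def chat(system: str, history: list[dict]) -> str:
    """One generation call with a role-specific system prompt."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "system", "content": system}] + history,
        max_tokens=200,
    )
    return response.choices[0].message.content.strip()


def synthesize_dialogue(section: dict, num_turns: int = 4) -> list[dict]:
    """Alternate student questions and teacher answers grounded in one section."""
    partial = f"Title: {section['title']}\nSummary: {section['summary']}"
    full = partial + f"\nText: {section['text']}"
    dialogue: list[dict] = []
    for _ in range(num_turns):
        # The student sees only its partial view plus the dialogue so far.
        question = chat(
            STUDENT_SYSTEM.format(partial_context=partial),
            [{"role": "assistant" if t["speaker"] == "student" else "user",
              "content": t["utterance"]} for t in dialogue],
        )
        dialogue.append({"speaker": "student", "utterance": question})
        # The teacher sees the full section text and answers the last question.
        answer = chat(
            TEACHER_SYSTEM.format(full_context=full),
            [{"role": "user" if t["speaker"] == "student" else "assistant",
              "content": t["utterance"]} for t in dialogue],
        )
        dialogue.append({"speaker": "teacher", "utterance": answer})
    return dialogue


if __name__ == "__main__":
    section = {
        "title": "Photosynthesis",
        "summary": "How plants convert light into chemical energy.",
        "text": "Photosynthesis takes place in chloroplasts, where light-dependent "
                "reactions produce ATP and NADPH used by the Calvin cycle ...",
    }
    for turn in synthesize_dialogue(section):
        print(f"{turn['speaker']}: {turn['utterance']}")
```

The information asymmetry (student sees only title and summary, teacher sees the full text) mirrors the "curious student with partial knowledge" framing in the abstract; the resulting dialogues could then serve as training data for an educational chatbot.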
Related papers
- Conversations as a Source for Teaching Scientific Concepts at Different Education Levels [22.315652391541285]
This paper presents a novel source for facilitating conversational teaching of scientific concepts at various difficulty levels.
We analyse this data source in various ways to show that it offers a diverse array of examples that can be used to generate contextually appropriate responses.
arXiv Detail & Related papers (2024-04-16T11:33:36Z) - Learning From Free-Text Human Feedback -- Collect New Datasets Or Extend
Existing Ones? [57.16050211534735]
We investigate the types and frequency of free-text human feedback in commonly used dialog datasets.
Our findings provide new insights into the composition of the datasets examined, including error types, user response types, and the relations between them.
arXiv Detail & Related papers (2023-10-24T12:01:11Z) - Curriculum-Driven Edubot: A Framework for Developing Language Learning Chatbots Through Synthesizing Conversational Data [23.168347070904318]
We present Curriculum-Driven EduBot, a framework for developing a chatbot that combines the interactive features of chatbots with the systematic material of English textbooks.
We begin by extracting pertinent topics from textbooks and using large language models to generate dialogues related to these topics.
arXiv Detail & Related papers (2023-09-28T19:14:18Z) - AutoConv: Automatically Generating Information-seeking Conversations
with Large Language Models [74.10293412011455]
We propose AutoConv for synthetic conversation generation.
Specifically, we formulate the conversation generation problem as a language modeling task.
We finetune an LLM with a few human conversations to capture the characteristics of the information-seeking process.
arXiv Detail & Related papers (2023-08-12T08:52:40Z) - Developing Effective Educational Chatbots with ChatGPT prompts: Insights
from Preliminary Tests in a Case Study on Social Media Literacy (with
appendix) [43.55994393060723]
Recent advances in language models with zero-shot learning capabilities, such as ChatGPT, suggest a new possibility for developing educational chatbots.
We present a case study with a simple system that enables mixed-turn chatbot interactions.
We examine ChatGPT's ability to pursue multiple interconnected learning objectives, to adapt the educational activity to users' characteristics such as culture, age, and level of education, and to use diverse educational strategies and conversational styles.
arXiv Detail & Related papers (2023-06-18T22:23:18Z) - PLACES: Prompting Language Models for Social Conversation Synthesis [103.94325597273316]
We use a small set of expert-written conversations as in-context examples to synthesize a social conversation dataset using prompting (a minimal sketch of this prompting style follows the related-papers list below).
We perform several thorough evaluations of our synthetic conversations compared to human-collected conversations.
arXiv Detail & Related papers (2023-02-07T05:48:16Z) - Opportunities and Challenges in Neural Dialog Tutoring [54.07241332881601]
We rigorously analyze various generative language models on two dialog tutoring datasets for language learning.
We find that although current approaches can model tutoring in constrained learning scenarios, they perform poorly in less constrained scenarios.
Our human quality evaluation shows that both models and ground-truth annotations exhibit low performance in terms of equitable tutoring.
arXiv Detail & Related papers (2023-01-24T11:00:17Z) - Few-Shot Bot: Prompt-Based Learning for Dialogue Systems [58.27337673451943]
Learning to converse using only a few examples is a great challenge in conversational AI.
The current best conversational models are either good chit-chatters (e.g., BlenderBot) or goal-oriented systems (e.g., MinTL).
We propose prompt-based few-shot learning which does not require gradient-based fine-tuning but instead uses a few examples as the only source of learning.
arXiv Detail & Related papers (2021-10-15T14:36:45Z) - A Taxonomy of Empathetic Response Intents in Human Social Conversations [1.52292571922932]
Open-domain conversational agents are becoming increasingly popular in the natural language processing community.
One of the challenges is enabling them to converse in an empathetic manner.
Current neural response generation methods rely solely on end-to-end learning from large-scale conversation data to generate dialogues.
Recent work has shown the promise of combining dialogue act/intent modelling and neural response generation.
arXiv Detail & Related papers (2020-12-07T21:56:45Z)
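For contrast with the grounded, role-based synthesis sketched above, here is a minimal sketch of the few-shot, in-context style of conversation synthesis mentioned in the PLACES and Few-Shot Bot entries: example dialogues are placed directly in the prompt and the model continues the pattern, with no gradient-based fine-tuning. The example dialogues, topic, model name, and function here are illustrative assumptions, not the exact setup of those papers.

```python
# Minimal sketch of few-shot, in-context conversation synthesis: example
# dialogues are placed in the prompt and the model continues the pattern,
# with no gradient-based fine-tuning. Examples, topic, and model name are
# illustrative assumptions, not the setup of PLACES or Few-Shot Bot.
from openai import OpenAI

client = OpenAI()

FEW_SHOT_EXAMPLES = """\
Topic: weekend plans
A: Any plans for the weekend?
B: I'm thinking of hiking if the weather holds. You?
A: Probably catching up on a book I started.

Topic: cooking
A: Have you tried baking bread at home?
B: Once, but it came out dense. Any tips?
A: A longer proofing time helped me a lot.
"""


def synthesize_conversation(topic: str) -> str:
    """Generate one synthetic dialogue by pattern completion from the examples."""
    prompt = (
        "Continue the pattern: write a short two-person conversation on the "
        f"given topic.\n\n{FEW_SHOT_EXAMPLES}\nTopic: {topic}\nA:"
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        max_tokens=200,
    )
    return "A:" + response.choices[0].message.content


print(synthesize_conversation("learning a new language"))
```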