Automating Turkish Educational Quiz Generation Using Large Language Models
- URL: http://arxiv.org/abs/2406.03397v1
- Date: Wed, 5 Jun 2024 15:54:50 GMT
- Title: Automating Turkish Educational Quiz Generation Using Large Language Models
- Authors: Kamyar Zeinalipour, Yusuf Gökberk Keptiğ, Marco Maggini, Marco Gori
- Abstract summary: We present a specialized dataset, named the Turkish-Quiz-Instruct, comprising an extensive collection of Turkish educational texts.
This research leverages the capabilities of Large Language Models (LLMs), including GPT-4-Turbo, GPT-3.5-Turbo, Llama-2-7b-chat-hf, and Llama-2-13b-chat-hf, to automatically generate quiz questions and answers.
- Score: 10.876144855651608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Crafting quizzes from educational content is a pivotal activity that benefits both teachers and students by reinforcing learning and evaluating understanding. In this study, we introduce a novel approach to generate quizzes from Turkish educational texts, marking a pioneering endeavor in educational technology specifically tailored to the Turkish educational context. We present a specialized dataset, named the Turkish-Quiz-Instruct, comprising an extensive collection of Turkish educational texts accompanied by multiple-choice and short-answer quizzes. This research leverages the capabilities of Large Language Models (LLMs), including GPT-4-Turbo, GPT-3.5-Turbo, Llama-2-7b-chat-hf, and Llama-2-13b-chat-hf, to automatically generate quiz questions and answers from the Turkish educational content. Our work delineates the methodology for employing these LLMs in the context of Turkish educational material, thereby opening new avenues for automated Turkish quiz generation. The study not only demonstrates the efficacy of using such models for generating coherent and relevant quiz content but also sets a precedent for future research in the domain of automated educational content creation for languages other than English. The Turkish-Quiz-Instruct dataset is introduced as a valuable resource for researchers and practitioners aiming to explore the boundaries of educational technology and language-specific applications of LLMs in Turkish. By addressing the challenges of quiz generation in a non-English context, specifically Turkish, this study contributes significantly to the field of Turkish educational technology, providing insights into the potential of leveraging LLMs for educational purposes across diverse linguistic landscapes.
Related papers
- TurkishMMLU: Measuring Massive Multitask Language Understanding in Turkish [54.51310112013655]
We introduce the first multitask, multiple-choice Turkish QA benchmark, TurkishMMLU.
TurkishMMLU includes over 10,000 questions, covering 9 different subjects from Turkish high-school education curricula.
We evaluate over 20 LLMs, including multilingual open-source (e.g., Gemma, Llama, MT5), closed-source (GPT 4o, Claude, Gemini), and Turkish-adapted (e.g., Trendyol) models.
arXiv Detail & Related papers (2024-07-17T08:28:55Z)
- CaLMQA: Exploring culturally specific long-form question answering across 23 languages [58.18984409715615]
CaLMQA is a collection of 1.5K culturally specific questions spanning 23 languages and 51 culturally agnostic questions translated from English into 22 other languages.
We collect naturally-occurring questions from community web forums and hire native speakers to write questions to cover under-studied languages such as Fijian and Kirundi.
Our dataset contains diverse, complex questions that reflect cultural topics (e.g. traditions, laws, news) and the language usage of native speakers.
arXiv Detail & Related papers (2024-06-25T17:45:26Z)
- A Turkish Educational Crossword Puzzle Generator [10.434753479074814]
This paper introduces the first Turkish crossword puzzle generator designed to leverage the capabilities of large language models (LLMs).
It's a notable step in AI-enhanced education, merging game-like engagement with learning for Turkish and setting new standards for interactive, intelligent learning tools in Turkish.
arXiv Detail & Related papers (2024-05-11T15:18:56Z)
- Bridging the Bosphorus: Advancing Turkish Large Language Models through Strategies for Low-Resource Language Adaptation and Benchmarking [1.3716808114696444]
Large Language Models (LLMs) are becoming crucial across various fields, emphasizing the urgency for high-quality models in underrepresented languages.
This study explores the unique challenges faced by low-resource languages, such as data scarcity, model selection, evaluation, and computational limitations.
arXiv Detail & Related papers (2024-05-07T21:58:45Z)
- MLaKE: Multilingual Knowledge Editing Benchmark for Large Language Models [65.10456412127405]
MLaKE is a benchmark for the adaptability of knowledge editing methods across five languages.
MLaKE aggregates fact chains from Wikipedia across languages and generates questions in both free-form and multiple-choice.
We evaluate the multilingual knowledge editing generalization capabilities of existing methods on MLaKE.
arXiv Detail & Related papers (2024-04-07T15:23:28Z)
- Language Models as Science Tutors [79.73256703631492]
We introduce TutorEval and TutorChat to measure real-life usability of LMs as scientific assistants.
We show that fine-tuning base models with existing dialogue datasets leads to poor performance on TutorEval.
We use TutorChat to fine-tune Llemma models with 7B and 34B parameters. These LM tutors specialized in math have a 32K-token context window, and they excel at TutorEval while performing strongly on GSM8K and MATH.
arXiv Detail & Related papers (2024-02-16T22:24:13Z)
- Fine-tuning Transformer-based Encoder for Turkish Language Understanding Tasks [0.0]
We provide a Transformer-based model and a baseline benchmark for the Turkish Language.
We successfully fine-tuned a Turkish BERT model, namely BERTurk, on many downstream tasks and evaluated it with the Turkish Benchmark dataset.
arXiv Detail & Related papers (2024-01-30T19:27:04Z)
- Teacher Perception of Automatically Extracted Grammar Concepts for L2 Language Learning [66.79173000135717]
We apply this work to teaching two Indian languages, Kannada and Marathi, which do not have well-developed resources for second language learning.
We extract descriptions from a natural text corpus that answer questions about morphosyntax (learning of word order, agreement, case marking, or word formation) and semantics (learning of vocabulary).
We enlist language educators from schools in North America to perform a manual evaluation; they find that the materials have potential to be used for lesson preparation and learner evaluation.
arXiv Detail & Related papers (2023-10-27T18:17:29Z)
- Teacher Perception of Automatically Extracted Grammar Concepts for L2 Language Learning [91.49622922938681]
We present an automatic framework that discovers and visualizes descriptions of different aspects of grammar.
Specifically, we extract descriptions from a natural text corpus that answer questions about morphosyntax and semantics.
We apply this method for teaching the Indian languages, Kannada and Marathi, which, unlike English, do not have well-developed pedagogical resources.
arXiv Detail & Related papers (2022-06-10T14:52:22Z)
- Automated question generation and question answering from Turkish texts using text-to-text transformers [1.5749416770494706]
We fine-tune a multilingual T5 (mT5) transformer in a multi-task setting for QA, QG and answer extraction tasks.
This is the first academic work that attempts to perform automated text-to-text question generation from Turkish texts.
arXiv Detail & Related papers (2021-11-11T22:00:45Z)