Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank
- URL: http://arxiv.org/abs/2405.05144v2
- Date: Mon, 13 May 2024 18:10:19 GMT
- Title: Improving Automated Distractor Generation for Math Multiple-choice Questions with Overgenerate-and-rank
- Authors: Alexander Scarlatos, Wanyong Feng, Digory Smith, Simon Woodhead, Andrew Lan
- Abstract summary: We propose a novel method to enhance the quality of generated distractors through overgenerate-and-rank.
Our ranking model increases alignment with human-authored distractors, although human-authored ones are still preferred over generated ones.
- Score: 44.04217284677347
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multiple-choice questions (MCQs) are commonly used across all levels of math education since they can be deployed and graded at a large scale. A critical component of MCQs is the distractors, i.e., incorrect answers crafted to reflect student errors or misconceptions. Automatically generating them in math MCQs, e.g., with large language models, has been challenging. In this work, we propose a novel method to enhance the quality of generated distractors through overgenerate-and-rank, training a ranking model to predict how likely distractors are to be selected by real students. Experimental results on a real-world dataset and human evaluation with math teachers show that our ranking model increases alignment with human-authored distractors, although human-authored ones are still preferred over generated ones.
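The abstract describes the overgenerate-and-rank pipeline only at a high level. The sketch below illustrates one plausible reading of it in Python: sample many candidate distractors from an LLM, then keep the top-k according to a ranking model trained to predict how likely each candidate is to be selected by real students. Every name and parameter here (generate, score, n_candidates, k) is a hypothetical placeholder for illustration, not the authors' actual implementation.

```python
# Minimal sketch of an overgenerate-and-rank pipeline, assuming a
# generator (e.g., an LLM sampler) and a trained ranking model are
# available as callables. All names are hypothetical placeholders.
from typing import Callable, List


def overgenerate_and_rank(
    question: str,
    correct_answer: str,
    generate: Callable[[str, str], List[str]],  # samples candidate distractors
    score: Callable[[str, str, str], float],    # ranking model: predicted
                                                # likelihood a real student
                                                # would select this distractor
    n_candidates: int = 50,
    k: int = 3,
) -> List[str]:
    """Overgenerate candidate distractors, then keep the top-k by score."""
    # 1) Overgenerate: sample many candidates from the generator.
    candidates = generate(question, correct_answer)[:n_candidates]

    # 2) De-duplicate and drop any candidate equal to the correct answer,
    #    preserving generation order for determinism.
    seen, unique = set(), []
    for c in candidates:
        c = c.strip()
        if c != correct_answer.strip() and c not in seen:
            seen.add(c)
            unique.append(c)

    # 3) Rank: score each candidate and keep the k highest-scoring ones.
    ranked = sorted(unique, key=lambda c: score(question, correct_answer, c),
                    reverse=True)
    return ranked[:k]


# Toy usage with stand-in functions (for illustration only):
toy_generate = lambda q, a: ["12", "15", "9", "12", "20"]
toy_score = lambda q, a, d: -abs(len(d) - len(a))  # arbitrary stand-in scorer
print(overgenerate_and_rank("What is 3 x 4?", "12", toy_generate, toy_score))
```

In the paper's setting, the scoring function would be a learned model trained on student response data; the toy scorer above merely stands in for it so the control flow is runnable.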
Related papers
- DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions [42.148511874019256]
We introduce DiVERT, a novel variational approach that learns an interpretable representation of the errors behind distractors in math multiple-choice questions (MCQs).
We show that DiVERT, despite using a base open-source LLM with 7B parameters, outperforms state-of-the-art approaches using GPT-4o on downstream distractor generation.
We also conduct a human evaluation with math educators and find that DiVERT leads to error labels that are of comparable quality to human-authored ones.
arXiv Detail & Related papers (2024-06-27T17:37:31Z)
- Math Multiple Choice Question Generation via Human-Large Language Model Collaboration [5.081508251092439]
Multiple choice questions (MCQs) are a popular method for evaluating students' knowledge.
Recent advances in large language models (LLMs) have sparked interest in automating MCQ creation.
This paper introduces a prototype tool designed to facilitate collaboration between LLMs and educators.
arXiv Detail & Related papers (2024-05-01T20:53:13Z)
- Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models [40.50115385623107]
Multiple-choice questions (MCQs) are ubiquitous across almost all levels of education since they are easy to administer and grade, and provide a reliable format for assessments and practice.
One of the most important aspects of MCQs is the distractors, i.e., incorrect options that are designed to target common errors or misconceptions among real students.
To date, the task of crafting high-quality distractors largely remains a labor- and time-intensive process for teachers and learning content designers, which has limited scalability.
arXiv Detail & Related papers (2024-04-02T17:31:58Z)
- Automated Distractor and Feedback Generation for Math Multiple-choice Questions via In-context Learning [43.83422798569986]
Multiple-choice questions (MCQs) are ubiquitous across almost all levels of education since they are easy to administer and grade, and provide a reliable form of assessment.
To date, the task of crafting high-quality distractors has largely remained a labor-intensive process for teachers and learning content designers.
We propose a simple, in-context learning-based solution for automated distractor and corresponding feedback message generation.
arXiv Detail & Related papers (2023-08-07T01:03:04Z)
- MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems [74.73881579517055]
We propose a framework to generate such dialogues by pairing human teachers with a Large Language Model prompted to represent common student errors.
We describe how we use this framework to collect MathDial, a dataset of 3k one-to-one teacher-student tutoring dialogues.
arXiv Detail & Related papers (2023-05-23T21:44:56Z)
- Learning to Reuse Distractors to support Multiple Choice Question Generation in Education [19.408786425460498]
This paper studies how a large existing set of manually created answers and distractors can be leveraged to help teachers create new multiple choice questions (MCQs).
We built several data-driven models based on context-aware question and distractor representations, and compared them with static feature-based models.
Both automatic and human evaluations indicate that context-aware models consistently outperform a static feature-based approach.
arXiv Detail & Related papers (2022-10-25T12:48:56Z)
- The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection [55.390442067381755]
We show that grayscale data can be automatically constructed without human effort.
Our method employs off-the-shelf response retrieval models and response generation models as automatic grayscale data generators.
Experiments on three benchmark datasets and four state-of-the-art matching models show that the proposed approach brings significant and consistent performance improvements.
arXiv Detail & Related papers (2020-04-06T06:34:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.