"I understand why I got this grade": Automatic Short Answer Grading with Feedback
- URL: http://arxiv.org/abs/2407.12818v1
- Date: Sun, 30 Jun 2024 15:42:18 GMT
- Title: "I understand why I got this grade": Automatic Short Answer Grading with Feedback
- Authors: Dishank Aggarwal, Pushpak Bhattacharyya, Bhaskaran Raman,
- Abstract summary: We present a dataset of 5.8k student answers accompanied by reference answers and questions for the Automatic Short Answer Grading (ASAG) task.
The EngSAF dataset is meticulously curated to cover a diverse range of subjects, questions, and answer patterns from multiple engineering domains.
- Score: 36.74896284581596
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The demand for efficient and accurate assessment methods has intensified as education systems transition to digital platforms. Providing feedback is essential in educational settings and goes beyond simply conveying marks as it justifies the assigned marks. In this context, we present a significant advancement in automated grading by introducing Engineering Short Answer Feedback (EngSAF) -- a dataset of 5.8k student answers accompanied by reference answers and questions for the Automatic Short Answer Grading (ASAG) task. The EngSAF dataset is meticulously curated to cover a diverse range of subjects, questions, and answer patterns from multiple engineering domains. We leverage state-of-the-art large language models' (LLMs) generative capabilities with our Label-Aware Synthetic Feedback Generation (LASFG) strategy to include feedback in our dataset. This paper underscores the importance of enhanced feedback in practical educational settings, outlines dataset annotation and feedback generation processes, conducts a thorough EngSAF analysis, and provides different LLMs-based zero-shot and finetuned baselines for future comparison. Additionally, we demonstrate the efficiency and effectiveness of the ASAG system through its deployment in a real-world end-semester exam at the Indian Institute of Technology Bombay (IITB), showcasing its practical viability and potential for broader implementation in educational institutions.
Related papers
- Beyond Scores: A Modular RAG-Based System for Automatic Short Answer Scoring with Feedback [3.2734777984053887]
We propose a modular retrieval augmented generation based ASAS-F system that scores answers and generates feedback in strict zero-shot and few-shot learning scenarios.
Results show an improvement in scoring accuracy by 9% on unseen questions compared to fine-tuning, offering a scalable and cost-effective solution.
arXiv Detail & Related papers (2024-09-30T07:48:55Z) - Grade Like a Human: Rethinking Automated Assessment with Large Language Models [11.442433408767583]
Large language models (LLMs) have been used for automated grading, but they have not yet achieved the same level of performance as humans.
We propose an LLM-based grading system that addresses the entire grading procedure, including the following key components.
arXiv Detail & Related papers (2024-05-30T05:08:15Z) - Automated Long Answer Grading with RiceChem Dataset [19.34390869143846]
We introduce a new area of study in the field of educational Natural Language Processing: Automated Long Answer Grading (ALAG)
ALAG presents unique challenges due to the complexity and multifaceted nature of fact-based long answers.
We propose a novel approach to ALAG by formulating it as a rubric entailment problem, employing natural language inference models.
arXiv Detail & Related papers (2024-04-22T16:28:09Z) - Empowering Private Tutoring by Chaining Large Language Models [87.76985829144834]
This work explores the development of a full-fledged intelligent tutoring system powered by state-of-the-art large language models (LLMs)
The system is into three inter-connected core processes-interaction, reflection, and reaction.
Each process is implemented by chaining LLM-powered tools along with dynamically updated memory modules.
arXiv Detail & Related papers (2023-09-15T02:42:03Z) - Towards LLM-based Autograding for Short Textual Answers [4.853810201626855]
This manuscript is an evaluation of a large language model for the purpose of autograding.
Our findings suggest that while "out-of-the-box" LLMs provide a valuable tool, their readiness for independent automated grading remains a work in progress.
arXiv Detail & Related papers (2023-09-09T22:25:56Z) - Instruction Tuning for Large Language Models: A Survey [52.86322823501338]
We make a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications.
We also review the potential pitfalls of IT along with criticism against it, along with efforts pointing out current deficiencies of existing strategies and suggest some avenues for fruitful research.
arXiv Detail & Related papers (2023-08-21T15:35:16Z) - Survey on Automated Short Answer Grading with Deep Learning: from Word
Embeddings to Transformers [5.968260239320591]
Automated short answer grading (ASAG) has gained attention in education as a means to scale educational tasks to the growing number of students.
Recent progress in Natural Language Processing and Machine Learning has largely influenced the field of ASAG.
arXiv Detail & Related papers (2022-03-11T13:47:08Z) - ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z) - Get It Scored Using AutoSAS -- An Automated System for Scoring Short
Answers [63.835172924290326]
We present a fast, scalable, and accurate approach towards automated Short Answer Scoring (SAS)
We propose and explain the design and development of a system for SAS, namely AutoSAS.
AutoSAS shows state-of-the-art performance and achieves better results by over 8% in some of the question prompts.
arXiv Detail & Related papers (2020-12-21T10:47:30Z) - DAGA: Data Augmentation with a Generation Approach for Low-resource
Tagging Tasks [88.62288327934499]
We propose a novel augmentation method with language models trained on the linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings.
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.