Autograder+: A Multi-Faceted AI Framework for Rich Pedagogical Feedback in Programming Education
- URL: http://arxiv.org/abs/2510.26402v1
- Date: Thu, 30 Oct 2025 11:41:50 GMT
- Title: Autograder+: A Multi-Faceted AI Framework for Rich Pedagogical Feedback in Programming Education
- Authors: Vikrant Sahu, Gagan Raj Gupta, Raghav Borikar, Nitin Mane
- Abstract summary: Autograder+ is designed to shift autograding from a purely summative process to a formative learning experience. It introduces two key capabilities: automated feedback generation using a fine-tuned Large Language Model, and visualization of student code submissions to uncover learning patterns.
- Score: 0.5529795221640363
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The rapid growth of programming education has outpaced traditional assessment tools, leaving faculty with limited means to provide meaningful, scalable feedback. Conventional autograders, while efficient, act as black-box systems that simply return pass/fail results, offering little insight into student thinking or learning needs. Autograder+ is designed to shift autograding from a purely summative process to a formative learning experience. It introduces two key capabilities: automated feedback generation using a fine-tuned Large Language Model, and visualization of student code submissions to uncover learning patterns. The model is fine-tuned on curated student code and expert feedback to ensure pedagogically aligned, context-aware guidance. In evaluation across 600 student submissions from multiple programming tasks, the system produced feedback with strong semantic alignment to instructor comments. For visualization, contrastively learned code embeddings trained on 1,000 annotated submissions enable grouping solutions into meaningful clusters based on functionality and approach. The system also supports prompt-pooling, allowing instructors to guide feedback style through selected prompt templates. By integrating AI-driven feedback, semantic clustering, and interactive visualization, Autograder+ reduces instructor workload while supporting targeted instruction and promoting stronger learning outcomes.
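The abstract describes grouping submissions into clusters of similar functionality and approach via contrastively learned code embeddings. As a rough illustration of that clustering step, here is a minimal pure-Python sketch that greedily groups toy embedding vectors by cosine similarity; the vectors, threshold, and greedy strategy are illustrative assumptions, not the paper's actual pipeline.

```python
import math

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def group_by_similarity(embeddings, threshold=0.9):
    """Greedy clustering: each submission joins the first cluster whose
    seed embedding it matches above `threshold`, else starts a new cluster."""
    clusters = []  # list of (seed_vector, member_indices)
    for i, vec in enumerate(embeddings):
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]

# Toy 2-D "embeddings": two similar iterative solutions, one outlier.
embs = [(1.0, 0.1), (0.9, 0.2), (0.1, 1.0)]
print(group_by_similarity(embs))  # → [[0, 1], [2]]
```

A production system would use learned high-dimensional embeddings and a standard clustering algorithm, but the grouping logic reduces to the same idea: submissions whose embeddings are close end up in the same cluster.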
Related papers
- Scaling Equitable Reflection Assessment in Education via Large Language Models and Role-Based Feedback Agents [2.825140278227664]
Formative feedback is one of the most effective drivers of student learning. In large or low-resource courses, instructors often lack the time, staffing, and bandwidth required to review and respond to every student reflection. This paper presents a theory-grounded system that uses five coordinated role-based LLM agents to score learner reflections.
arXiv Detail & Related papers (2025-11-14T09:46:21Z) - Stitch: Step-by-step LLM Guided Tutoring for Scratch [1.8206350996077172]
We present Stitch, an interactive tutoring system that replaces "showing the answer" with step-by-step scaffolding. We evaluate Stitch in an empirical study, comparing it against a state-of-the-art automated feedback generation tool for Scratch.
arXiv Detail & Related papers (2025-10-30T16:03:56Z) - Humanizing Automated Programming Feedback: Fine-Tuning Generative Models with Student-Written Feedback [21.114005575615586]
We explore learnersourcing as a means to fine-tune language models for generating feedback that is more similar to that written by humans. We collected approximately 1,900 instances of student-written feedback on multiple programming problems and buggy programs. Our findings indicate that fine-tuning models on learnersourced data not only produces feedback that better matches the style of feedback written by students, but also improves accuracy compared to feedback generated through prompt engineering alone.
arXiv Detail & Related papers (2025-09-12T19:23:05Z) - MathTutorBench: A Benchmark for Measuring Open-ended Pedagogical Capabilities of LLM Tutors [82.91830877219822]
We present MathTutorBench, an open-source benchmark for holistic tutoring model evaluation. MathTutorBench contains datasets and metrics that broadly cover tutor abilities as defined by learning sciences research in dialog-based teaching. We evaluate a wide set of closed- and open-weight models and find that subject expertise, indicated by solving ability, does not immediately translate to good teaching.
arXiv Detail & Related papers (2025-02-26T08:43:47Z) - Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [46.667783153759636]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL). Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
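The DPO objective mentioned in this summary scores a chosen/rejected feedback pair by how much the policy model's log-probabilities improve over a frozen reference model. A minimal numeric sketch of that pairwise loss, with toy log-probabilities rather than real model outputs, assuming the standard DPO formulation:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Pairwise DPO loss: -log sigmoid(beta * margin), where the margin is
    the policy-vs-reference log-prob gap on the chosen response minus the
    same gap on the rejected response."""
    margin = ((logp_chosen - ref_logp_chosen)
              - (logp_rejected - ref_logp_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Toy numbers: the policy already prefers the chosen feedback more strongly
# than the reference does, so the loss falls below log(2) (chance level).
loss = dpo_loss(-2.0, -5.0, -3.0, -4.0)
print(round(loss, 4))  # → 0.5981
```

When policy and reference agree exactly, the margin is zero and the loss is log(2); training pushes the margin positive, driving the loss toward zero.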
arXiv Detail & Related papers (2024-03-02T20:25:50Z) - YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves SFT with significant performance gain.
arXiv Detail & Related papers (2024-01-28T14:32:15Z) - Empowering Private Tutoring by Chaining Large Language Models [87.76985829144834]
This work explores the development of a full-fledged intelligent tutoring system powered by state-of-the-art large language models (LLMs).
The system is organized into three interconnected core processes: interaction, reflection, and reaction.
Each process is implemented by chaining LLM-powered tools along with dynamically updated memory modules.
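To make the chaining-with-memory idea concrete, here is a minimal sketch of three staged functions sharing a memory dict, with a stub standing in for the LLM calls. The function names mirror the processes named above, but the structure and the `fake_llm` placeholder are illustrative assumptions, not this paper's actual API.

```python
def fake_llm(prompt):
    # Stand-in for a real LLM call; a deployed system would query a model.
    return f"response to: {prompt}"

def interaction(memory, student_msg):
    # Handle a student turn and record it in memory.
    memory["history"].append(student_msg)
    return fake_llm(f"reply to '{student_msg}'")

def reflection(memory):
    # Condense the dialogue so far into the memory module.
    memory["summary"] = fake_llm(f"summarize {len(memory['history'])} turns")

def reaction(memory):
    # Decide the next tutoring move from the updated memory.
    return fake_llm(f"plan next step given: {memory['summary']}")

memory = {"history": [], "summary": ""}
reply = interaction(memory, "Why does my loop never stop?")
reflection(memory)
next_move = reaction(memory)
```

Each stage reads from and writes to the shared memory, so later stages can condition on earlier ones without re-sending the full dialogue.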
arXiv Detail & Related papers (2023-09-15T02:42:03Z) - A large language model-assisted education tool to provide feedback on open-ended responses [2.624902795082451]
We present a tool that uses large language models (LLMs), guided by instructor-defined criteria, to automate responses to open-ended questions.
Our tool delivers rapid personalized feedback, enabling students to quickly test their knowledge and identify areas for improvement.
arXiv Detail & Related papers (2023-07-25T19:49:55Z) - PapagAI: Automated Feedback for Reflective Essays [48.4434976446053]
We present the first open-source automated feedback tool based on didactic theory and implemented as a hybrid AI system.
The main objective of our work is to enable better learning outcomes for students and to complement the teaching activities of lecturers.
arXiv Detail & Related papers (2023-07-10T11:05:51Z) - Giving Feedback on Interactive Student Programs with Meta-Exploration [74.5597783609281]
Developing interactive software, such as websites or games, is a particularly engaging way to learn computer science.
Standard approaches require instructors to manually grade student-implemented interactive programs.
Online platforms that serve millions, like Code.org, are unable to provide any feedback on assignments for implementing interactive programs.
arXiv Detail & Related papers (2022-11-16T10:00:23Z) - ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
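Framing feedback as few-shot classification typically means building a prototype per feedback class from a handful of instructor-labeled examples and assigning new submissions to the nearest prototype. A generic prototypical-network-style sketch with toy 2-D embeddings and hypothetical class labels (not this paper's model or data):

```python
import math

def mean_vec(vecs):
    # Elementwise mean of equal-length vectors: the class prototype.
    n = len(vecs)
    return [sum(v[i] for v in vecs) / n for i in range(len(vecs[0]))]

def nearest_prototype(query, prototypes):
    """Return the label whose prototype is closest (Euclidean distance)
    to the query embedding."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return min(prototypes, key=lambda label: dist(query, prototypes[label]))

# A few instructor-labeled example embeddings per feedback class (toy 2-D).
support = {
    "off_by_one": [[1.0, 0.0], [0.9, 0.1]],
    "wrong_base_case": [[0.0, 1.0], [0.1, 0.9]],
}
prototypes = {label: mean_vec(vecs) for label, vecs in support.items()}
print(nearest_prototype([0.8, 0.2], prototypes))  # → off_by_one
```

Because prototypes are just means over the few labeled examples, adapting to a new programming question requires only a handful of instructor annotations rather than retraining.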
arXiv Detail & Related papers (2021-07-23T22:41:28Z) - Deep Discourse Analysis for Generating Personalized Feedback in Intelligent Tutor Systems [4.716555240531893]
We explore creating automated, personalized feedback in an intelligent tutoring system (ITS).
Our goal is to pinpoint correct and incorrect concepts in student answers in order to achieve better student learning gains.
arXiv Detail & Related papers (2021-03-13T20:33:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.