AI-Mediated Feedback Improves Student Revisions: A Randomized Trial with FeedbackWriter in a Large Undergraduate Course
- URL: http://arxiv.org/abs/2602.16820v2
- Date: Tue, 24 Feb 2026 03:52:57 GMT
- Title: AI-Mediated Feedback Improves Student Revisions: A Randomized Trial with FeedbackWriter in a Large Undergraduate Course
- Authors: Xinyi Lu, Kexin Phyllis Ju, Mitchell Dudley, Larissa Sano, Xu Wang,
- Abstract summary: We introduce and deploy FeedbackWriter, a system that generates AI suggestions for teaching assistants (TAs) while they provide feedback on students' knowledge-intensive essays. Students were randomly assigned to receive either handwritten feedback from TAs or AI-mediated feedback where TAs received suggestions from FeedbackWriter. We found that students receiving AI-mediated feedback produced significantly higher-quality revisions, with gains increasing as TAs adopted more AI suggestions.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite growing interest in using LLMs to generate feedback on students' writing, little is known about how students respond to AI-mediated versus human-provided feedback. We address this gap through a randomized controlled trial in a large introductory economics course (N=354), where we introduce and deploy FeedbackWriter - a system that generates AI suggestions for teaching assistants (TAs) while they provide feedback on students' knowledge-intensive essays. TAs were free to adopt, edit, or dismiss the suggestions. Students were randomly assigned to receive either handwritten feedback from TAs (baseline) or AI-mediated feedback, where TAs received suggestions from FeedbackWriter. Students then revised their drafts based on the feedback, and the revisions were graded. In total, 1,366 essays were graded using the system. We found that students receiving AI-mediated feedback produced significantly higher-quality revisions, with gains increasing as TAs adopted more AI suggestions. TAs found the AI suggestions useful for spotting gaps and clarifying rubrics.
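The two-arm design above (handwritten vs. AI-mediated feedback, compared on revision quality) can be sketched as a simple between-groups comparison. The scores below are fabricated placeholders, not the paper's data, and `welch_t` is an illustrative helper, not the authors' analysis code:

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se


# Toy revision-quality scores (0-100) for the two randomized arms; NOT real data.
baseline = [62, 70, 65, 68, 71, 64, 66, 69]
ai_mediated = [70, 75, 72, 78, 74, 71, 76, 73]

gain = statistics.mean(ai_mediated) - statistics.mean(baseline)
print(f"mean gain: {gain:.2f}")                       # -> 6.75
print(f"Welch t:   {welch_t(ai_mediated, baseline):.2f}")  # -> 4.64
```

A dose-response check like the paper's (gains rising with TA adoption of AI suggestions) would additionally regress revision quality on the per-TA adoption rate.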
Related papers
- LLM-based Multimodal Feedback Produces Equivalent Learning and Better Student Perceptions than Educator Feedback [4.225232488376583]
This study introduces a real-time AI-facilitated multimodal feedback system that integrates structured textual explanations with dynamic multimedia resources. In an online crowdsourcing experiment, we compared this system against fixed business-as-usual feedback by educators across three dimensions. Results showed that AI multimodal feedback achieved learning gains equivalent to original educator feedback while significantly outperforming it on perceived clarity, specificity, conciseness, motivation, satisfaction, and reduced cognitive load.
arXiv Detail & Related papers (2026-01-21T18:58:08Z)
- Exposía: Academic Writing Assessment of Exposés and Peer Feedback [56.428320613219306]
We present Exposía, the first public dataset that connects writing and feedback assessment in higher education. We use Exposía to benchmark state-of-the-art open-source large language models (LLMs) on two tasks: automated scoring of (1) the proposals and (2) the student reviews.
arXiv Detail & Related papers (2026-01-10T11:33:26Z)
- Humanizing Automated Programming Feedback: Fine-Tuning Generative Models with Student-Written Feedback [21.114005575615586]
We explore learnersourcing as a means to fine-tune language models for generating feedback that is more similar to that written by humans. We collected approximately 1,900 instances of student-written feedback on multiple programming problems and buggy programs. Our findings indicate that fine-tuning models on learnersourced data not only produces feedback that better matches the style of feedback written by students, but also improves accuracy compared to feedback generated through prompt engineering alone.
arXiv Detail & Related papers (2025-09-12T19:23:05Z)
- Exploring LLM-Generated Feedback for Economics Essays: How Teaching Assistants Evaluate and Envision Its Use [3.345149032274467]
This project examines the prospect of using AI-generated feedback as suggestions to expedite and enhance human instructors' feedback provision. We developed a feedback engine that generates feedback on students' essays based on the grading rubrics used by the teaching assistants (TAs). We performed think-aloud studies with 5 TAs over 20 one-hour sessions, having them evaluate the AI feedback, contrast it with their handwritten feedback, and share how they would envision using the AI feedback if it were offered as suggestions.
arXiv Detail & Related papers (2025-05-21T14:50:30Z)
- Can Automated Feedback Turn Students into Happy Prologians? [0.9087641068861047]
Students found all implemented feedback types helpful, with automatic testing ranked as the most useful. We introduce a dataset comprising 7,201 correct and incorrect Prolog submissions, along with 200 manually annotated programs labeled with bug types and corresponding corrections.
arXiv Detail & Related papers (2025-04-23T14:11:54Z)
- Evaluating Trust in AI, Human, and Co-produced Feedback Among Undergraduate Students [2.935250567679577]
This study compares undergraduate students' trust in large language model (LLM)-generated, human, and human-AI co-produced feedback in their authentic higher-education context. Findings revealed that students preferred AI and co-produced feedback over human feedback in terms of perceived usefulness and objectivity. Educational AI experience improved students' ability to identify LLM-generated feedback and increased their trust in all types of feedback.
arXiv Detail & Related papers (2025-04-15T08:06:36Z)
- Understanding and Supporting Peer Review Using AI-reframed Positive Summary [18.686807993563168]
This study explored the impact of appending an automatically generated positive summary to the peer reviews of a writing task. We found that adding an AI-reframed positive summary to otherwise harsh feedback increased authors' critique acceptance. We discuss the implications of using AI in peer feedback, focusing on how it can influence critique acceptance and support research communities.
arXiv Detail & Related papers (2025-03-13T11:22:12Z)
- Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [46.667783153759636]
We propose a framework for feedback generation that optimizes both correctness and alignment using reinforcement learning (RL). Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO).
arXiv Detail & Related papers (2024-03-02T20:25:50Z)
- UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present UltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z)
- Simulating Bandit Learning from User Feedback for Extractive Question Answering [51.97943858898579]
We study learning from user feedback for extractive question answering by simulating feedback using supervised data.
We show that systems initially trained on a small number of examples can dramatically improve given feedback from users on model-predicted answers.
arXiv Detail & Related papers (2022-03-18T17:47:58Z)
- Learning Robust Recommender from Noisy Implicit Feedback [140.7090392887355]
We propose a new training strategy named Adaptive Denoising Training (ADT).
ADT adaptively prunes noisy interactions via two paradigms (i.e., Truncated Loss and Reweighted Loss).
We also consider extra feedback (e.g., ratings) as an auxiliary signal and propose three strategies to incorporate it into ADT.
arXiv Detail & Related papers (2021-12-02T12:12:02Z)
- Learning Opinion Summarizers by Selecting Informative Reviews [81.47506952645564]
We collect a large dataset of summaries paired with user reviews for over 31,000 products, enabling supervised training.
The content of many reviews is not reflected in the human-written summaries, and, thus, the summarizer trained on random review subsets hallucinates.
We formulate the task as jointly learning to select informative subsets of reviews and summarizing the opinions expressed in these subsets.
arXiv Detail & Related papers (2021-09-09T15:01:43Z)
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
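The ProtoTransformer entry above frames feedback as few-shot classification: a prototype is built from a handful of instructor-labeled examples, and new submissions are assigned the label of the nearest prototype. A minimal pure-Python sketch of that idea follows; the 2-D "embeddings" and label names are made up for illustration (the paper's actual system uses learned transformer encoders):

```python
import math


def centroid(vectors):
    """Element-wise mean of equal-length vectors: the class prototype."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def classify(query, support):
    """Assign `query` the label of its nearest class prototype.

    `support` maps each feedback label to a few example embeddings --
    the 'few shots' an instructor provides for a new question."""
    prototypes = {label: centroid(examples) for label, examples in support.items()}
    return min(prototypes, key=lambda label: euclidean(query, prototypes[label]))


# Toy 2-D embeddings of student submissions; labels are hypothetical.
support = {
    "off_by_one_error": [[0.9, 0.1], [1.1, 0.0]],
    "missing_base_case": [[0.0, 1.0], [0.2, 0.9]],
}
print(classify([0.1, 0.8], support))  # -> missing_base_case
```

Because prototypes are just averages over the support set, adapting to a new programming question only requires a few labeled examples, with no retraining of the encoder.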
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.