Exploring LLM-Generated Feedback for Economics Essays: How Teaching Assistants Evaluate and Envision Its Use
- URL: http://arxiv.org/abs/2505.15596v1
- Date: Wed, 21 May 2025 14:50:30 GMT
- Title: Exploring LLM-Generated Feedback for Economics Essays: How Teaching Assistants Evaluate and Envision Its Use
- Authors: Xinyi Lu, Aditya Mahesh, Zejia Shen, Mitchell Dudley, Larissa Sano, Xu Wang,
- Abstract summary: This project examines the prospect of using AI-generated feedback as suggestions to expedite and enhance human instructors' feedback provision.<n>We developed a feedback engine that generates feedback on students' essays based on grading rubrics used by the teaching assistants (TAs)<n>We performed think-aloud studies with 5 TAs over 20 1-hour sessions to have them evaluate the AI feedback, contrast the AI feedback with their handwritten feedback, and share how they envision using the AI feedback if they were offered as suggestions.
- Score: 3.345149032274467
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: This project examines the prospect of using AI-generated feedback as suggestions to expedite and enhance human instructors' feedback provision. In particular, we focus on understanding the teaching assistants' perspectives on the quality of AI-generated feedback and how they may or may not utilize AI feedback in their own workflows. We situate our work in a foundational college Economics class, which has frequent short essay assignments. We developed an LLM-powered feedback engine that generates feedback on students' essays based on grading rubrics used by the teaching assistants (TAs). To ensure that TAs can meaningfully critique and engage with the AI feedback, we had them complete their regular grading jobs. For a randomly selected set of essays that they had graded, we used our feedback engine to generate feedback and displayed the feedback as in-text comments in a Word document. We then performed think-aloud studies with 5 TAs over 20 1-hour sessions to have them evaluate the AI feedback, contrast the AI feedback with their handwritten feedback, and share how they envision using the AI feedback if they were offered as suggestions. The study highlights the importance of providing detailed rubrics for AI to generate high-quality feedback for knowledge-intensive essays. TAs considered that using AI feedback as suggestions during their grading could expedite grading, enhance consistency, and improve overall feedback quality. We discuss the importance of decomposing the feedback generation task into steps and presenting intermediate results, in order for TAs to use the AI feedback.
Related papers
- AI-Mediated Feedback Improves Student Revisions: A Randomized Trial with FeedbackWriter in a Large Undergraduate Course [5.829384802817518]
We introduce and deploy FeedbackWriter, a system that generates AI suggestions to teaching assistants (TAs) while they provide feedback on students' knowledge-intensive essays.<n>Students were randomly assigned to receive either handwritten feedback from TAs or AI-mediated feedback where TAs received suggestions from FeedbackWriter.<n>We found that students receiving AI-mediated feedback produced significantly higher-quality revisions, with gains increasing as TAs adopted more AI suggestions.
arXiv Detail & Related papers (2026-02-18T19:33:56Z) - What happens when reviewers receive AI feedback in their reviews? [9.57486570505445]
Advocates see AI's potential to reduce reviewer burden and improve quality, while critics warn of risks to fairness, accountability, and trust.<n>At ICLR 2025, an official AI feedback tool was deployed to provide reviewers with post-review suggestions.<n>This work contributes the first empirical evidence of such an AI tool in a live review process.
arXiv Detail & Related papers (2026-02-14T15:22:33Z) - Exposía: Academic Writing Assessment of Exposés and Peer Feedback [56.428320613219306]
We present Exposa, the first public dataset that connects writing and feedback assessment in higher education.<n>We use Exposa to benchmark state-of-the-art open-source large language models (LLMs) for two tasks: automated scoring of (1) the proposals and (2) the student reviews.
arXiv Detail & Related papers (2026-01-10T11:33:26Z) - Calibrated Generative AI as Meta-Reviewer: A Systemic Functional Linguistics Discourse Analysis of Reviews of Peer Reviews [0.07999703756441755]
generative AI can approximate key rhetorical and relational features of effective human feedback.<n>generative AI metafeedback has the potential to scaffold feedback literacy and enhance leaner engagement with peer review.
arXiv Detail & Related papers (2025-09-18T15:00:44Z) - Evaluating Trust in AI, Human, and Co-produced Feedback Among Undergraduate Students [2.935250567679577]
Students generally preferred AI and co-produced feedback over human feedback in terms of perceived usefulness and objectivity.<n>Male students consistently rated all feedback types as less valuable than their female and non-binary counterparts.<n>These insights inform evidence-based guidelines for integrating AI into higher education feedback systems.
arXiv Detail & Related papers (2025-04-15T08:06:36Z) - Supporting Students' Reading and Cognition with AI [12.029238454394445]
We analyzed text from 124 sessions with AI tools to understand users' reading processes and cognitive engagement.<n>We propose design implications for future AI reading-support systems, including structured scaffolds for lower-level cognitive tasks.<n>We advocate for adaptive, human-in-the-loop features that allow students and instructors to tailor their reading experiences with AI.
arXiv Detail & Related papers (2025-04-07T17:51:27Z) - On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows [71.92083784393418]
Agentic AI (systems that autonomously plan and act) are becoming widespread, yet their task success rate on complex tasks remains low.<n>Inference-time alignment relies on three components: sampling, evaluation, and feedback.<n>We introduce Iterative Agent Decoding (IAD), a procedure that repeatedly inserts feedback extracted from different forms of critiques.
arXiv Detail & Related papers (2025-04-02T17:40:47Z) - Understanding and Supporting Peer Review Using AI-reframed Positive Summary [18.686807993563168]
This study explored the impact of appending an automatically generated positive summary to the peer reviews of a writing task.<n>We found that adding an AI-reframed positive summary to otherwise harsh feedback increased authors' critique acceptance.<n>We discuss the implications of using AI in peer feedback, focusing on how it can influence critique acceptance and support research communities.
arXiv Detail & Related papers (2025-03-13T11:22:12Z) - Exploring LLM Prompting Strategies for Joint Essay Scoring and Feedback Generation [13.854903594424876]
Large language models (LLMs) have demonstrated strong performance in generating coherent and contextually relevant text.
This work explores several prompting strategies for LLM-based zero-shot and few-shot generation of essay feedback.
Inspired by Chain-of-Thought prompting, we study how and to what extent automated essay scoring (AES) can benefit the quality of generated feedback.
arXiv Detail & Related papers (2024-04-24T12:48:06Z) - Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs [57.16442740983528]
In ad-hoc retrieval, evaluation relies heavily on user actions, including implicit feedback.
The role of user feedback in annotators' assessment of turns in a conversational perception has been little studied.
We focus on how the evaluation of task-oriented dialogue systems ( TDSs) is affected by considering user feedback, explicit or implicit, as provided through the follow-up utterance of a turn being evaluated.
arXiv Detail & Related papers (2024-04-19T16:45:50Z) - Improving the Validity of Automatically Generated Feedback via Reinforcement Learning [46.667783153759636]
We propose a framework for feedback generation that optimize both correctness and alignment using reinforcement learning (RL)<n>Specifically, we use GPT-4's annotations to create preferences over feedback pairs in an augmented dataset for training via direct preference optimization (DPO)
arXiv Detail & Related papers (2024-03-02T20:25:50Z) - UltraFeedback: Boosting Language Models with Scaled AI Feedback [99.4633351133207]
We present textscUltraFeedback, a large-scale, high-quality, and diversified AI feedback dataset.
Our work validates the effectiveness of scaled AI feedback data in constructing strong open-source chat language models.
arXiv Detail & Related papers (2023-10-02T17:40:01Z) - PapagAI:Automated Feedback for Reflective Essays [48.4434976446053]
We present the first open-source automated feedback tool based on didactic theory and implemented as a hybrid AI system.
The main objective of our work is to enable better learning outcomes for students and to complement the teaching activities of lecturers.
arXiv Detail & Related papers (2023-07-10T11:05:51Z) - ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback to student code on a new programming question from just a few examples by instructors.
Our approach was successfully deployed to deliver feedback to 16,000 student exam-solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z) - Effects of Human vs. Automatic Feedback on Students' Understanding of AI
Concepts and Programming Style [0.0]
The use of automatic grading tools has become nearly ubiquitous in large undergraduate programming courses.
There is a relative lack of data directly comparing student outcomes when receiving computer-generated feedback and human-written feedback.
This paper addresses this gap by splitting one 90-student class into two feedback groups and analyzing differences in the two cohorts' performance.
arXiv Detail & Related papers (2020-11-20T21:40:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.