From Coders to Critics: Empowering Students through Peer Assessment in the Age of AI Copilots
- URL: http://arxiv.org/abs/2505.22093v1
- Date: Wed, 28 May 2025 08:17:05 GMT
- Title: From Coders to Critics: Empowering Students through Peer Assessment in the Age of AI Copilots
- Authors: Santiago Berrezueta-Guzman, Stephan Krusche, Stefan Wagner
- Abstract summary: This paper presents an empirical study of a rubric-based, anonymized peer review process implemented in a large programming course. Students evaluated each other's final projects (a 2D game), and their assessments were compared to instructor grades using correlation, mean absolute error, and root mean square error (RMSE). Results show that peer review can approximate instructor evaluation with moderate accuracy and foster student engagement, evaluative thinking, and interest in providing good feedback to their peers.
- Score: 3.3094795918443634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The rapid adoption of AI-powered coding assistants like ChatGPT and other coding copilots is transforming programming education, raising questions about assessment practices, academic integrity, and skill development. As educators seek alternatives to traditional grading methods susceptible to AI-enabled plagiarism, structured peer assessment could be a promising strategy. This paper presents an empirical study of a rubric-based, anonymized peer review process implemented in a large introductory programming course. Students evaluated each other's final projects (a 2D game), and their assessments were compared to instructor grades using correlation, mean absolute error, and root mean square error (RMSE). Additionally, reflective surveys from 47 teams captured student perceptions of fairness, grading behavior, and preferences regarding grade aggregation. Results show that peer review can approximate instructor evaluation with moderate accuracy and foster student engagement, evaluative thinking, and interest in providing good feedback to their peers. We discuss the implications of these findings for designing scalable, trustworthy peer assessment systems in the age of AI-assisted coding.
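The comparison metrics named in the abstract are standard; as a minimal sketch (with hypothetical grade arrays, not the study's data), correlation, MAE, and RMSE between peer and instructor scores can be computed as follows:

```python
import numpy as np

# Hypothetical per-project grades (0-100); illustrative values only.
peer_grades = np.array([78.0, 85.0, 62.0, 90.0, 71.0])
instructor_grades = np.array([75.0, 88.0, 60.0, 94.0, 70.0])

# Pearson correlation between peer and instructor scores.
correlation = np.corrcoef(peer_grades, instructor_grades)[0, 1]

# Mean absolute error: average magnitude of the grading gap.
mae = np.mean(np.abs(peer_grades - instructor_grades))

# Root mean square error: penalizes large disagreements more heavily.
rmse = np.sqrt(np.mean((peer_grades - instructor_grades) ** 2))

print(f"r = {correlation:.3f}, MAE = {mae:.2f}, RMSE = {rmse:.2f}")
```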
Related papers
- Beyond Agreement: Rethinking Ground Truth in Educational AI Annotation [1.8434042562191815]
We argue that overreliance on human inter-rater reliability (IRR) as a gatekeeper for annotation quality hampers progress in classifying data. We highlight five examples of complementary evaluation methods, such as multi-label annotation schemes, expert-based approaches, and close-the-loop validity. We call on the field to rethink annotation quality and ground truth, prioritizing validity and educational impact over consensus alone.
arXiv Detail & Related papers (2025-07-31T20:05:26Z)
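For context on the IRR measure this entry critiques, a toy Cohen's kappa computation for two annotators (labels are invented for illustration):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two annotators over the same items."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement under independent labeling, from marginals.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    p_e = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical annotations of ten student answers.
a = ["correct", "partial", "wrong", "correct", "partial",
     "correct", "wrong", "partial", "correct", "correct"]
b = ["correct", "partial", "wrong", "partial", "partial",
     "correct", "wrong", "wrong", "correct", "correct"]
print(f"kappa = {cohens_kappa(a, b):.3f}")
```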
- The AI Imperative: Scaling High-Quality Peer Review in Machine Learning [49.87236114682497]
We argue that AI-assisted peer review must become an urgent research and infrastructure priority. We propose specific roles for AI in enhancing factual verification, guiding reviewer performance, assisting authors in quality improvement, and supporting area chairs (ACs) in decision-making.
arXiv Detail & Related papers (2025-06-09T18:37:14Z)
- Identifying Aspects in Peer Reviews [61.374437855024844]
We develop a data-driven schema for deriving fine-grained aspects from a corpus of peer reviews. We introduce a dataset of peer reviews augmented with aspects and show how it can be used for community-level review analysis.
arXiv Detail & Related papers (2025-04-09T14:14:42Z)
- Level Up Peer Review in Education: Investigating genAI-driven Gamification system and its influence on Peer Feedback Effectiveness [0.8087870525861938]
This paper introduces Socratique, a gamified peer-assessment platform integrated with Generative AI (GenAI) assistance. By incorporating game elements, Socratique aims to motivate students to provide more feedback. Students in the treatment group provided significantly more voluntary feedback, with higher scores on clarity, relevance, and specificity.
arXiv Detail & Related papers (2025-04-03T18:30:25Z)
- Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence [6.638206014723678]
Online Judge (OJ) systems are typically considered within programming-related courses as they yield fast and objective assessments of the code developed by the students.
This work aims to tackle the limitations of such assessments by further exploiting the information gathered by the OJ and automatically inferring feedback for both the student and the instructor.
arXiv Detail & Related papers (2024-01-29T12:11:30Z)
- Student Mastery or AI Deception? Analyzing ChatGPT's Assessment Proficiency and Evaluating Detection Strategies [1.633179643849375]
Generative AI systems such as ChatGPT have a disruptive effect on learning and assessment.
This work investigates the performance of ChatGPT by evaluating it across three courses.
arXiv Detail & Related papers (2023-11-27T20:10:13Z)
- Modelling Assessment Rubrics through Bayesian Networks: a Pragmatic Approach [40.06500618820166]
This paper presents an approach to deriving a learner model directly from an assessment rubric.
We illustrate how the approach can be applied to automate the human assessment of an activity developed for testing computational thinking skills.
arXiv Detail & Related papers (2022-09-07T10:09:12Z)
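The paper's actual network is not reproduced here; as a rough sketch of the idea, a naive Bayes update of a binary mastery variable from rubric-item observations (all probabilities are invented):

```python
# Toy learner model: P(mastery | rubric observations) via Bayes' rule,
# assuming rubric items are conditionally independent given mastery.

prior_mastery = 0.5

# P(item passed | mastery) and P(item passed | no mastery) per rubric item.
rubric = {
    "decomposition": (0.90, 0.30),
    "abstraction":   (0.80, 0.40),
    "algorithm":     (0.85, 0.20),
}

# Observed outcomes for one learner: True = rubric item passed.
observations = {"decomposition": True, "abstraction": False, "algorithm": True}

joint_mastery = prior_mastery
joint_no_mastery = 1.0 - prior_mastery
for item, passed in observations.items():
    p_if_mastery, p_if_not = rubric[item]
    joint_mastery *= p_if_mastery if passed else 1.0 - p_if_mastery
    joint_no_mastery *= p_if_not if passed else 1.0 - p_if_not

posterior = joint_mastery / (joint_mastery + joint_no_mastery)
print(f"P(mastery | observations) = {posterior:.3f}")
```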
- Ranking Scientific Papers Using Preference Learning [48.78161994501516]
We cast final decision-making as a paper ranking problem based on peer review texts and reviewer scores.
We introduce a novel, multi-faceted generic evaluation framework for making final decisions based on peer reviews.
arXiv Detail & Related papers (2021-09-02T19:41:47Z)
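The paper's framework is multi-faceted; as a generic preference-learning sketch, a tiny Bradley-Terry fit turning hypothetical pairwise comparisons into ranking scores:

```python
import numpy as np

# Hypothetical pairwise preferences: (winner, loser) over four papers.
comparisons = [(0, 1), (0, 2), (1, 2), (3, 0), (3, 1), (3, 2), (0, 1)]
n_papers = 4

# Fit Bradley-Terry strengths by gradient ascent on the log-likelihood.
theta = np.zeros(n_papers)
lr = 0.1
for _ in range(500):
    grad = np.zeros(n_papers)
    for w, l in comparisons:
        # P(w beats l) under the model.
        p = 1.0 / (1.0 + np.exp(theta[l] - theta[w]))
        grad[w] += 1.0 - p
        grad[l] -= 1.0 - p
    theta += lr * grad
    theta -= theta.mean()  # fix the scale; only differences matter

ranking = np.argsort(-theta)
print("papers ranked best to worst:", ranking.tolist())
```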
- ProtoTransformer: A Meta-Learning Approach to Providing Student Feedback [54.142719510638614]
In this paper, we frame the problem of providing feedback as few-shot classification.
A meta-learner adapts to give feedback on student code for a new programming question from just a few instructor-provided examples.
Our approach was successfully deployed to deliver feedback on 16,000 student exam solutions in a programming course offered by a tier 1 university.
arXiv Detail & Related papers (2021-07-23T22:41:28Z)
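The core prototypical-classification step behind this kind of few-shot feedback can be sketched as follows (embeddings are random placeholders, not a trained encoder):

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16  # placeholder embedding size

# A few instructor-labeled examples per feedback class (the "support set").
# A real system would embed code with a trained encoder; we use random vectors.
support = {
    "off_by_one": rng.normal(size=(3, dim)),
    "missing_base_case": rng.normal(size=(3, dim)),
    "correct": rng.normal(size=(3, dim)),
}

# Prototype = mean embedding of each class's support examples.
prototypes = {label: emb.mean(axis=0) for label, emb in support.items()}

def classify(query_embedding):
    """Assign the feedback label of the nearest prototype (Euclidean)."""
    return min(prototypes,
               key=lambda lbl: np.linalg.norm(query_embedding - prototypes[lbl]))

new_submission = rng.normal(size=dim)  # placeholder for an embedded solution
print("predicted feedback class:", classify(new_submission))
```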
- Effects of Human vs. Automatic Feedback on Students' Understanding of AI Concepts and Programming Style [0.0]
The use of automatic grading tools has become nearly ubiquitous in large undergraduate programming courses.
There is a relative lack of data directly comparing student outcomes when they receive computer-generated versus human-written feedback.
This paper addresses this gap by splitting one 90-student class into two feedback groups and analyzing differences in the two cohorts' performance.
arXiv Detail & Related papers (2020-11-20T21:40:32Z)
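The paper does not specify its statistical analysis; one plausible way to compare two feedback cohorts' scores is a Welch's t-test on hypothetical data:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Hypothetical final scores for two 45-student feedback cohorts.
human_feedback = rng.normal(loc=82, scale=8, size=45)
auto_feedback = rng.normal(loc=79, scale=8, size=45)

# Welch's t-test: does not assume equal variances between cohorts.
t_stat, p_value = stats.ttest_ind(human_feedback, auto_feedback,
                                  equal_var=False)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```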
- Hierarchical Bi-Directional Self-Attention Networks for Paper Review Rating Recommendation [81.55533657694016]
We propose a hierarchical bi-directional self-attention network framework (HabNet) for paper review rating prediction and recommendation.
Specifically, we leverage the hierarchical structure of the paper reviews with three levels of encoders: a sentence encoder (level one), an intra-review encoder (level two), and an inter-review encoder (level three).
We are able to identify useful predictors to make the final acceptance decision, as well as to help discover the inconsistency between numerical review ratings and text sentiment conveyed by reviewers.
arXiv Detail & Related papers (2020-11-02T08:07:50Z)
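HabNet's encoders are trained self-attention models; the three-level hierarchy itself can be illustrated with simple mean pooling over random stand-in embeddings:

```python
import numpy as np

rng = np.random.default_rng(2)
dim = 8  # placeholder embedding size

# One paper with two reviews; each review is a matrix of sentence embeddings.
# HabNet would produce these with self-attention encoders; we use noise.
paper_reviews = [
    rng.normal(size=(4, dim)),  # review 1: four sentences
    rng.normal(size=(6, dim)),  # review 2: six sentences
]

# Level two: pool sentence vectors into one vector per review.
review_vectors = np.stack([sents.mean(axis=0) for sents in paper_reviews])

# Level three: pool review vectors into a single paper representation.
paper_vector = review_vectors.mean(axis=0)

# A linear head on top would predict the final rating; weights are random.
w, b = rng.normal(size=dim), 0.0
predicted_rating = float(paper_vector @ w + b)
print(f"toy predicted rating: {predicted_rating:.2f}")
```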
- Code Review in the Classroom [57.300604527924015]
Feedback from young developers in a classroom setting provides a clear picture of the favourable and problematic areas of the code review process.
Their feedback suggests that the process was well received, along with some suggestions for improving it.
This paper can serve as a guideline for performing code reviews in the classroom.
arXiv Detail & Related papers (2020-04-19T06:07:45Z)
- Leveraging Peer Feedback to Improve Visualization Education [4.679788938455095]
We discuss the construction and application of peer review in a computer science visualization course.
We evaluate student projects, peer review text, and a post-course questionnaire from three semesters of mixed undergraduate and graduate courses.
arXiv Detail & Related papers (2020-01-12T21:46:58Z)