Related papers: Scaling Success: A Systematic Review of Peer Grading Strategies for Accuracy, Efficiency, and Learning in Contemporary Education

URL: http://arxiv.org/abs/2508.11677v1
Date: Fri, 08 Aug 2025 15:22:06 GMT
Title: Scaling Success: A Systematic Review of Peer Grading Strategies for Accuracy, Efficiency, and Learning in Contemporary Education
Authors: Uchswas Paul, Ananya Mantravadi, Jash Shah, Shail Shah, Sri Vaishnavi Mylavarapu, M Parvez Rashid, Edward Gehringer
Abstract summary: This paper presents a systematic review of 122 peer-reviewed studies on peer grading spanning over four decades. We propose a comprehensive taxonomy that organizes peer grading systems along two key dimensions: evaluation approaches and reviewer weighting strategies.
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Peer grading has emerged as a scalable solution for assessment in large and online classrooms, offering both logistical efficiency and pedagogical value. However, designing effective peer-grading systems remains challenging due to persistent concerns around accuracy, fairness, reliability, and student engagement. This paper presents a systematic review of 122 peer-reviewed studies on peer grading spanning over four decades. Drawing from this literature, we propose a comprehensive taxonomy that organizes peer grading systems along two key dimensions: (1) evaluation approaches and (2) reviewer weighting strategies. We analyze how different design choices impact grading accuracy, fairness, student workload, and learning outcomes. Our findings highlight the strengths and limitations of each method. Notably, we found that formative feedback -- often regarded as the most valuable aspect of peer assessment -- is seldom incorporated as a quality-based weighting factor in summative grade synthesis techniques. Furthermore, no single reviewer weighting strategy proves universally optimal; each has its trade-offs. Hybrid strategies that combine multiple techniques could show the greatest promise. Our taxonomy offers a practical framework for educators and researchers aiming to design peer grading systems that are accurate, equitable, and pedagogically meaningful.

Related papers

Assessment Twins: A Protocol for AI-Vulnerable Summative Assessment [0.0]
We introduce assessment twins as an accessible approach for redesigning assessment tasks to enhance validity. We use Messick's unified validity framework to systematically map the ways in which GenAI threatens content, structural, consequential, generalisability, and external validity. We argue that the twin approach helps mitigate validity threats by triangulating evidence across complementary formats.
arXiv Detail & Related papers (2025-10-03T12:05:34Z)
Optimizing Peer Grading: A Systematic Literature Review of Reviewer Assignment Strategies and Quantity of Reviewers [0.0]
This paper investigates how reviewer-assignment strategies and the number of reviews per submission impact the accuracy, fairness, and educational value of peer assessment. We identified four common reviewer-assignment strategies: random assignment, competency-based assignment, social-network-based assignment, and bidding. In terms of review count, assigning three reviews per submission emerges as the most common practice.
arXiv Detail & Related papers (2025-08-08T15:28:39Z)
J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning [54.85131761693927]
We introduce J1, a reinforcement learning framework for teaching LLM judges to think before making decisions. Our core contribution lies in converting all judgment tasks for non-verifiable and verifiable prompts into a unified format with verifiable rewards. We then use RL to train thinking-judges at scales of 8B, 32B, and 70B and show that they obtain state-of-the-art performance.
arXiv Detail & Related papers (2025-05-15T14:05:15Z)
A Benchmark for Fairness-Aware Graph Learning [58.515305543487386]
We present an extensive benchmark on ten representative fairness-aware graph learning methods. Our in-depth analysis reveals key insights into the strengths and limitations of existing methods.
arXiv Detail & Related papers (2024-07-16T18:43:43Z)
A Hierarchy-based Analysis Approach for Blended Learning: A Case Study with Chinese Students [12.533646830917213]
This paper proposes a hierarchy-based evaluation approach for blended learning evaluation. The results show that cognitive engagement and emotional engagement play a more important role in blended learning evaluation.
arXiv Detail & Related papers (2023-09-19T00:09:00Z)
Better Understanding Differences in Attribution Methods via Systematic Evaluations [57.35035463793008]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions. We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods. We use these evaluation schemes to study strengths and shortcomings of some widely used attribution methods over a wide range of models.
arXiv Detail & Related papers (2023-03-21T14:24:58Z)
Weighted Ensemble Self-Supervised Learning [67.24482854208783]
Ensembling has proven to be a powerful technique for boosting model performance. We develop a framework that permits data-dependent weighted cross-entropy losses. Our method outperforms both in multiple evaluation metrics on ImageNet-1K.
arXiv Detail & Related papers (2022-11-18T02:00:17Z)
Towards Better Understanding Attribution Methods [77.1487219861185]
Post-hoc attribution methods have been proposed to identify image regions most influential to the models' decisions. We propose three novel evaluation schemes to more reliably measure the faithfulness of those methods. We also propose a post-processing smoothing step that significantly improves the performance of some attribution methods.
arXiv Detail & Related papers (2022-05-20T20:50:17Z)
Improving Peer Assessment with Graph Convolutional Networks [2.105564340986074]
Peer assessment might not be as accurate as expert evaluations, thus rendering these systems unreliable. We first model peer assessment as multi-relational weighted networks that can express a variety of peer assessment setups. We introduce a graph convolutional network which can learn assessment patterns and user behaviors to more accurately predict expert evaluations.
arXiv Detail & Related papers (2021-11-04T03:43:09Z)
Dual Policy Distillation [58.43610940026261]
Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging tasks of deep reinforcement learning. In this work, we introduce dual policy distillation (DPD), a student-student framework in which two learners operate on the same environment to explore different perspectives of the environment. The key challenge in developing this dual learning framework is to identify the beneficial knowledge from the peer learner for contemporary learning-based reinforcement learning algorithms.
arXiv Detail & Related papers (2020-06-07T06:49:47Z)
Computing With Words for Student Strategy Evaluation in an Examination [11.468266186093828]
This paper reports a novel Per-C based approach for student strategy evaluation. It generates a numeric score for the overall evaluation of the strategy adopted by a student in the examination. A linguistic evaluation describing the student strategy is also obtained from the system.
arXiv Detail & Related papers (2020-05-02T15:57:54Z)
Improving Scholarly Knowledge Representation: Evaluating BERT-based Models for Scientific Relation Classification [5.8962650619804755]
We show that a domain-specific pre-training corpus helps the BERT-based classification model identify the type of scientific relations. Although the strategy of predicting a single relation each time achieves higher classification accuracy, the strategy of identifying multiple relations simultaneously demonstrates more consistent performance in corpora with either a large or small number of annotations.
arXiv Detail & Related papers (2020-04-13T18:46:55Z)
