AI-Driven Grading and Moderation for Collaborative Projects in Computer Science Education
- URL: http://arxiv.org/abs/2510.03998v1
- Date: Sun, 05 Oct 2025 02:16:52 GMT
- Title: AI-Driven Grading and Moderation for Collaborative Projects in Computer Science Education
- Authors: Songmei Yu, Andrew Zagula
- Abstract summary: This paper introduces a semi-automated, AI-assisted grading system that evaluates both project quality and individual effort using repository mining, communication analytics, and machine learning models. A pilot deployment in a senior-level course demonstrated high alignment with instructor assessments, increased student satisfaction, and reduced instructor grading effort.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collaborative group projects are integral to computer science education, as they foster teamwork, problem-solving skills, and industry-relevant competencies. However, assessing individual contributions within group settings has long been a challenge. Traditional assessment strategies, such as the equal distribution of grades or subjective peer assessments, often fall short in terms of fairness, objectivity, and scalability, particularly in large classrooms. This paper introduces a semi-automated, AI-assisted grading system that evaluates both project quality and individual effort using repository mining, communication analytics, and machine learning models. The system comprises modules for project evaluation, contribution analysis, and grade computation, integrating seamlessly with platforms like GitHub. A pilot deployment in a senior-level course demonstrated high alignment with instructor assessments, increased student satisfaction, and reduced instructor grading effort. We conclude by discussing implementation considerations, ethical implications, and proposed enhancements to broaden applicability.
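To make the repository-mining step concrete, here is a minimal sketch of per-author contribution statistics extracted from a local Git clone. It is not the authors' implementation: the function name, the choice of metrics, and the use of GitPython are all assumptions for illustration.

```python
# Illustrative sketch only: the paper's contribution-analysis module is not
# public, so the metrics and the GitPython dependency are assumptions.
from collections import defaultdict

import git  # GitPython: pip install GitPython


def contribution_stats(repo_path: str) -> dict:
    """Aggregate per-author commit counts and line churn from a local clone."""
    repo = git.Repo(repo_path)
    stats = defaultdict(lambda: {"commits": 0, "insertions": 0, "deletions": 0})
    for commit in repo.iter_commits():
        author = commit.author.name
        stats[author]["commits"] += 1
        totals = commit.stats.total  # dict with 'insertions', 'deletions', ...
        stats[author]["insertions"] += totals["insertions"]
        stats[author]["deletions"] += totals["deletions"]
    return dict(stats)


if __name__ == "__main__":
    for author, counts in contribution_stats(".").items():
        print(f"{author}: {counts}")
```

In the system described above, raw counts like these would be only one signal, combined with communication analytics and machine learning models before grades are computed.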
Related papers
- Expert Preference-based Evaluation of Automated Related Work Generation [54.29459509574242]
We propose GREP, a multi-turn evaluation framework that integrates classical related-work evaluation criteria with expert-specific preferences. For better accessibility, we design two variants of GREP: a more precise variant with proprietary LLMs as evaluators, and a cheaper alternative with open-weight LLMs.
arXiv Detail & Related papers (2025-08-11T13:08:07Z) - Beyond Brainstorming: What Drives High-Quality Scientific Ideas? Lessons from Multi-Agent Collaboration [59.41889496960302]
This paper investigates whether structured multi-agent discussions can surpass solitary ideation. We propose a cooperative multi-agent framework for generating research proposals. We employ a comprehensive protocol with agent-based scoring and human review across dimensions such as novelty, strategic vision, and integration depth.
arXiv Detail & Related papers (2025-08-06T15:59:18Z) - Teaching at Scale: Leveraging AI to Evaluate and Elevate Engineering Education [3.557803321422781]
This article presents a scalable, AI-supported framework for qualitative student feedback using large language models. The system employs hierarchical summarization, anonymization, and exception handling to extract actionable themes from open-ended comments. We report on its successful deployment across a large college of engineering.
arXiv Detail & Related papers (2025-08-01T20:27:40Z) - Lessons from a Big-Bang Integration: Challenges in Edge Computing and Machine Learning [52.86213078016168]
The project faced critical setbacks due to a big-bang integration approach. The study identifies technical and organisational barriers, including poor communication. It also considers psychological factors such as a bias toward fully developed components over mockups.
arXiv Detail & Related papers (2025-07-23T07:16:45Z) - AGACCI : Affiliated Grading Agents for Criteria-Centric Interface in Educational Coding Contexts [0.6050976240234864]
We introduce AGACCI, a multi-agent system that distributes specialized evaluation roles across collaborative agents. AGACCI outperforms a single GPT-based baseline in terms of rubric and feedback accuracy, relevance, consistency, and coherence.
arXiv Detail & Related papers (2025-07-07T15:50:46Z) - Rethinking Machine Unlearning in Image Generation Models [59.697750585491264]
We introduce CatIGMU, a novel hierarchical task categorization framework, and EvalIGMU, a comprehensive evaluation framework; we also construct DataIGM, a high-quality unlearning dataset.
arXiv Detail & Related papers (2025-06-03T11:25:14Z) - Closing the Evaluation Gap: Developing a Behavior-Oriented Framework for Assessing Virtual Teamwork Competency [6.169364905804677]
This study develops a behavior-oriented framework for assessing virtual teamwork competencies among engineering students. Using focus group interviews combined with the Critical Incident Technique, the study identified three key dimensions. The resulting framework provides a foundation for more effective assessment practices.
arXiv Detail & Related papers (2025-04-20T08:12:27Z) - Level Up Peer Review in Education: Investigating genAI-driven Gamification system and its influence on Peer Feedback Effectiveness [0.8087870525861938]
This paper introduces Socratique, a gamified peer-assessment platform integrated with Generative AI (GenAI) assistance. By incorporating game elements, Socratique aims to motivate students to provide more feedback. Students in the treatment group provided significantly more voluntary feedback, with higher scores on clarity, relevance, and specificity.
arXiv Detail & Related papers (2025-04-03T18:30:25Z) - Code Collaborate: Dissecting Team Dynamics in First-Semester Programming Students [3.0294711465150006]
The study highlights the collaboration trends that emerge as first-semester students develop a 2D game project.
Results indicate that students often slightly overestimate their contributions, with more engaged individuals being more likely to acknowledge mistakes.
Team performance shows no significant variation based on nationality or gender composition, though teams that disbanded often consisted of 'lone wolves'.
arXiv Detail & Related papers (2024-10-28T11:42:05Z) - Towards a Success Model for Automated Programming Assessment Systems Used as a Formative Assessment Tool [42.03652286907358]
The assessment of source code in university education is a central and important task for lecturers of programming courses.
The use of automated programming assessment systems (APASs) is a promising solution.
Measuring the effectiveness and success of APASs is crucial to understanding how such platforms should be designed, implemented, and used.
arXiv Detail & Related papers (2023-06-08T06:19:15Z) - Investigating Fairness Disparities in Peer Review: A Language Model
Enhanced Approach [77.61131357420201]
We conduct a thorough and rigorous study of fairness disparities in peer review with the help of large language models (LMs).
We collect, assemble, and maintain a comprehensive relational database for the International Conference on Learning Representations (ICLR) from 2017 to date.
We postulate and study fairness disparities on multiple protective attributes of interest, including author gender, geography, and author and institutional prestige.
arXiv Detail & Related papers (2022-11-07T16:19:42Z) - Modelling Assessment Rubrics through Bayesian Networks: a Pragmatic Approach [40.06500618820166]
This paper presents an approach to deriving a learner model directly from an assessment rubric.
We illustrate how the approach can be applied to automate the human assessment of an activity developed for testing computational thinking skills.
arXiv Detail & Related papers (2022-09-07T10:09:12Z)
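As a flavour of the Bayesian-network modelling in the last entry above, here is a minimal sketch of a learner model with one latent skill and one observed rubric item. The structure, node names, and probabilities are invented for illustration and assume the pgmpy library; they are not taken from the paper.

```python
# Hypothetical two-node learner model: a latent Skill explains one observed
# rubric item. All names and numbers are illustrative, not from the paper.
from pgmpy.models import BayesianNetwork
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("Skill", "Item")])  # Skill -> rubric item
model.add_cpds(
    TabularCPD("Skill", 2, [[0.5], [0.5]]),  # uniform prior on mastery
    TabularCPD(
        "Item", 2,
        [[0.8, 0.2],   # P(Item=0 | Skill=0), P(Item=0 | Skill=1)
         [0.2, 0.8]],  # P(Item=1 | Skill=0), P(Item=1 | Skill=1)
        evidence=["Skill"], evidence_card=[2],
    ),
)
assert model.check_model()

# Observing that the rubric criterion was met raises the mastery belief.
posterior = VariableElimination(model).query(["Skill"], evidence={"Item": 1})
print(posterior)
```

Under these numbers, observing Item = 1 shifts the posterior on Skill from 0.5 to 0.8, which is the basic update such rubric-derived learner models perform.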
This list is automatically generated from the titles and abstracts of the papers on this site.