Predicting student performance using data from an auto-grading system
- URL: http://arxiv.org/abs/2102.01270v1
- Date: Tue, 2 Feb 2021 03:02:39 GMT
- Title: Predicting student performance using data from an auto-grading system
- Authors: Huanyi Chen, Paul A.S. Ward
- Abstract summary: We build decision-tree and linear-regression models with various features extracted from the Marmoset auto-grading system.
We show that the linear-regression model using submission time intervals performs best among all models in terms of Precision and F-Measure.
We also show that students misclassified as poor-performing by the linear-regression model have lower actual grades than those misclassified by any other model.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As online auto-grading systems appear, information obtained from those
systems can potentially enable researchers to create predictive models to
predict student behaviour and performance. At the University of Waterloo, the
ECE 150 (Fundamentals of Programming) Instructional Team wants insight into how
to better allocate its limited teaching resources to achieve improved
educational outcomes. Currently, the Instructional Team allocates tutoring time
on a reactive, "as-requested" basis. This approach serves those students with
the wherewithal to request help; however, many of the students who are
struggling do not reach out for assistance. Therefore, we, as the Research
Team, want to explore whether we can identify students who need help by
examining the data from our auto-grading system, Marmoset.
In this paper, we conducted experiments building decision-tree and
linear-regression models with various features extracted from the Marmoset
auto-grading system, including passing rate, testcase outcomes, number of
submissions and submission time intervals (the time interval between the
student's first reasonable submission and the deadline). For each feature, we
interpreted the results at the confusion-matrix level. For poor-performance
students specifically, we show that the linear-regression model using
submission time intervals performs best among all models in terms of Precision
and F-Measure. We also show that students misclassified as poor-performing by
the linear-regression model have lower actual grades than those misclassified
by any other model. In addition, we show that, for the midterm, the submission
time interval of the last assignment before the midterm is the strongest
predictor of midterm performance. For the final exam, however, midterm
performance contributes the most to final exam performance.
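The pipeline the abstract describes — extract a submission-time-interval feature per student, flag at-risk students, and score the prediction at the confusion-matrix level with Precision and F-Measure — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the student data, the 12-hour threshold rule (standing in for the trained model), and the grade cutoff for "poor performance" are all hypothetical.

```python
from datetime import datetime

# Hypothetical data: each student's first reasonable submission time and
# final grade. None of these values come from the paper.
deadline = datetime(2021, 2, 1, 23, 59)
students = {
    "s1": (datetime(2021, 1, 25, 10, 0), 85),
    "s2": (datetime(2021, 2, 1, 22, 0), 40),
    "s3": (datetime(2021, 1, 30, 9, 0), 72),
    "s4": (datetime(2021, 2, 1, 20, 30), 35),
}

def interval_hours(first_submission):
    """Submission time interval: hours between the student's first
    reasonable submission and the deadline (the paper's feature)."""
    return (deadline - first_submission).total_seconds() / 3600.0

# Stand-in for the trained model: flag students whose first reasonable
# submission lands within 12 hours of the deadline as at risk.
predicted = {sid: interval_hours(t) < 12 for sid, (t, _) in students.items()}
# Ground truth: "poor performance" here means a grade below 50 (assumed cutoff).
actual = {sid: grade < 50 for sid, (_, grade) in students.items()}

# Confusion-matrix cells for the positive (poor-performance) class.
tp = sum(predicted[s] and actual[s] for s in students)
fp = sum(predicted[s] and not actual[s] for s in students)
fn = sum(not predicted[s] and actual[s] for s in students)

precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f_measure = (2 * precision * recall / (precision + recall)
             if (precision + recall) else 0.0)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

On this toy data the two late submitters are exactly the two low-grade students, so all three metrics come out to 1.0; on real course data the interesting question, per the paper, is how these metrics compare across feature choices and model families.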
Related papers
- Beyond human subjectivity and error: a novel AI grading system [67.410870290301]
The grading of open-ended questions is a high-effort, high-impact task in education.
Recent breakthroughs in AI technology might facilitate such automation, but this has not been demonstrated at scale.
We introduce a novel automatic short answer grading (ASAG) system.
arXiv Detail & Related papers (2024-05-07T13:49:59Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Multi-granulariy Time-based Transformer for Knowledge Tracing [9.788039182463768]
We leverage students' historical data, including their past test scores, to create a personalized model for each student.
We then use these models to predict their future performance on a given test.
arXiv Detail & Related papers (2023-04-11T14:46:38Z) - EmbedDistill: A Geometric Knowledge Distillation for Information
Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
arXiv Detail & Related papers (2023-01-27T22:04:37Z) - Student-centric Model of Learning Management System Activity and
Academic Performance: from Correlation to Causation [2.169383034643496]
In recent years, there has been considerable interest in modeling students' digital traces in Learning Management Systems (LMS) to understand students' learning behavior patterns.
This paper explores a student-centric analytical framework for LMS activity data that can provide not only correlational but causal insights mined from observational data.
We envision that those insights will provide convincing evidence for college student support groups to launch student-centered and targeted interventions.
arXiv Detail & Related papers (2022-10-27T14:08:25Z) - Automatically Assessing Students Performance with Smartphone Data [0.7069200904392647]
We present a dataset collected using a smartphone application (ISABELA)
We present several tests with different machine learning models, in order to classify students' performance.
It is shown that the created models can predict student performance even with data collected from different contexts.
arXiv Detail & Related papers (2022-07-06T10:05:23Z) - A Predictive Model for Student Performance in Classrooms Using Student
Interactions With an eTextbook [0.0]
This paper proposes a new model for predicting student performance based on an analysis of how students interact with an interactive online eTextbook.
To build the proposed model, we evaluated the most popular classification and regression algorithms on data from a data structures and algorithms course.
arXiv Detail & Related papers (2022-02-16T11:59:53Z) - Early Performance Prediction using Interpretable Patterns in Programming
Process Data [13.413990352918098]
We leverage rich, fine-grained log data to build a model to predict student course outcomes.
We evaluate our approach on a dataset from 106 students in a block-based, introductory programming course.
arXiv Detail & Related papers (2021-02-10T22:46:45Z) - Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual
Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z) - What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top performing model on several Multiple Choice Question Answering (MCQA) datasets.
We evaluate it against a set of expectations one might have from such a model, using a series of zero-information perturbations of the model's inputs.
arXiv Detail & Related papers (2020-11-20T21:27:10Z) - Evaluation Toolkit For Robustness Testing Of Automatic Essay Scoring
Systems [64.4896118325552]
We evaluate the current state-of-the-art AES models using a model adversarial evaluation scheme and associated metrics.
We find that AES models are highly overstable. Even heavy modifications (as much as 25%) with content unrelated to the topic of the questions do not decrease the score produced by the models.
arXiv Detail & Related papers (2020-07-14T03:49:43Z)