Predicting student performance using data from an auto-grading system
- URL: http://arxiv.org/abs/2102.01270v1
- Date: Tue, 2 Feb 2021 03:02:39 GMT
- Title: Predicting student performance using data from an auto-grading system
- Authors: Huanyi Chen, Paul A.S. Ward
- Abstract summary: We build decision-tree and linear-regression models with various features extracted from the Marmoset auto-grading system.
We show that the linear-regression model using submission time intervals performs best among all models in terms of Precision and F-Measure.
We also show that students misclassified as poor-performing by the linear-regression model have lower actual grades than those misclassified by any other model.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As online auto-grading systems appear, information obtained from those
systems can potentially enable researchers to create predictive models to
predict student behaviour and performance. At the University of Waterloo, the
ECE 150 (Fundamentals of Programming) Instructional Team wants insight into how
to better allocate its limited teaching resources to achieve improved
educational outcomes. Currently, the Instructional Team allocates tutoring time
on a reactive, "as-requested" basis. This approach serves those students with
the wherewithal to request help; however, many of the students who are
struggling do not reach out for assistance. Therefore, we, as the Research
Team, want to explore whether we can identify students who need help by
examining the data from our auto-grading system, Marmoset.
In this paper, we conducted experiments building decision-tree and
linear-regression models with various features extracted from the Marmoset
auto-grading system, including passing rate, testcase outcomes, number of
submissions and submission time intervals (the time interval between the
student's first reasonable submission and the deadline). For each feature, we
interpreted the results at the confusion-matrix level. For poor-performance
students specifically, we show that the linear-regression model using
submission time intervals performs best among all models in terms of Precision
and F-Measure. We also show that students misclassified as poor-performing by
the linear-regression model have lower actual grades than those misclassified
by any other model. In addition, we show that, for the midterm, the submission
time interval of the last assignment before the midterm is the strongest
predictor of midterm performance. For the final exam, however, midterm
performance contributes the most to final exam performance.
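The pipeline the abstract describes — extract a submission-time-interval feature per student, flag at-risk students, and score the prediction at the confusion-matrix level with Precision and F-Measure — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the student data, the 12-hour threshold rule (standing in for the trained model), and the grade cutoff for "poor performance" are all hypothetical.

```python
from datetime import datetime

# Hypothetical data: each student's first reasonable submission time and
# final grade. None of these values come from the paper.
deadline = datetime(2021, 2, 1, 23, 59)
students = {
    "s1": (datetime(2021, 1, 25, 10, 0), 85),
    "s2": (datetime(2021, 2, 1, 22, 0), 40),
    "s3": (datetime(2021, 1, 30, 9, 0), 72),
    "s4": (datetime(2021, 2, 1, 20, 30), 35),
}

def interval_hours(first_submission):
    """Submission time interval: hours between the student's first
    reasonable submission and the deadline (the paper's feature)."""
    return (deadline - first_submission).total_seconds() / 3600.0

# Stand-in for the trained model: flag students whose first reasonable
# submission lands within 12 hours of the deadline as at risk.
predicted = {sid: interval_hours(t) < 12 for sid, (t, _) in students.items()}
# Ground truth: "poor performance" here means a grade below 50 (assumed cutoff).
actual = {sid: grade < 50 for sid, (_, grade) in students.items()}

# Confusion-matrix cells for the positive (poor-performance) class.
tp = sum(predicted[s] and actual[s] for s in students)
fp = sum(predicted[s] and not actual[s] for s in students)
fn = sum(not predicted[s] and actual[s] for s in students)

precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0
f_measure = (2 * precision * recall / (precision + recall)
             if (precision + recall) else 0.0)
print(f"P={precision:.2f} R={recall:.2f} F={f_measure:.2f}")
```

On this toy data the two late submitters are exactly the two low-grade students, so all three metrics come out to 1.0; on real course data the interesting question, per the paper, is how these metrics compare across feature choices and model families.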
Related papers
- Beyond human subjectivity and error: a novel AI grading system [67.410870290301]
The grading of open-ended questions is a high-effort, high-impact task in education.
Recent breakthroughs in AI technology might facilitate such automation, but this has not been demonstrated at scale.
We introduce a novel automatic short answer grading (ASAG) system.
arXiv Detail & Related papers (2024-05-07T13:49:59Z) - Zero-shot Retrieval: Augmenting Pre-trained Models with Search Engines [83.65380507372483]
Large pre-trained models can dramatically reduce the amount of task-specific data required to solve a problem, but they often fail to capture domain-specific nuances out of the box.
This paper shows how to leverage recent advances in NLP and multi-modal learning to augment a pre-trained model with search engine retrieval.
arXiv Detail & Related papers (2023-11-29T05:33:28Z) - Multi-granulariy Time-based Transformer for Knowledge Tracing [9.788039182463768]
We leverage students' historical data, including their past test scores, to create a personalized model for each student.
We then use these models to predict their future performance on a given test.
arXiv Detail & Related papers (2023-04-11T14:46:38Z) - EmbedDistill: A Geometric Knowledge Distillation for Information
Retrieval [83.79667141681418]
Large neural models (such as Transformers) achieve state-of-the-art performance for information retrieval (IR).
We propose a novel distillation approach that leverages the relative geometry among queries and documents learned by the large teacher model.
We show that our approach successfully distills from both dual-encoder (DE) and cross-encoder (CE) teacher models to 1/10th size asymmetric students that can retain 95-97% of the teacher performance.
arXiv Detail & Related papers (2023-01-27T22:04:37Z) - Student-centric Model of Learning Management System Activity and
Academic Performance: from Correlation to Causation [2.169383034643496]
In recent years, there has been considerable interest in modeling students' digital traces in Learning Management Systems (LMS) to understand students' learning behavior patterns.
This paper explores a student-centric analytical framework for LMS activity data that can provide not only correlational but causal insights mined from observational data.
We envision that those insights will provide convincing evidence for college student support groups to launch student-centered and targeted interventions.
arXiv Detail & Related papers (2022-10-27T14:08:25Z) - Automatically Assessing Students Performance with Smartphone Data [0.7069200904392647]
We present a dataset collected using a smartphone application (ISABELA)
We present several tests with different machine learning models, in order to classify students' performance.
It is shown that the created models can predict student performance even with data collected from different contexts.
arXiv Detail & Related papers (2022-07-06T10:05:23Z) - A Predictive Model for Student Performance in Classrooms Using Student
Interactions With an eTextbook [0.0]
This paper proposes a new model for predicting student performance based on an analysis of how students interact with an interactive online eTextbook.
To build the proposed model, we evaluated the most popular classification and regression algorithms on data from a data structures and algorithms course.
arXiv Detail & Related papers (2022-02-16T11:59:53Z) - Early Performance Prediction using Interpretable Patterns in Programming
Process Data [13.413990352918098]
We leverage rich, fine-grained log data to build a model to predict student course outcomes.
We evaluate our approach on a dataset from 106 students in a block-based, introductory programming course.
arXiv Detail & Related papers (2021-02-10T22:46:45Z) - Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual
Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z) - What do we expect from Multiple-choice QA Systems? [70.86513724662302]
We consider a top performing model on several Multiple Choice Question Answering (MCQA) datasets.
We evaluate it against a set of expectations one might have from such a model, using a series of zero-information perturbations of the model's inputs.
arXiv Detail & Related papers (2020-11-20T21:27:10Z) - Evaluation Toolkit For Robustness Testing Of Automatic Essay Scoring
Systems [64.4896118325552]
We evaluate the current state-of-the-art AES models using a model adversarial evaluation scheme and associated metrics.
We find that AES models are highly overstable. Even heavy modifications (as much as 25%) with content unrelated to the topic of the questions do not decrease the score produced by the models.
arXiv Detail & Related papers (2020-07-14T03:49:43Z)