Early Performance Prediction using Interpretable Patterns in Programming Process Data
- URL: http://arxiv.org/abs/2102.05765v1
- Date: Wed, 10 Feb 2021 22:46:45 GMT
- Title: Early Performance Prediction using Interpretable Patterns in Programming Process Data
- Authors: Ge Gao, Samiha Marwan and Thomas W. Price
- Abstract summary: We leverage rich, fine-grained log data to build a model to predict student course outcomes.
We evaluate our approach on a dataset from 106 students in a block-based, introductory programming course.
- Score: 13.413990352918098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instructors have limited time and resources to help struggling students, and
these resources should be directed to the students who most need them. To
address this, researchers have constructed models that can predict students'
final course performance early in a semester. However, many predictive models
are limited to static and generic student features (e.g. demographics, GPA),
rather than computing-specific evidence that assesses a student's progress in
class. Many programming environments now capture complete time-stamped records
of students' actions during programming. In this work, we leverage this rich,
fine-grained log data to build a model to predict student course outcomes. From
the log data, we extract patterns of behaviors that are predictive of students'
success using an approach called differential sequence mining. We evaluate our
approach on a dataset from 106 students in a block-based, introductory
programming course. The patterns extracted from our approach can predict final
programming performance with 79% accuracy using only the first programming
assignment, outperforming two baseline methods. In addition, we show that the
patterns are interpretable and correspond to concrete, effective -- and
ineffective -- novice programming behaviors. We also discuss these patterns and
their implications for classroom instruction.
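The mining step described above can be pictured with a toy sketch of differential sequence mining: extract candidate action patterns from the logs of higher- and lower-performing students and keep those whose prevalence differs most between the two groups. The action tokens, the restriction to contiguous n-grams, and the simple support-gap threshold below are illustrative assumptions, not the authors' implementation, which compares pattern frequencies more carefully (e.g. with statistical tests).
```python
# Toy sketch of differential sequence mining over programming logs, assuming
# each student's log is a list of action tokens and restricting patterns to
# contiguous n-grams for brevity.
from itertools import chain

def ngrams(actions, n):
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]

def support(logs, pattern):
    """Fraction of students whose log contains the pattern at least once."""
    return sum(pattern in ngrams(log, len(pattern)) for log in logs) / len(logs)

def differential_patterns(high_logs, low_logs, max_n=3, min_gap=0.3):
    """Patterns whose support differs between the groups by at least min_gap."""
    candidates = set(chain.from_iterable(
        ngrams(log, n) for log in high_logs + low_logs
        for n in range(2, max_n + 1)))
    gap = lambda p: abs(support(high_logs, p) - support(low_logs, p))
    return sorted((p for p in candidates if gap(p) >= min_gap),
                  key=gap, reverse=True)

# Toy logs: 'E' = edit block, 'R' = run program, 'T' = run tests, 'D' = delete
high = [["E", "R", "T", "E", "R", "T"], ["E", "E", "R", "T", "R", "T"]]
low = [["E", "E", "E", "E", "R"], ["E", "D", "E", "D", "E", "R"]]
print(differential_patterns(high, low))   # e.g. ('R', 'T') appears in the high group
```
The resulting pattern counts per student can then serve as interpretable features for a downstream outcome classifier.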
Related papers
- Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
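For contrast with the in-run approach, the retraining-based formulation the summary calls computationally intensive can be approximated with truncated Monte Carlo sampling over data permutations. The sketch below is illustrative only; `train_and_score` is a hypothetical user-supplied callback and the toy utility is synthetic.
```python
# Illustrative Monte Carlo approximation of classic, retraining-based Data
# Shapley values; `train_and_score` is a hypothetical callback that trains a
# model on a subset of point indices and returns a validation score.
import random

def monte_carlo_data_shapley(n_points, train_and_score, n_perms=50):
    values = [0.0] * n_points
    for _ in range(n_perms):
        perm = random.sample(range(n_points), n_points)   # random permutation
        prev_score = train_and_score([])                  # baseline: empty set
        for k, idx in enumerate(perm, start=1):
            score = train_and_score(perm[:k])             # retrain on the prefix
            values[idx] += (score - prev_score) / n_perms # marginal contribution
            prev_score = score
    return values

# Toy usage: "training" just sums per-point utilities, so the estimate recovers them.
utilities = [0.1, 0.4, 0.2]
shapley = monte_carlo_data_shapley(
    len(utilities), lambda subset: sum(utilities[i] for i in subset))
print([round(v, 2) for v in shapley])   # approx [0.1, 0.4, 0.2]
```
Each permutation requires retraining once per data point, which is exactly the cost that In-Run Data Shapley is designed to avoid.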
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Manipulating Predictions over Discrete Inputs in Machine Teaching [43.914943603238996]
This paper focuses on machine teaching in the discrete domain, specifically on manipulating student models' predictions based on the goals of teachers via changing the training data efficiently.
We formulate this task as an optimization problem and solve it by proposing an iterative search algorithm.
Our algorithm demonstrates significant numerical merit in scenarios where a teacher attempts to correct erroneous predictions to improve the student model, or to maliciously manipulate the model into misclassifying specific samples as a target class for personal profit.
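A generic greedy sketch of what such an iterative search over discrete training-data edits can look like follows; the student model (logistic regression), the edit space (single label flips), and the budget are illustrative assumptions, not the paper's formulation.
```python
# Greedy iterative search: edit training labels so that the re-trained student
# model predicts a desired label on a target input.
import numpy as np
from sklearn.linear_model import LogisticRegression

def iterative_teaching_search(X, y, x_target, desired_label, budget=3):
    y = y.copy()
    for _ in range(budget):
        student = LogisticRegression().fit(X, y)
        if student.predict([x_target])[0] == desired_label:
            return y                                  # goal reached
        best_flip, best_prob = None, -1.0
        for i in np.where(y != desired_label)[0]:     # candidate label flips
            y_try = y.copy()
            y_try[i] = desired_label
            prob = (LogisticRegression().fit(X, y_try)
                    .predict_proba([x_target])[0][desired_label])
            if prob > best_prob:                      # keep the most helpful flip
                best_flip, best_prob = i, prob
        y[best_flip] = desired_label
    return y

# Toy data: two Gaussian clusters; nudge a borderline point toward class 1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(2.0, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
y_edited = iterative_teaching_search(X, y, np.array([0.8, 0.8]), desired_label=1)
print(int((y_edited != y).sum()), "training labels changed")
```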
arXiv Detail & Related papers (2024-01-31T14:23:51Z)
- A Predictive Model using Machine Learning Algorithm in Identifying Students Probability on Passing Semestral Course [0.0]
This study employs classification as the data mining technique and a decision tree as the algorithm.
Using the newly discovered predictive model, predicting students' probability of passing their current courses yields 0.7619 accuracy, 0.8333 precision, 0.8823 recall, and an F1 score of 0.8571.
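A minimal sketch of this kind of decision-tree pass/fail classifier, evaluated with the same four metrics; the features and labels below are synthetic placeholders, not the study's dataset.
```python
# Decision-tree pass/fail prediction with standard classification metrics.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

rng = np.random.default_rng(42)
X = rng.random((200, 4))                     # e.g. prior GPA, attendance, quiz average, ...
y = (X[:, 0] + X[:, 2] > 1.0).astype(int)    # 1 = passes the course (synthetic label)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)

print("accuracy :", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred))
print("recall   :", recall_score(y_test, pred))
print("f1       :", f1_score(y_test, pred))
```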
arXiv Detail & Related papers (2023-04-12T01:57:08Z)
- Personalized Student Attribute Inference [0.0]
This work aims to create a system that automatically detects students in difficulty, for instance by predicting whether they are likely to fail a course.
We compare a naive approach widely used in the literature, which relies on attributes readily available in the dataset (such as grades), with a personalized approach we call Personalized Student Attribute Inference (IPSA).
arXiv Detail & Related papers (2022-12-26T23:00:28Z)
- Process-BERT: A Framework for Representation Learning on Educational Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 nation's report card data mining competition dataset.
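A small sketch of the BERT-style masked-prediction objective applied to clickstream-like process data: a fraction of action tokens is hidden and must be recovered by the model. The action vocabulary and masking rate are illustrative assumptions; the transformer model and the competition dataset are omitted.
```python
# Prepare a masked-action-modeling example from a process-data session.
import random

MASK = "[MASK]"

def mask_actions(actions, mask_prob=0.15, seed=0):
    """Return (masked_sequence, labels); labels are None where nothing is masked."""
    rng = random.Random(seed)
    masked, labels = [], []
    for a in actions:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(a)        # the model must predict the original action
        else:
            masked.append(a)
            labels.append(None)
    return masked, labels

session = ["open_item", "type_answer", "erase", "type_answer", "submit", "next_item"]
print(mask_actions(session))
```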
arXiv Detail & Related papers (2022-04-28T16:07:28Z)
- Efficient Sub-structured Knowledge Distillation [52.5931565465661]
We propose an approach that is much simpler in its formulation and far more efficient for training than existing approaches.
We transfer the knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, instead of the whole output space.
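A hedged sketch of the local-matching idea for a sequence-labeling teacher and student: distill per-position label distributions instead of the distribution over whole output sequences. The toy logits below stand in for real model outputs.
```python
# Sub-structured distillation loss: match teacher and student locally at each
# sequence position rather than over the exponentially large output space.
import torch
import torch.nn.functional as F

def substructure_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """student_logits, teacher_logits: (batch, seq_len, num_labels)."""
    t = temperature
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    teacher_prob = F.softmax(teacher_logits / t, dim=-1)
    # KL summed over positions and labels, averaged over the batch.
    return F.kl_div(student_logp, teacher_prob, reduction="batchmean") * (t * t)

student_logits = torch.randn(4, 10, 7, requires_grad=True)   # e.g. 7 NER tags
teacher_logits = torch.randn(4, 10, 7)
loss = substructure_kd_loss(student_logits, teacher_logits)
loss.backward()
print(float(loss))
```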
arXiv Detail & Related papers (2022-03-09T15:56:49Z)
- Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design.
We show that these predictions have desired properties, admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
- Train No Evil: Selective Masking for Task-Guided Pre-Training [97.03615486457065]
We propose a three-stage framework by adding a task-guided pre-training stage with selective masking between general pre-training and fine-tuning.
We show that our method can achieve comparable or even better performance with less than 50% of cost.
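One way to read "selective masking" is to mask the tokens that matter most for the downstream task rather than masking uniformly at random. The sketch below uses a hypothetical `task_score` callback as the importance signal and is not the paper's exact procedure.
```python
# Selective masking sketch: mask the tokens whose removal most reduces a task
# scorer's confidence. `task_score` is a hypothetical, user-supplied callback.
def selective_mask(tokens, task_score, mask_ratio=0.2, mask_token="[MASK]"):
    base = task_score(tokens)
    importance = []
    for i in range(len(tokens)):
        probe = tokens[:i] + [mask_token] + tokens[i + 1:]
        importance.append((base - task_score(probe), i))   # confidence drop
    k = max(1, int(len(tokens) * mask_ratio))
    to_mask = {i for _, i in sorted(importance, reverse=True)[:k]}
    return [mask_token if i in to_mask else tok for i, tok in enumerate(tokens)]

# Toy scorer: pretends the sentiment words carry all the task signal.
signal = {"great": 0.5, "terrible": 0.5}
score = lambda toks: sum(signal.get(t, 0.01) for t in toks)
print(selective_mask(["the", "movie", "was", "great", "and", "terrible"], score))
```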
arXiv Detail & Related papers (2020-04-21T03:14:22Z)
- Context-aware Non-linear and Neural Attentive Knowledge-based Models for Grade Prediction [12.592903558338444]
Grade prediction for future courses not yet taken by students is important as it can help them and their advisers during the process of course selection.
One of the successful approaches for accurately predicting a student's grades in future courses is Cumulative Knowledge-based Regression Models (CKRM).
CKRM learns shallow linear models that predict a student's grades as the similarity between his/her knowledge state and the target course.
We propose context-aware non-linear and neural attentive models that can potentially better estimate a student's knowledge state from his/her prior course information.
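A toy sketch of the CKRM-style prediction described above: accumulate a knowledge-state vector from prior courses and score the target course by its similarity to that state. The course embeddings, grade scaling, and cosine similarity are illustrative choices, not the learned parameters from the paper.
```python
# Predict a grade as the similarity between a knowledge state and a course.
import numpy as np

course_vectors = {              # hypothetical "knowledge component" embeddings
    "intro_programming": np.array([1.0, 0.2, 0.0]),
    "data_structures":   np.array([0.8, 0.9, 0.1]),
    "algorithms":        np.array([0.6, 1.0, 0.3]),
}

def knowledge_state(prior_courses):
    """prior_courses: list of (course_name, grade on a 0-4 scale)."""
    return sum((grade / 4.0) * course_vectors[c] for c, grade in prior_courses)

def predict_grade(prior_courses, target_course):
    state = knowledge_state(prior_courses)
    target = course_vectors[target_course]
    similarity = state @ target / (np.linalg.norm(state) * np.linalg.norm(target))
    return 4.0 * similarity     # map cosine similarity back onto a grade scale

print(predict_grade([("intro_programming", 3.7), ("data_structures", 3.0)],
                    "algorithms"))
```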
arXiv Detail & Related papers (2020-03-09T20:20:48Z)
- Academic Performance Estimation with Attention-based Graph Convolutional Networks [17.985752744098267]
Given a student's past data, the task of student performance prediction is to predict the student's grades in future courses.
Traditional methods for student performance prediction usually neglect the underlying relationships between multiple courses.
We propose a novel attention-based graph convolutional network model for student performance prediction.
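A sketch of the plain graph-convolution building block such a model rests on, with the attention mechanism omitted; the course graph, node features, and weights below are toy values rather than anything from the paper.
```python
# One graph-convolution layer over a course graph: ReLU(D^-1/2 (A+I) D^-1/2 X W).
import numpy as np

def gcn_layer(A, X, W):
    A_tilde = A + np.eye(A.shape[0])            # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # symmetric normalization
    return np.maximum(A_hat @ X @ W, 0.0)       # ReLU

# 4 courses, edges = prerequisite links; 3 input features per course node.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.random((4, 3))                          # e.g. topic/difficulty features
W = rng.random((3, 2))                          # learned weights in a real model
print(gcn_layer(A, X, W))
```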
arXiv Detail & Related papers (2019-12-26T23:11:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.