Early Performance Prediction using Interpretable Patterns in Programming Process Data
- URL: http://arxiv.org/abs/2102.05765v1
- Date: Wed, 10 Feb 2021 22:46:45 GMT
- Title: Early Performance Prediction using Interpretable Patterns in Programming Process Data
- Authors: Ge Gao, Samiha Marwan and Thomas W. Price
- Abstract summary: We leverage rich, fine-grained log data to build a model to predict student course outcomes.
We evaluate our approach on a dataset from 106 students in a block-based, introductory programming course.
- Score: 13.413990352918098
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instructors have limited time and resources to help struggling students, and
these resources should be directed to the students who most need them. To
address this, researchers have constructed models that can predict students'
final course performance early in a semester. However, many predictive models
are limited to static and generic student features (e.g. demographics, GPA),
rather than computing-specific evidence that assesses a student's progress in
class. Many programming environments now capture complete time-stamped records
of students' actions during programming. In this work, we leverage this rich,
fine-grained log data to build a model to predict student course outcomes. From
the log data, we extract patterns of behaviors that are predictive of students'
success using an approach called differential sequence mining. We evaluate our
approach on a dataset from 106 students in a block-based, introductory
programming course. The patterns extracted from our approach can predict final
programming performance with 79% accuracy using only the first programming
assignment, outperforming two baseline methods. In addition, we show that the
patterns are interpretable and correspond to concrete, effective -- and
ineffective -- novice programming behaviors. We also discuss these patterns and
their implications for classroom instruction.
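The mining step described above can be pictured with a toy sketch of differential sequence mining: extract candidate action patterns from the logs of higher- and lower-performing students and keep those whose prevalence differs most between the two groups. The action tokens, the restriction to contiguous n-grams, and the simple support-gap threshold below are illustrative assumptions, not the authors' implementation, which compares pattern frequencies more carefully (e.g. with statistical tests).
```python
# Toy sketch of differential sequence mining over programming logs, assuming
# each student's log is a list of action tokens and restricting patterns to
# contiguous n-grams for brevity.
from itertools import chain

def ngrams(actions, n):
    return [tuple(actions[i:i + n]) for i in range(len(actions) - n + 1)]

def support(logs, pattern):
    """Fraction of students whose log contains the pattern at least once."""
    return sum(pattern in ngrams(log, len(pattern)) for log in logs) / len(logs)

def differential_patterns(high_logs, low_logs, max_n=3, min_gap=0.3):
    """Patterns whose support differs between the groups by at least min_gap."""
    candidates = set(chain.from_iterable(
        ngrams(log, n) for log in high_logs + low_logs
        for n in range(2, max_n + 1)))
    gap = lambda p: abs(support(high_logs, p) - support(low_logs, p))
    return sorted((p for p in candidates if gap(p) >= min_gap),
                  key=gap, reverse=True)

# Toy logs: 'E' = edit block, 'R' = run program, 'T' = run tests, 'D' = delete
high = [["E", "R", "T", "E", "R", "T"], ["E", "E", "R", "T", "R", "T"]]
low = [["E", "E", "E", "E", "R"], ["E", "D", "E", "D", "E", "R"]]
print(differential_patterns(high, low))   # e.g. ('R', 'T') appears in the high group
```
The resulting pattern counts per student can then serve as interpretable features for a downstream outcome classifier.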
Related papers
- Data Shapley in One Training Run [88.59484417202454]
Data Shapley provides a principled framework for attributing data's contribution within machine learning contexts.
Existing approaches require re-training models on different data subsets, which is computationally intensive.
This paper introduces In-Run Data Shapley, which addresses these limitations by offering scalable data attribution for a target model of interest.
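For contrast with the in-run approach, the retraining-based formulation the summary calls computationally intensive can be approximated with truncated Monte Carlo sampling over data permutations. The sketch below is illustrative only; `train_and_score` is a hypothetical user-supplied callback and the toy utility is synthetic.
```python
# Illustrative Monte Carlo approximation of classic, retraining-based Data
# Shapley values; `train_and_score` is a hypothetical callback that trains a
# model on a subset of point indices and returns a validation score.
import random

def monte_carlo_data_shapley(n_points, train_and_score, n_perms=50):
    values = [0.0] * n_points
    for _ in range(n_perms):
        perm = random.sample(range(n_points), n_points)   # random permutation
        prev_score = train_and_score([])                  # baseline: empty set
        for k, idx in enumerate(perm, start=1):
            score = train_and_score(perm[:k])             # retrain on the prefix
            values[idx] += (score - prev_score) / n_perms # marginal contribution
            prev_score = score
    return values

# Toy usage: "training" just sums per-point utilities, so the estimate recovers them.
utilities = [0.1, 0.4, 0.2]
shapley = monte_carlo_data_shapley(
    len(utilities), lambda subset: sum(utilities[i] for i in subset))
print([round(v, 2) for v in shapley])   # approx [0.1, 0.4, 0.2]
```
Each permutation requires retraining once per data point, which is exactly the cost that In-Run Data Shapley is designed to avoid.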
arXiv Detail & Related papers (2024-06-16T17:09:24Z)
- Learning-Augmented Algorithms with Explicit Predictors [67.02156211760415]
Recent advances in algorithmic design show how to utilize predictions obtained by machine learning models from past and present data.
Prior research in this context was focused on a paradigm where the predictor is pre-trained on past data and then used as a black box.
In this work, we unpack the predictor and integrate the learning problem it gives rise to within the algorithmic challenge.
arXiv Detail & Related papers (2024-03-12T08:40:21Z)
- Manipulating Predictions over Discrete Inputs in Machine Teaching [43.914943603238996]
This paper focuses on machine teaching in the discrete domain, specifically on manipulating student models' predictions based on the goals of teachers via changing the training data efficiently.
We formulate this task as an optimization problem and solve it by proposing an iterative search algorithm.
Our algorithm demonstrates significant numerical merit in scenarios where a teacher attempts to correct erroneous predictions to improve the student model, or to maliciously manipulate the model into misclassifying specific samples as a target class for personal profit.
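A generic greedy sketch of what such an iterative search over discrete training-data edits can look like follows; the student model (logistic regression), the edit space (single label flips), and the budget are illustrative assumptions, not the paper's formulation.
```python
# Greedy iterative search: edit training labels so that the re-trained student
# model predicts a desired label on a target input.
import numpy as np
from sklearn.linear_model import LogisticRegression

def iterative_teaching_search(X, y, x_target, desired_label, budget=3):
    y = y.copy()
    for _ in range(budget):
        student = LogisticRegression().fit(X, y)
        if student.predict([x_target])[0] == desired_label:
            return y                                  # goal reached
        best_flip, best_prob = None, -1.0
        for i in np.where(y != desired_label)[0]:     # candidate label flips
            y_try = y.copy()
            y_try[i] = desired_label
            prob = (LogisticRegression().fit(X, y_try)
                    .predict_proba([x_target])[0][desired_label])
            if prob > best_prob:                      # keep the most helpful flip
                best_flip, best_prob = i, prob
        y[best_flip] = desired_label
    return y

# Toy data: two Gaussian clusters; nudge a borderline point toward class 1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(2.0, 0.5, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
y_edited = iterative_teaching_search(X, y, np.array([0.8, 0.8]), desired_label=1)
print(int((y_edited != y).sum()), "training labels changed")
```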
arXiv Detail & Related papers (2024-01-31T14:23:51Z)
- A Predictive Model using Machine Learning Algorithm in Identifying Students Probability on Passing Semestral Course [0.0]
This study employs classification as the data mining technique and a decision tree as the algorithm.
Using the newly discovered predictive model, predicting students' probability of passing their current courses yields 0.7619 accuracy, 0.8333 precision, 0.8823 recall, and an F1 score of 0.8571.
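A minimal sketch of this kind of decision-tree pass/fail classifier, evaluated with the same four metrics; the features and labels below are synthetic placeholders, not the study's dataset.
```python
# Decision-tree pass/fail prediction with standard classification metrics.
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

rng = np.random.default_rng(42)
X = rng.random((200, 4))                     # e.g. prior GPA, attendance, quiz average, ...
y = (X[:, 0] + X[:, 2] > 1.0).astype(int)    # 1 = passes the course (synthetic label)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
pred = clf.predict(X_test)

print("accuracy :", accuracy_score(y_test, pred))
print("precision:", precision_score(y_test, pred))
print("recall   :", recall_score(y_test, pred))
print("f1       :", f1_score(y_test, pred))
```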
arXiv Detail & Related papers (2023-04-12T01:57:08Z)
- Personalized Student Attribute Inference [0.0]
This work aims to create a system that automatically detects students in difficulty, for instance by predicting whether they are likely to fail a course.
We compare a naive approach widely used in the literature, which relies on attributes readily available in the dataset (such as grades), with a personalized approach we call Personalized Student Attribute Inference (IPSA).
arXiv Detail & Related papers (2022-12-26T23:00:28Z)
- Process-BERT: A Framework for Representation Learning on Educational Process Data [68.8204255655161]
We propose a framework for learning representations of educational process data.
Our framework consists of a pre-training step that uses BERT-type objectives to learn representations from sequential process data.
We apply our framework to the 2019 nation's report card data mining competition dataset.
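A small sketch of the BERT-style masked-prediction objective applied to clickstream-like process data: a fraction of action tokens is hidden and must be recovered by the model. The action vocabulary and masking rate are illustrative assumptions; the transformer model and the competition dataset are omitted.
```python
# Prepare a masked-action-modeling example from a process-data session.
import random

MASK = "[MASK]"

def mask_actions(actions, mask_prob=0.15, seed=0):
    """Return (masked_sequence, labels); labels are None where nothing is masked."""
    rng = random.Random(seed)
    masked, labels = [], []
    for a in actions:
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels.append(a)        # the model must predict the original action
        else:
            masked.append(a)
            labels.append(None)
    return masked, labels

session = ["open_item", "type_answer", "erase", "type_answer", "submit", "next_item"]
print(mask_actions(session))
```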
arXiv Detail & Related papers (2022-04-28T16:07:28Z)
- Efficient Sub-structured Knowledge Distillation [52.5931565465661]
We propose an approach that is much simpler in its formulation and far more efficient for training than existing approaches.
We transfer the knowledge from a teacher model to its student model by locally matching their predictions on all sub-structures, instead of the whole output space.
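A hedged sketch of the local-matching idea for a sequence-labeling teacher and student: distill per-position label distributions instead of the distribution over whole output sequences. The toy logits below stand in for real model outputs.
```python
# Sub-structured distillation loss: match teacher and student locally at each
# sequence position rather than over the exponentially large output space.
import torch
import torch.nn.functional as F

def substructure_kd_loss(student_logits, teacher_logits, temperature=2.0):
    """student_logits, teacher_logits: (batch, seq_len, num_labels)."""
    t = temperature
    student_logp = F.log_softmax(student_logits / t, dim=-1)
    teacher_prob = F.softmax(teacher_logits / t, dim=-1)
    # KL summed over positions and labels, averaged over the batch.
    return F.kl_div(student_logp, teacher_prob, reduction="batchmean") * (t * t)

student_logits = torch.randn(4, 10, 7, requires_grad=True)   # e.g. 7 NER tags
teacher_logits = torch.randn(4, 10, 7)
loss = substructure_kd_loss(student_logits, teacher_logits)
loss.backward()
print(float(loss))
```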
arXiv Detail & Related papers (2022-03-09T15:56:49Z)
- Non-Clairvoyant Scheduling with Predictions Revisited [77.86290991564829]
In non-clairvoyant scheduling, the task is to find an online strategy for scheduling jobs with a priori unknown processing requirements.
We revisit this well-studied problem in a recently popular learning-augmented setting that integrates (untrusted) predictions in algorithm design.
We show that these predictions have desired properties, admit a natural error measure as well as algorithms with strong performance guarantees.
arXiv Detail & Related papers (2022-02-21T13:18:11Z)
- Train No Evil: Selective Masking for Task-Guided Pre-Training [97.03615486457065]
We propose a three-stage framework by adding a task-guided pre-training stage with selective masking between general pre-training and fine-tuning.
We show that our method can achieve comparable or even better performance with less than 50% of cost.
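One way to read "selective masking" is to mask the tokens that matter most for the downstream task rather than masking uniformly at random. The sketch below uses a hypothetical `task_score` callback as the importance signal and is not the paper's exact procedure.
```python
# Selective masking sketch: mask the tokens whose removal most reduces a task
# scorer's confidence. `task_score` is a hypothetical, user-supplied callback.
def selective_mask(tokens, task_score, mask_ratio=0.2, mask_token="[MASK]"):
    base = task_score(tokens)
    importance = []
    for i in range(len(tokens)):
        probe = tokens[:i] + [mask_token] + tokens[i + 1:]
        importance.append((base - task_score(probe), i))   # confidence drop
    k = max(1, int(len(tokens) * mask_ratio))
    to_mask = {i for _, i in sorted(importance, reverse=True)[:k]}
    return [mask_token if i in to_mask else tok for i, tok in enumerate(tokens)]

# Toy scorer: pretends the sentiment words carry all the task signal.
signal = {"great": 0.5, "terrible": 0.5}
score = lambda toks: sum(signal.get(t, 0.01) for t in toks)
print(selective_mask(["the", "movie", "was", "great", "and", "terrible"], score))
```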
arXiv Detail & Related papers (2020-04-21T03:14:22Z)
- Context-aware Non-linear and Neural Attentive Knowledge-based Models for Grade Prediction [12.592903558338444]
Grade prediction for future courses not yet taken by students is important as it can help them and their advisers during the process of course selection.
One of the successful approaches for accurately predicting a student's grades in future courses is Cumulative Knowledge-based Regression Models (CKRM).
CKRM learns shallow linear models that predict a student's grades as the similarity between his/her knowledge state and the target course.
We propose context-aware non-linear and neural attentive models that can potentially better estimate a student's knowledge state from his/her prior course information.
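A toy sketch of the CKRM-style prediction described above: accumulate a knowledge-state vector from prior courses and score the target course by its similarity to that state. The course embeddings, grade scaling, and cosine similarity are illustrative choices, not the learned parameters from the paper.
```python
# Predict a grade as the similarity between a knowledge state and a course.
import numpy as np

course_vectors = {              # hypothetical "knowledge component" embeddings
    "intro_programming": np.array([1.0, 0.2, 0.0]),
    "data_structures":   np.array([0.8, 0.9, 0.1]),
    "algorithms":        np.array([0.6, 1.0, 0.3]),
}

def knowledge_state(prior_courses):
    """prior_courses: list of (course_name, grade on a 0-4 scale)."""
    return sum((grade / 4.0) * course_vectors[c] for c, grade in prior_courses)

def predict_grade(prior_courses, target_course):
    state = knowledge_state(prior_courses)
    target = course_vectors[target_course]
    similarity = state @ target / (np.linalg.norm(state) * np.linalg.norm(target))
    return 4.0 * similarity     # map cosine similarity back onto a grade scale

print(predict_grade([("intro_programming", 3.7), ("data_structures", 3.0)],
                    "algorithms"))
```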
arXiv Detail & Related papers (2020-03-09T20:20:48Z)
- Academic Performance Estimation with Attention-based Graph Convolutional Networks [17.985752744098267]
Given a student's past data, the task of student performance prediction is to predict the student's grades in future courses.
Traditional methods for student performance prediction usually neglect the underlying relationships between multiple courses.
We propose a novel attention-based graph convolutional network model for student performance prediction.
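A sketch of the plain graph-convolution building block such a model rests on, with the attention mechanism omitted; the course graph, node features, and weights below are toy values rather than anything from the paper.
```python
# One graph-convolution layer over a course graph: ReLU(D^-1/2 (A+I) D^-1/2 X W).
import numpy as np

def gcn_layer(A, X, W):
    A_tilde = A + np.eye(A.shape[0])            # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_tilde.sum(axis=1)))
    A_hat = D_inv_sqrt @ A_tilde @ D_inv_sqrt   # symmetric normalization
    return np.maximum(A_hat @ X @ W, 0.0)       # ReLU

# 4 courses, edges = prerequisite links; 3 input features per course node.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.random((4, 3))                          # e.g. topic/difficulty features
W = rng.random((3, 2))                          # learned weights in a real model
print(gcn_layer(A, X, W))
```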
arXiv Detail & Related papers (2019-12-26T23:11:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.