Early Detection of At-Risk Students Using Machine Learning
- URL: http://arxiv.org/abs/2412.09483v1
- Date: Thu, 12 Dec 2024 17:33:06 GMT
- Title: Early Detection of At-Risk Students Using Machine Learning
- Authors: Azucena L. Jimenez Martinez, Kanika Sood, Rakeshkumar Mahto
- Abstract summary: We aim to tackle the persistent challenges of higher education retention and student dropout rates by screening for at-risk students. This work considers several machine learning models, including Support Vector Machines (SVM), Naive Bayes, K-nearest neighbors (KNN), Decision Trees, Logistic Regression, and Random Forest. Our analysis indicates that all algorithms generate an acceptable outcome for at-risk student predictions, while Naive Bayes performs best overall.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This research presents preliminary work to address the challenge of identifying at-risk students using supervised machine learning and three unique data categories: engagement, demographics, and performance data collected from Fall 2023 using Canvas and the California State University, Fullerton dashboard. We aim to tackle the persistent challenges of higher education retention and student dropout rates by screening for at-risk students and building a high-risk identification system. By focusing on previously overlooked behavioral factors alongside traditional metrics, this work aims to address educational gaps, enhance student outcomes, and significantly boost student success across disciplines at the University. Pre-processing steps are performed to establish a target variable, anonymize student information, manage missing data, and identify the most significant features. Given the mixed data types in the datasets and the binary classification nature of this study, this work considers several machine learning models, including Support Vector Machines (SVM), Naive Bayes, K-nearest neighbors (KNN), Decision Trees, Logistic Regression, and Random Forest. These models predict at-risk students and identify critical periods of the semester when student performance is most vulnerable. We use validation techniques such as train-test split and k-fold cross-validation to ensure the reliability of the models. Our analysis indicates that all algorithms generate an acceptable outcome for at-risk student predictions, while Naive Bayes performs best overall.
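As a rough, hedged illustration of the pipeline the abstract describes (mixed numeric and categorical features, a binary at-risk target, the six listed classifiers, and k-fold cross-validation), a minimal scikit-learn sketch might look like the following. The file path, column names, and hyperparameters are hypothetical placeholders, not the actual Canvas/CSUF schema or the authors' settings.

```python
# Hypothetical sketch of the described model comparison: six classifiers,
# mixed numeric/categorical features, and 5-fold cross-validation.
# File path and feature names are illustrative placeholders only.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

df = pd.read_csv("student_data_fall2023.csv")          # placeholder path
numeric = ["logins_per_week", "assignment_score_avg"]   # engagement/performance (assumed)
categorical = ["major", "enrollment_status"]            # demographics (assumed)
X, y = df[numeric + categorical], df["at_risk"]         # binary target (assumed)

preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore", sparse_output=False), categorical),
])

models = {
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(),
    "Decision Tree": DecisionTreeClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
for name, clf in models.items():
    pipe = Pipeline([("prep", preprocess), ("clf", clf)])
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```

A comparison loop of this kind is one way a ranking such as "Naive Bayes performs best overall" could be produced; with different features or scoring metrics the ordering could differ.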
Related papers
- AI-based identification and support of at-risk students: A case study of the Moroccan education system [5.199084419479099]
Student dropout is a global issue influenced by personal, familial, and academic factors.
This paper introduces an AI-driven predictive modeling approach to identify students at risk of dropping out.
arXiv Detail & Related papers (2025-04-09T13:30:35Z) - AdvKT: An Adversarial Multi-Step Training Framework for Knowledge Tracing [64.79967583649407]
Knowledge Tracing (KT) monitors students' knowledge states and simulates their responses to question sequences.
Existing KT models typically follow a single-step training paradigm, which leads to significant error accumulation.
We propose a novel Adversarial Multi-Step Training Framework for Knowledge Tracing (AdvKT) which focuses on the multi-step KT task.
arXiv Detail & Related papers (2025-04-07T03:31:57Z) - Modeling Behavior Change for Multi-model At-Risk Students Early Prediction (extended version) [10.413751893289056]
Current models primarily identify students with consistently poor performance through simple and discrete behavioural patterns.
We have developed an innovative prediction model, Multimodal-ChangePoint Detection (MCPD), utilizing textual teacher remark data and numerical grade data from middle schools.
Our model achieves an accuracy in the range of 70-75%, outperforming baseline algorithms by approximately 5-10% on average.
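The MCPD entry above mentions change-point detection over numerical grade data. Purely as an illustration of that general idea, and not of the paper's multimodal model, here is a minimal sketch that scans a single student's grade series for the split with the largest downward mean shift; the data values are invented.

```python
# Illustrative single change-point detection on a grade time series.
# This is NOT the MCPD model from the cited paper; it only demonstrates the
# general idea of locating the point where average performance shifts.
import numpy as np

def best_change_point(grades: np.ndarray) -> tuple[int, float]:
    """Return the split index and mean drop that best separate the series."""
    best_idx, best_gap = 0, 0.0
    for i in range(2, len(grades) - 1):
        gap = grades[:i].mean() - grades[i:].mean()   # drop after the split
        if gap > best_gap:
            best_idx, best_gap = i, gap
    return best_idx, best_gap

weekly_grades = np.array([88, 90, 85, 87, 84, 70, 65, 62, 58, 60], float)
idx, drop = best_change_point(weekly_grades)
print(f"Possible behaviour change at week {idx}: average drops by {drop:.1f} points")
```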
arXiv Detail & Related papers (2025-02-19T11:16:46Z) - DASKT: A Dynamic Affect Simulation Method for Knowledge Tracing [51.665582274736785]
Knowledge Tracing (KT) predicts students' future performance from their historical interactions, and understanding students' affective states can enhance the effectiveness of KT.
We propose Dynamic Affect Simulation Knowledge Tracing (DASKT) to explore the impact of various student affective states on their knowledge states.
Our research highlights a promising avenue for future studies, focusing on achieving high interpretability and accuracy.
arXiv Detail & Related papers (2025-01-18T10:02:10Z) - Knowledge Distillation in RNN-Attention Models for Early Prediction of Student Performance [3.9596747946226767]
We introduce an RNN-Attention-KD (knowledge distillation) framework to predict at-risk students early throughout a course.
In an empirical evaluation, RNN-Attention-KD outperforms traditional neural network models in terms of recall and F1-measure.
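The entry above names knowledge distillation but does not spell out the loss. As a generic, hedged illustration of the distillation component alone, a standard soft-label distillation loss in PyTorch could look like this; the temperature and weighting are arbitrary illustrative choices, not values from the paper.

```python
# Generic knowledge-distillation loss: the student is trained to match the
# teacher's softened output distribution in addition to the true labels.
# Hyperparameters (T, alpha) are illustrative, not taken from the cited paper.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between softened student and teacher outputs.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: ordinary cross-entropy against the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Example with random tensors standing in for model outputs.
student_logits = torch.randn(8, 2)   # e.g. at-risk vs. not-at-risk
teacher_logits = torch.randn(8, 2)
labels = torch.randint(0, 2, (8,))
print(distillation_loss(student_logits, teacher_logits, labels))
```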
arXiv Detail & Related papers (2024-12-19T04:46:06Z) - Detecting Unsuccessful Students in Cybersecurity Exercises in Two Different Learning Environments [0.37729165787434493]
This paper develops automated tools to predict when a student is having difficulty.
In a potential application, such models can aid instructors in detecting struggling students and providing targeted help.
arXiv Detail & Related papers (2024-08-16T04:57:54Z) - A Predictive Model using Machine Learning Algorithm in Identifying Students Probability on Passing Semestral Course [0.0]
This study employs classification as its data mining technique and a decision tree as its algorithm.
Using the newly discovered predictive model, predicting students' probability of passing their current courses yields 0.7619 accuracy, 0.8333 precision, 0.8823 recall, and 0.8571 F1 score.
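The reported figures come from that study's own dataset, which is not available here; the snippet below only shows how such metrics are typically computed for a decision tree in scikit-learn, using synthetic placeholder data.

```python
# How accuracy/precision/recall/F1 for a decision-tree classifier are
# typically computed; the data here is synthetic, not the study's dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
pred = tree.predict(X_te)

print("accuracy ", accuracy_score(y_te, pred))
print("precision", precision_score(y_te, pred))
print("recall   ", recall_score(y_te, pred))
print("f1       ", f1_score(y_te, pred))
```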
arXiv Detail & Related papers (2023-04-12T01:57:08Z) - Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks are proposed to improve the robustness of NER models.
In this paper, we propose an adaptive teacher learning method composed of two teacher-student networks.
Fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise.
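The "temporal moving average" mentioned above is, in its generic form, an exponential moving average of student weights into the teacher; the cited paper applies it per model fragment, which is not reproduced here. A minimal PyTorch sketch of the generic EMA update, offered only as an illustration:

```python
# Generic exponential-moving-average (EMA) update of a teacher model from a
# student model, the general mechanism behind "temporal moving average"
# teacher updates. The per-fragment scheme of the cited paper is not shown.
import copy
import torch
import torch.nn as nn

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, decay: float = 0.999):
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(decay).add_(s_param, alpha=1.0 - decay)

student = nn.Linear(16, 2)
teacher = copy.deepcopy(student)   # teacher starts as a copy of the student
# ... after each student optimization step:
ema_update(teacher, student)
```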
arXiv Detail & Related papers (2022-12-13T12:14:09Z) - Responsible Active Learning via Human-in-the-loop Peer Study [88.01358655203441]
We propose a responsible active learning method, namely Peer Study Learning (PSL), to simultaneously preserve data privacy and improve model stability.
We first introduce a human-in-the-loop teacher-student architecture to isolate unlabelled data from the task learner (teacher) on the cloud-side.
During training, the task learner instructs the lightweight active learner, which then provides feedback on the active sampling criterion.
arXiv Detail & Related papers (2022-11-24T13:18:27Z) - Student-centric Model of Learning Management System Activity and Academic Performance: from Correlation to Causation [2.169383034643496]
In recent years, there has been considerable interest in modeling students' digital traces in Learning Management Systems (LMS) to understand students' learning behavior patterns.
This paper explores a student-centric analytical framework for LMS activity data that can provide not only correlational but also causal insights mined from observational data.
We envision that those insights will provide convincing evidence for college student support groups to launch student-centered and targeted interventions.
arXiv Detail & Related papers (2022-10-27T14:08:25Z) - On the Robustness of Random Forest Against Untargeted Data Poisoning: An Ensemble-Based Approach [42.81632484264218]
In machine learning models, perturbations of fractions of the training set (poisoning) can seriously undermine the model accuracy.
This paper aims to implement a novel hash-based ensemble approach that protects random forest against untargeted, random poisoning attacks.
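The hash-based ensemble is only named above, not described. In rough terms, such schemes route each training sample to one sub-model by a hash, so a poisoned fraction of the data can influence only some of the sub-models. The sketch below is an assumption-laden simplification, not the cited paper's algorithm.

```python
# Simplified illustration of a hash-partitioned ensemble: each training row is
# assigned to one sub-forest by a hash, so poisoned rows can only influence a
# subset of the ensemble. This is a sketch, not the cited paper's method.
import hashlib
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

N_PARTITIONS = 5
X, y = make_classification(n_samples=1000, n_features=12, random_state=0)

def partition_of(row: np.ndarray) -> int:
    digest = hashlib.sha256(row.tobytes()).hexdigest()
    return int(digest, 16) % N_PARTITIONS

buckets = np.array([partition_of(row) for row in X])
forests = [
    RandomForestClassifier(random_state=k).fit(X[buckets == k], y[buckets == k])
    for k in range(N_PARTITIONS)
]

# Majority vote over the sub-forests.
votes = np.stack([f.predict(X[:10]) for f in forests])
prediction = (votes.mean(axis=0) >= 0.5).astype(int)
print(prediction)
```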
arXiv Detail & Related papers (2022-09-28T11:41:38Z) - SF-PATE: Scalable, Fair, and Private Aggregation of Teacher Ensembles [50.90773979394264]
This paper studies a model that protects the privacy of individuals' sensitive information while also allowing it to learn non-discriminatory predictors.
A key characteristic of the proposed model is that it enables the adoption of off-the-shelf, non-private fair models to create a privacy-preserving and fair model.
arXiv Detail & Related papers (2022-04-11T14:42:54Z) - Adversarial Self-Supervised Contrastive Learning [62.17538130778111]
Existing adversarial learning approaches mostly use class labels to generate adversarial samples that lead to incorrect predictions.
We propose a novel adversarial attack for unlabeled data, which makes the model confuse the instance-level identities of the perturbed data samples.
We present a self-supervised contrastive learning framework to adversarially train a robust neural network without labeled data.
arXiv Detail & Related papers (2020-06-13T08:24:33Z) - Differentially Private Deep Learning with Smooth Sensitivity [144.31324628007403]
We study privacy concerns through the lens of differential privacy.
In this framework, privacy guarantees are generally obtained by perturbing models in such a way that specifics of data used to train the model are made ambiguous.
One of the most important techniques used in previous works involves an ensemble of teacher models, which return information to a student based on a noisy voting procedure.
In this work, we propose a novel voting mechanism with smooth sensitivity, which we call Immutable Noisy ArgMax, that, under certain conditions, can bear very large random noising from the teacher without affecting the useful information transferred to the student.
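The mechanism described above builds on PATE-style aggregation: an ensemble of teachers votes on each label and noise is added to the vote counts before taking the argmax. The Immutable Noisy ArgMax variant itself is not reproduced here; the sketch below shows only the plain noisy-argmax baseline it improves upon.

```python
# Plain noisy-argmax aggregation of teacher votes (the generic mechanism the
# cited work builds on). The paper's Immutable Noisy ArgMax variant adds a
# construction that keeps the winning class stable under much larger noise.
import numpy as np

rng = np.random.default_rng(0)

def noisy_argmax(teacher_votes: np.ndarray, n_classes: int, noise_scale: float) -> int:
    counts = np.bincount(teacher_votes, minlength=n_classes).astype(float)
    counts += rng.laplace(0.0, noise_scale, size=n_classes)  # privacy noise
    return int(np.argmax(counts))

# 25 teachers voting on a binary at-risk label for one student record.
votes = rng.integers(0, 2, size=25)
print("noisy label:", noisy_argmax(votes, n_classes=2, noise_scale=1.0))
```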
arXiv Detail & Related papers (2020-03-01T15:38:00Z)