Evaluating Splitting Approaches in the Context of Student Dropout
Prediction
- URL: http://arxiv.org/abs/2305.08600v1
- Date: Mon, 15 May 2023 12:30:11 GMT
- Title: Evaluating Splitting Approaches in the Context of Student Dropout
Prediction
- Authors: Bruno de M. Barros, Hugo A. D. do Nascimento, Raphael Guedes, Sandro
E. Monsueto
- Abstract summary: We study strategies for splitting and using academic data in order to create training and testing sets.
The study indicates that a temporal splitting combined with a time-based selection of the students' incremental academic histories leads to the best strategy for the problem in question.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The prediction of academic dropout, with the aim of preventing it, is one of
the current challenges of higher education institutions. Machine learning
techniques are a great ally in this task. However, care is needed in the way
that academic data are used by such methods, so that they reflect the reality
of the prediction problem under study and allow good results to be achieved.
In this paper, we study strategies for splitting and using academic
data in order to create training and testing sets. Through a conceptual
analysis and experiments with data from a public higher education institution,
we show that a random proportional data splitting, and even a simple temporal
splitting are not suitable for dropout prediction. The study indicates that a
temporal splitting combined with a time-based selection of the students'
incremental academic histories leads to the best strategy for the problem in
question.
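The contrast between the random proportional split the abstract rejects and the temporal split with incremental histories it recommends can be sketched as follows. The record schema, field order, and function names here are illustrative assumptions, not the paper's actual setup:

```python
import random

# Hypothetical per-semester records: (student_id, semester, gpa, dropped_out).
# The schema and values are made up for illustration.
records = [
    ("s1", 1, 0.70, 0), ("s1", 2, 0.55, 0), ("s1", 3, 0.40, 1),
    ("s2", 1, 0.90, 0), ("s2", 2, 0.85, 0), ("s2", 3, 0.80, 0),
    ("s3", 1, 0.60, 0), ("s3", 2, 0.50, 1),
]

def random_split(data, test_ratio=0.3, seed=0):
    """Random proportional split: shuffles records regardless of time,
    so a student's later semesters can leak into training -- the kind
    of split the paper argues is unsuitable for dropout prediction."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

def temporal_incremental_split(data, cutoff_semester):
    """Temporal split: train only on records up to a cutoff semester and
    test on records after it, so each prediction uses a student's
    academic history only as it existed at the cutoff."""
    train = [r for r in data if r[1] <= cutoff_semester]
    test = [r for r in data if r[1] > cutoff_semester]
    return train, test

train, test = temporal_incremental_split(records, cutoff_semester=2)
```

The temporal variant guarantees that no test-time record predates any training cutoff, which is what makes the evaluation reflect the real deployment scenario described in the abstract.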
Related papers
- A step towards the integration of machine learning and small area
estimation [0.0]
We propose a predictor supported by machine learning algorithms that can be used to predict any population or subpopulation characteristic.
We also study small departures from the assumed model, showing that our proposal remains a good alternative in this case as well.
In addition, we propose a method for estimating the accuracy of machine learning predictors, making it possible to compare their accuracy with that of classic methods.
arXiv Detail & Related papers (2024-02-12T09:43:17Z)
- Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality [55.88910947643436]
Self-supervised pre-training is essential for handling vast quantities of unlabeled data in practice.
HiDe-Prompt is an innovative approach that explicitly optimizes the hierarchical components with an ensemble of task-specific prompts and statistics.
Our experiments demonstrate the superior performance of HiDe-Prompt and its robustness to pre-training paradigms in continual learning.
arXiv Detail & Related papers (2023-10-11T06:51:46Z)
- Re-thinking Data Availability Attacks Against Deep Neural Networks [53.64624167867274]
In this paper, we re-examine the concept of unlearnable examples and discern that the existing robust error-minimizing noise presents an inaccurate optimization objective.
We introduce a novel optimization paradigm that yields improved protection results with reduced computational time requirements.
arXiv Detail & Related papers (2023-05-18T04:03:51Z)
- A Survey on Dropout Methods and Experimental Verification in Recommendation [34.557554809126415]
Overfitting is a common problem in machine learning, in which a model fits the training data too closely while performing poorly on the test data.
Among various methods of coping with overfitting, dropout is one of the representative ways.
From randomly dropping neurons to dropping neural structures, dropout has achieved great success in improving model performance.
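The neuron-level dropout described above can be sketched in a few lines. This is a minimal inverted-dropout implementation in pure Python; the function name and scaling convention are illustrative, not taken from the survey:

```python
import random

def dropout(values, p, training=True, seed=None):
    """Zero each activation with probability p during training, scaling
    the survivors by 1/(1-p) (inverted dropout) so the expected value of
    each activation is unchanged; at inference time, pass values through."""
    if not training or p == 0.0:
        return list(values)
    rng = random.Random(seed)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]
```

Because the surviving activations are rescaled at training time, no extra scaling is needed at test time, which is the convention most deep learning libraries use.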
arXiv Detail & Related papers (2022-04-05T07:08:21Z)
- Evaluation Methods and Measures for Causal Learning Algorithms [33.07234268724662]
We focus on the two fundamental causal-inference tasks and causality-aware machine learning tasks.
The survey seeks to bring to the forefront the urgency of developing publicly available benchmarks and consensus-building standards for causal learning evaluation with observational data.
arXiv Detail & Related papers (2022-02-07T00:24:34Z)
- Tri-Branch Convolutional Neural Networks for Top-$k$ Focused Academic Performance Prediction [28.383922154797315]
Academic performance prediction aims to leverage student-related information to predict their future academic outcomes.
In this paper, we analyze students' daily behavior trajectories, which can be comprehensively tracked with campus smartcard records.
We propose a novel Tri-Branch CNN architecture, which is equipped with row-wise, column-wise, and depth-wise convolution and attention operations.
arXiv Detail & Related papers (2021-07-22T02:35:36Z)
- One-shot Learning for Temporal Knowledge Graphs [49.41854171118697]
We propose a one-shot learning framework for link prediction in temporal knowledge graphs.
Our proposed method employs a self-attention mechanism to effectively encode temporal interactions between entities.
Our experiments show that the proposed algorithm outperforms state-of-the-art baselines on two well-studied benchmarks.
arXiv Detail & Related papers (2020-10-23T03:24:44Z)
- Double Robust Representation Learning for Counterfactual Prediction [68.78210173955001]
We propose a novel scalable method to learn double-robust representations for counterfactual predictions.
We make robust and efficient counterfactual predictions for both individual and average treatment effects.
The algorithm shows competitive performance with the state-of-the-art on real world and synthetic data.
arXiv Detail & Related papers (2020-10-15T16:39:26Z)
- Counterfactual Predictions under Runtime Confounding [74.90756694584839]
We study the counterfactual prediction task in the setting where all relevant factors are captured in the historical data.
We propose a doubly-robust procedure for learning counterfactual prediction models in this setting.
arXiv Detail & Related papers (2020-06-30T15:49:05Z)
- On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them [57.957466608543676]
We analyze the influence of adversarial training on the loss landscape of machine learning models.
We show that the adversarial loss landscape is less favorable to optimization, due to increased curvature and more scattered gradients.
arXiv Detail & Related papers (2020-06-15T13:50:23Z)
- Academic Performance Estimation with Attention-based Graph Convolutional Networks [17.985752744098267]
Given a student's past data, the task of student's performance prediction is to predict a student's grades in future courses.
Traditional methods for student's performance prediction usually neglect the underlying relationships between multiple courses.
We propose a novel attention-based graph convolutional networks model for student's performance prediction.
arXiv Detail & Related papers (2019-12-26T23:11:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.