Temporal and Between-Group Variability in College Dropout Prediction
- URL: http://arxiv.org/abs/2401.06498v1
- Date: Fri, 12 Jan 2024 10:43:55 GMT
- Title: Temporal and Between-Group Variability in College Dropout Prediction
- Authors: Dominik Glandorf, Hye Rin Lee, Gabe Avakian Orona, Marina Pumptow,
Renzhe Yu, Christian Fischer
- Abstract summary: This study provides a systematic evaluation of contributing factors and predictive performance of machine learning models.
We find dropout prediction at the end of the second year has a 20% higher AUC than at the time of enrollment in a Random Forest model.
Regarding variability across student groups, college GPA has more predictive value for students from traditionally disadvantaged backgrounds than their peers.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Large-scale administrative data is a common input in early warning systems
for college dropout in higher education. Still, the terminology and methodology
vary significantly across existing studies, and the implications of different
modeling decisions are not fully understood. This study provides a systematic
evaluation of contributing factors and predictive performance of machine
learning models over time and across different student groups. Drawing on
twelve years of administrative data at a large public university in the US, we
find that dropout prediction at the end of the second year has a 20% higher AUC
than at the time of enrollment in a Random Forest model. Also, most predictive
factors at the time of enrollment, including demographics and high school
performance, are quickly superseded in predictive importance by college
performance and in later stages by enrollment behavior. Regarding variability
across student groups, college GPA has more predictive value for students from
traditionally disadvantaged backgrounds than their peers. These results can
help researchers and administrators understand the comparative value of
different data sources when building early warning systems and optimizing
decisions under specific policy goals.
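The AUC comparison in the abstract can be made concrete with a small sketch. The abstract does not say whether "20% higher" is a relative or an absolute difference; the code below assumes the relative reading. The function names (`roc_auc`, `relative_gain`) and the toy labels and scores are hypothetical illustrations, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    (dropout, label 1) receives a higher score than a randomly chosen
    negative case (label 0). Ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def relative_gain(auc_early, auc_late):
    """Relative AUC improvement of a later model over an earlier one."""
    return (auc_late - auc_early) / auc_early


# Made-up example: 1 = dropped out, 0 = retained (not from the paper).
labels = [1, 1, 1, 0, 0, 0, 0]
# Hypothetical risk scores from a model using enrollment-time features only.
scores_enrollment = [0.6, 0.4, 0.3, 0.5, 0.35, 0.2, 0.1]
# Hypothetical risk scores from a model that also sees two years of college data.
scores_year2 = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]

auc_early = roc_auc(labels, scores_enrollment)
auc_late = roc_auc(labels, scores_year2)
print(f"AUC at enrollment:  {auc_early:.3f}")
print(f"AUC after year two: {auc_late:.3f}")
print(f"Relative gain:      {relative_gain(auc_early, auc_late):.1%}")
```

Under the relative reading, an AUC of 0.75 at enrollment and 0.90 at the end of the second year would count as a 20% gain; under an absolute reading the same pair would be a gain of 0.15.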
Related papers
- Trading off performance and human oversight in algorithmic policy: evidence from Danish college admissions [11.378331161188022]
Student dropout is a significant concern for educational institutions.
We show that sequential AI models offer more precise and fair predictions.
We estimate that even the use of simple AI models to guide admissions decisions could yield significant economic benefits.
arXiv Detail & Related papers (2024-11-22T21:12:54Z)
- Beyond human subjectivity and error: a novel AI grading system [67.410870290301]
The grading of open-ended questions is a high-effort, high-impact task in education.
Recent breakthroughs in AI technology might facilitate such automation, but this has not been demonstrated at scale.
We introduce a novel automatic short answer grading (ASAG) system.
arXiv Detail & Related papers (2024-05-07T13:49:59Z)
- Improving On-Time Undergraduate Graduation Rate For Undergraduate Students Using Predictive Analytics [0.0]
The on-time graduation rate among universities in Puerto Rico is significantly lower than in the mainland United States.
This project aims to develop a predictive model that accurately detects students early in their academic pursuit at risk of not graduating on time.
arXiv Detail & Related papers (2024-05-02T22:33:42Z)
- Reduced-Rank Multi-objective Policy Learning and Optimization [57.978477569678844]
In practice, causal researchers do not have a single outcome in mind a priori.
In government-assisted social benefit programs, policymakers collect many outcomes to understand the multidimensional nature of poverty.
We present a data-driven dimensionality-reduction methodology for multiple outcomes in the context of optimal policy learning.
arXiv Detail & Related papers (2024-04-29T08:16:30Z)
- Why Do Students Drop Out? University Dropout Prediction and Associated Factor Analysis Using Machine Learning Techniques [0.5042480200195721]
This study examined university dropout prediction using academic, demographic, socioeconomic, and macroeconomic data types.
The data type most influential to the model performance was found to be academic data.
Preliminary results indicate that a correlation does exist between data types and dropout status.
arXiv Detail & Related papers (2023-10-17T04:20:00Z)
- Students Success Modeling: Most Important Factors [0.47829670123819784]
The model undertakes to identify students likely to graduate, the ones likely to transfer to a different school, and the ones likely to drop out and leave their higher education unfinished.
Our experiments demonstrate that distinguishing between to-be-graduate and at-risk students is reasonably achievable in the earliest stages.
The model remarkably foresees the fate of students who stay in the school for three years.
arXiv Detail & Related papers (2023-09-06T19:23:10Z)
- Ecosystem-level Analysis of Deployed Machine Learning Reveals Homogeneous Outcomes [72.13373216644021]
We study the societal impact of machine learning by considering the collection of models that are deployed in a given context.
We find deployed machine learning is prone to systemic failure, meaning some users are exclusively misclassified by all models available.
These examples demonstrate ecosystem-level analysis has unique strengths for characterizing the societal impact of machine learning.
arXiv Detail & Related papers (2023-07-12T01:11:52Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z)
- Predicting MOOCs Dropout Using Only Two Easily Obtainable Features from the First Week's Activities [56.1344233010643]
Several features are considered to contribute towards learner attrition or lack of interest, which may lead to disengagement or total dropout.
This study aims to predict dropout early-on, from the first week, by comparing several machine-learning approaches.
arXiv Detail & Related papers (2020-08-12T10:44:49Z)
- Academic Performance Estimation with Attention-based Graph Convolutional Networks [17.985752744098267]
Given a student's past data, the task of student's performance prediction is to predict a student's grades in future courses.
Traditional methods for student's performance prediction usually neglect the underlying relationships between multiple courses.
We propose a novel attention-based graph convolutional networks model for student's performance prediction.
arXiv Detail & Related papers (2019-12-26T23:11:27Z)