Should College Dropout Prediction Models Include Protected Attributes?
- URL: http://arxiv.org/abs/2103.15237v2
- Date: Fri, 16 Apr 2021 18:53:17 GMT
- Title: Should College Dropout Prediction Models Include Protected Attributes?
- Authors: Renzhe Yu, Hansol Lee, René F. Kizilcec
- Abstract summary: We build machine learning models to predict student dropout after one academic year.
We compare the overall performance and fairness of model predictions with or without four protected attributes.
We find that including protected attributes does not impact the overall prediction performance and it only marginally improves algorithmic fairness of predictions.
- Score: 0.4125187280299248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Early identification of college dropouts can provide tremendous value for
improving student success and institutional effectiveness, and predictive
analytics are increasingly used for this purpose. However, ethical concerns
have emerged about whether including protected attributes in the prediction
models discriminates against underrepresented student groups and exacerbates
existing inequities. We examine this issue in the context of a large U.S.
research university with both residential and fully online degree-seeking
students. Based on comprehensive institutional records for this entire student
population across multiple years, we build machine learning models to predict
student dropout after one academic year of study, and compare the overall
performance and fairness of model predictions with or without four protected
attributes (gender, URM, first-generation student, and high financial need). We
find that including protected attributes does not impact the overall prediction
performance and it only marginally improves algorithmic fairness of
predictions. While these findings suggest that including protected attributes
is preferred, our analysis also offers guidance on how to evaluate the impact
in a local context, where institutional stakeholders seek to leverage
predictive analytics to support student success.
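The comparison described in the abstract can be evaluated locally with a simple group-gap check. The following is a minimal sketch, not the authors' implementation: the function names (`rates`, `group_gaps`), the choice of metrics (accuracy, recall, true negative rate), and the toy labels are all assumptions made for illustration. It contrasts the between-group metric gaps of two hypothetical sets of predictions, one from a model trained without a protected attribute and one from a model trained with it.

```python
from typing import Dict, Sequence


def rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, recall, and true negative rate for binary labels (1 = dropout)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    nan = float("nan")  # returned when a rate is undefined for a group
    return {
        "accuracy": (tp + tn) / len(y_true) if len(y_true) else nan,
        "recall": tp / pos if pos else nan,
        "tnr": tn / neg if neg else nan,
    }


def group_gaps(
    y_true: Sequence[int], y_pred: Sequence[int], group: Sequence[int]
) -> Dict[str, float]:
    """Metric for group == 1 minus metric for group == 0; values near 0 mean parity."""
    by_group = {}
    for g in (0, 1):
        idx = [i for i, v in enumerate(group) if v == g]
        by_group[g] = rates([y_true[i] for i in idx], [y_pred[i] for i in idx])
    return {m: by_group[1][m] - by_group[0][m] for m in by_group[0]}


# Toy, hand-made labels (hypothetical; not data or results from the paper).
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
group = [1, 1, 1, 1, 0, 0, 0, 0]         # 1 = member of a protected group
pred_without = [1, 0, 0, 0, 1, 1, 0, 1]  # model trained without the attribute
pred_with = [1, 1, 0, 0, 1, 1, 0, 1]     # model trained with the attribute

gaps_without = group_gaps(y_true, pred_without, group)
gaps_with = group_gaps(y_true, pred_with, group)
print("gaps without attribute:", gaps_without)
print("gaps with attribute:   ", gaps_with)
```

In a real evaluation, the two prediction vectors would come from models fitted on institutional records, and the gaps would be computed separately for each protected attribute alongside the overall performance metrics.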
Related papers
- Trading off performance and human oversight in algorithmic policy: evidence from Danish college admissions [11.378331161188022]
Student dropout is a significant concern for educational institutions.
We show that sequential AI models offer more precise and fair predictions.
We estimate that even the use of simple AI models to guide admissions decisions could yield significant economic benefits.
arXiv Detail & Related papers (2024-11-22T21:12:54Z)
- Performative Prediction on Games and Mechanism Design [69.7933059664256]
We study a collective risk dilemma where agents decide whether to trust predictions based on past accuracy.
As predictions shape collective outcomes, social welfare arises naturally as a metric of concern.
We show how to achieve better trade-offs and use them for mechanism design.
arXiv Detail & Related papers (2024-08-09T16:03:44Z)
- Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by certain protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
- Difficult Lessons on Social Prediction from Wisconsin Public Schools [32.90759447739759]
Early warning systems assist in targeting interventions to individual students by predicting which students are at risk of dropping out.
Despite significant investments in their widespread adoption, there remain large gaps in our understanding of the efficacy of EWS.
We present empirical evidence that the prediction system accurately sorts students by their dropout risk.
We find that it may have caused a single-digit percentage increase in graduation rates, though our empirical analyses cannot reliably rule out that there has been no positive treatment effect.
arXiv Detail & Related papers (2023-04-13T00:59:12Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- SF-PATE: Scalable, Fair, and Private Aggregation of Teacher Ensembles [50.90773979394264]
This paper studies a model that protects the privacy of individuals' sensitive information while also allowing it to learn non-discriminatory predictors.
A key characteristic of the proposed model is that it enables the adoption of off-the-shelf, non-private fair models to create a privacy-preserving and fair model.
arXiv Detail & Related papers (2022-04-11T14:42:54Z)
- Selecting the suitable resampling strategy for imbalanced data classification regarding dataset properties [62.997667081978825]
In many application domains such as medicine, information retrieval, cybersecurity, social media, etc., datasets used for inducing classification models often have an unequal distribution of the instances of each class.
This situation, known as imbalanced data classification, causes low predictive performance for the minority class examples.
Oversampling and undersampling techniques are well-known strategies to deal with this problem by balancing the number of examples of each class.
arXiv Detail & Related papers (2021-12-15T18:56:39Z)
- Who will dropout from university? Academic risk prediction based on interpretable machine learning [0.0]
This paper predicts academic risk using the LightGBM model and the Shapley value method for interpretable machine learning.
From the local perspective, the factors affecting academic risk vary from person to person.
arXiv Detail & Related papers (2021-12-02T09:43:31Z)
- A Fairness Analysis on Private Aggregation of Teacher Ensembles [31.388212637482365]
The Private Aggregation of Teacher Ensembles (PATE) is an important private machine learning framework.
This paper asks whether this privacy-preserving framework introduces or exacerbates bias and unfairness.
It shows that PATE can introduce accuracy disparity among individuals and groups of individuals.
arXiv Detail & Related papers (2021-09-17T16:19:24Z)
- Differentially Private and Fair Deep Learning: A Lagrangian Dual Approach [54.32266555843765]
This paper studies a model that protects the privacy of individuals' sensitive information while also allowing it to learn non-discriminatory predictors.
The method relies on the notion of differential privacy and the use of Lagrangian duality to design neural networks that can accommodate fairness constraints.
arXiv Detail & Related papers (2020-09-26T10:50:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.