Is Your Model "MADD"? A Novel Metric to Evaluate Algorithmic Fairness
for Predictive Student Models
- URL: http://arxiv.org/abs/2305.15342v2
- Date: Fri, 21 Jul 2023 08:51:09 GMT
- Title: Is Your Model "MADD"? A Novel Metric to Evaluate Algorithmic Fairness
for Predictive Student Models
- Authors: M\'elina Verger, S\'ebastien Lall\'e, Fran\c{c}ois Bouchet, Vanda
Luengo
- Abstract summary: We propose a novel metric, the Model Absolute Density Distance (MADD), to analyze models' discriminatory behaviors.
We evaluate our approach on the common task of predicting student success in online courses, using several common predictive classification models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Predictive student models are increasingly used in learning environments due
to their ability to enhance educational outcomes and support stakeholders in
making informed decisions. However, predictive models can be biased and produce
unfair outcomes, leading to potential discrimination against some students and
possible harmful long-term implications. This has prompted research on fairness
metrics meant to capture and quantify such biases. Nonetheless, the fairness
metrics used in education so far are predictive performance-oriented: they
assess biased outcomes across groups of students without considering either
the behaviors of the models or the severity of the biases in the outcomes.
Therefore, we propose a novel metric, the Model
Absolute Density Distance (MADD), to analyze models' discriminatory behaviors
independently from their predictive performance. We also provide a
complementary visualization-based analysis to enable fine-grained human
assessment of how the models discriminate between groups of students. We
evaluate our approach on the common task of predicting student success in
online courses, using several common predictive classification models on an
open educational dataset. We also compare our metric to the only predictive
performance-oriented fairness metric developed in education, ABROCA. Results on
this dataset show that: (1) fair predictive performance does not guarantee fair
model behaviors, and thus fair outcomes; (2) there is no direct relationship
between data bias and either predictive performance bias or discriminatory
behavior bias; and (3) models trained on the same data exhibit different
discriminatory behaviors, which also vary depending on the sensitive feature
considered. We thus recommend applying the MADD to models that already show
satisfactory predictive performance, to gain a finer-grained understanding of
how they behave and to refine model selection and usage.
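To make the two metrics discussed in the abstract concrete, here is a minimal sketch, not the authors' implementation: it assumes MADD is the L1 distance between the binned density vectors of predicted probabilities for two groups of students, and ABROCA the absolute area between the two groups' ROC curves. The bin count, grid size, and function names are illustrative.

```python
# Illustrative sketch only -- assumptions, not the paper's reference code:
#   * MADD: L1 distance between the two groups' binned densities of predicted
#     probabilities (0 = identical behavior, 2 = fully disjoint distributions).
#   * ABROCA: absolute area between the two groups' ROC curves.
import numpy as np
from sklearn.metrics import roc_curve


def madd(proba, group, n_bins=100):
    """Model Absolute Density Distance between two groups (assumed definition)."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    d0, _ = np.histogram(proba[group == 0], bins=bins)
    d1, _ = np.histogram(proba[group == 1], bins=bins)
    d0 = d0 / d0.sum()  # normalize counts into a density vector
    d1 = d1 / d1.sum()
    return float(np.abs(d0 - d1).sum())


def abroca(y_true, proba, group, grid_size=1000):
    """Absolute area between the per-group ROC curves (predictive-performance view)."""
    fpr_grid = np.linspace(0.0, 1.0, grid_size)
    tprs = []
    for g in (0, 1):
        fpr, tpr, _ = roc_curve(y_true[group == g], proba[group == g])
        tprs.append(np.interp(fpr_grid, fpr, tpr))
    return float(np.trapz(np.abs(tprs[0] - tprs[1]), fpr_grid))
```

On the same trained model, a small ABROCA (similar ROC curves across groups) can coexist with a large MADD (very different predicted-probability distributions across groups), which is the gap that finding (1) points to.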
Related papers
- A Fair Post-Processing Method based on the MADD Metric for Predictive Student Models [1.055551340663609]
A new metric has been developed to evaluate algorithmic fairness in predictive student models.
In this paper, we develop a post-processing method that aims to improve fairness while preserving the accuracy of the predictive models' results.
We experiment with our approach on the task of predicting student success in an online course, using both simulated and real-world educational data.
arXiv Detail & Related papers (2024-07-07T14:53:41Z)
- When Fairness Meets Privacy: Exploring Privacy Threats in Fair Binary Classifiers via Membership Inference Attacks [17.243744418309593]
We propose an efficient MIA method against fairness-enhanced models based on fairness discrepancy results.
We also explore potential strategies for mitigating privacy leakages.
arXiv Detail & Related papers (2023-11-07T10:28:17Z) - Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against subgroups described by protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data, without a given causal model, by proposing a novel framework, CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z) - Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
- Think Twice: Measuring the Efficiency of Eliminating Prediction Shortcuts of Question Answering Models [3.9052860539161918]
We propose a simple method for measuring the scale of a model's reliance on any identified spurious feature.
We assess robustness towards a large set of known and newly found prediction biases for various pre-trained models and debiasing methods in Question Answering (QA).
We find that while existing debiasing methods can mitigate reliance on a chosen spurious feature, the OOD performance gains of these methods cannot be explained by mitigated reliance on biased features.
arXiv Detail & Related papers (2023-05-11T14:35:00Z)
- Delving into Identify-Emphasize Paradigm for Combating Unknown Bias [52.76758938921129]
We propose an effective bias-conflicting scoring method (ECS) to boost the identification accuracy.
We also propose gradient alignment (GA) to balance the contributions of the mined bias-aligned and bias-conflicting samples.
Experiments are conducted on multiple datasets in various settings, demonstrating that the proposed solution can mitigate the impact of unknown biases.
arXiv Detail & Related papers (2023-02-22T14:50:24Z)
- Guide the Learner: Controlling Product of Experts Debiasing Method Based on Token Attribution Similarities [17.082695183953486]
A popular workaround is to train a robust model by re-weighting training examples based on a secondary biased model.
Here, the underlying assumption is that the biased model resorts to shortcut features.
We introduce a fine-tuning strategy that incorporates the similarity between the main and biased model attribution scores in a Product of Experts loss function.
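For readers unfamiliar with the Product of Experts setup this entry builds on, a minimal, hypothetical sketch of a plain PoE debiasing loss is given below; the attribution-similarity control that this paper adds is not reproduced, and the function name and use of PyTorch are illustrative assumptions.

```python
# Plain Product-of-Experts (PoE) debiasing loss -- a generic sketch, not this
# paper's attribution-similarity-controlled variant.
import torch
import torch.nn.functional as F


def poe_debias_loss(main_logits, biased_logits, labels):
    """Cross-entropy on the PoE ensemble of the main and biased models.

    The biased model's logits are detached so that only the main model
    receives gradients; it must account for whatever the shortcut-driven
    biased expert cannot, discouraging reliance on shortcut features.
    """
    ensemble = F.log_softmax(main_logits, dim=-1) + F.log_softmax(biased_logits.detach(), dim=-1)
    return F.cross_entropy(ensemble, labels)
```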
arXiv Detail & Related papers (2023-02-06T15:21:41Z)
- Cross-model Fairness: Empirical Study of Fairness and Ethics Under Model Multiplicity [10.144058870887061]
We argue that individuals can be harmed when one predictor is chosen ad hoc from a group of equally well performing models.
Our findings suggest that such unfairness can be readily found in real life and it may be difficult to mitigate by technical means alone.
arXiv Detail & Related papers (2022-03-14T14:33:39Z)
- Characterizing Fairness Over the Set of Good Models Under Selective Labels [69.64662540443162]
We develop a framework for characterizing predictive fairness properties over the set of models that deliver similar overall performance.
We provide tractable algorithms to compute the range of attainable group-level predictive disparities.
We extend our framework to address the empirically relevant challenge of selectively labelled data.
arXiv Detail & Related papers (2021-01-02T02:11:37Z)
- Models, Pixels, and Rewards: Evaluating Design Trade-offs in Visual Model-Based Reinforcement Learning [109.74041512359476]
We study a number of design decisions for the predictive model in visual MBRL algorithms.
We find that a range of design decisions that are often considered crucial, such as the use of latent spaces, have little effect on task performance.
We show how this phenomenon is related to exploration and how some of the lower-scoring models on standard benchmarks will perform the same as the best-performing models when trained on the same training data.
arXiv Detail & Related papers (2020-12-08T18:03:21Z)
- Learning from others' mistakes: Avoiding dataset biases without modeling them [111.17078939377313]
State-of-the-art natural language processing (NLP) models often learn to model dataset biases and surface form correlations instead of features that target the intended task.
Previous work has demonstrated effective methods to circumvent these issues when knowledge of the bias is available.
We show a method for training models that learn to ignore these problematic correlations.
arXiv Detail & Related papers (2020-12-02T16:10:54Z)