Identifying and examining machine learning biases on Adult dataset
- URL: http://arxiv.org/abs/2310.09373v1
- Date: Fri, 13 Oct 2023 19:41:47 GMT
- Title: Identifying and examining machine learning biases on Adult dataset
- Authors: Sahil Girhepuje
- Abstract summary: This research delves into the reduction of machine learning model bias through Ensemble Learning.
Our rigorous methodology comprehensively assesses bias across various categorical variables, ultimately revealing a pronounced gender attribute bias.
This study underscores ethical considerations and advocates the implementation of hybrid models for a data-driven society marked by inclusivity and impartiality.
- Score: 0.7856362837294112
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This research delves into the reduction of machine learning model bias
through Ensemble Learning. Our rigorous methodology comprehensively assesses
bias across various categorical variables, ultimately revealing a pronounced
gender attribute bias. The empirical evidence unveils a substantial
gender-based wage prediction disparity: wages predicted for males, initially at
$902.91, significantly decrease to $774.31 when the gender attribute is
alternated to females. Notably, Kullback-Leibler divergence scores point to
gender bias, with values exceeding 0.13, predominantly within tree-based
models. Employing Ensemble Learning elucidates the quest for fairness and
transparency. Intriguingly, our findings reveal that the stacked model aligns
with individual models, confirming the resilience of model bias. This study
underscores ethical considerations and advocates the implementation of hybrid
models for a data-driven society marked by impartiality and inclusivity.
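
As an illustration of the kind of counterfactual probe and KL-divergence check the abstract describes, the sketch below trains a tree-based regressor on a preprocessed Adult dataset, flips the gender attribute at prediction time, and compares the two resulting prediction distributions. The column names (`sex`, `wage`), the file path, and the choice of RandomForestRegressor are assumptions for illustration only, not the paper's exact setup.

```python
# Minimal sketch (not the paper's code): counterfactual gender-flip probe on an
# Adult-style dataset, plus a KL-divergence comparison of the two prediction
# distributions. Column names, file path, and model choice are assumptions.
import numpy as np
import pandas as pd
from scipy.stats import entropy
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Assumed: a preprocessed Adult dataframe with an encoded "sex" column
# (0 = Female, 1 = Male) and a numeric wage-like target column "wage".
df = pd.read_csv("adult_preprocessed.csv")
X, y = df.drop(columns=["wage"]), df["wage"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# Counterfactual probe: hold everything fixed except the gender attribute.
X_male = X_test.copy()
X_female = X_test.copy()
X_male["sex"] = 1
X_female["sex"] = 0
pred_male = model.predict(X_male)
pred_female = model.predict(X_female)
print("mean predicted wage (male):  ", pred_male.mean())
print("mean predicted wage (female):", pred_female.mean())

# KL divergence between histograms of the two prediction distributions;
# clearly non-zero values (the paper reports scores above 0.13 for
# tree-based models) indicate a gender-dependent shift in predictions.
bins = np.histogram_bin_edges(np.concatenate([pred_male, pred_female]), bins=30)
p, _ = np.histogram(pred_male, bins=bins, density=True)
q, _ = np.histogram(pred_female, bins=bins, density=True)
eps = 1e-9  # avoid division by zero in empty bins
print("KL(male || female):", entropy(p + eps, q + eps))
```

Running the same probe on a stacked ensemble (e.g. scikit-learn's StackingRegressor built from the individual models) would be one way to check the abstract's observation that the stacked model aligns with its base learners and inherits their bias.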
Related papers
- GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases.
GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z) - Dataset Distribution Impacts Model Fairness: Single vs. Multi-Task Learning [2.9530211066840417]
We evaluate the performance of skin lesion classification using ResNet-based CNNs.
We present a linear programming method for generating datasets with varying patient sex and class labels.
arXiv Detail & Related papers (2024-07-24T15:23:26Z) - GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z) - AI Gender Bias, Disparities, and Fairness: Does Training Data Matter? [3.509963616428399]
This study delves into the pervasive issue of gender bias in artificial intelligence (AI).
It analyzes more than 1000 human-graded student responses from male and female participants across six assessment items.
Results indicate that scoring accuracy for mixed-trained models shows an insignificant difference from either male- or female-trained models.
arXiv Detail & Related papers (2023-12-17T22:37:06Z) - Evaluating Bias and Fairness in Gender-Neutral Pretrained Vision-and-Language Models [23.65626682262062]
We quantify bias amplification in pretraining and after fine-tuning on three families of vision-and-language models.
Overall, we find that bias amplification in pretraining and after fine-tuning are independent.
arXiv Detail & Related papers (2023-10-26T16:19:19Z) - The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated [70.23064111640132]
We compare the impact of debiasing on performance across multiple downstream tasks using a wide-range of benchmark datasets.
Experiments show that the effects of debiasing are consistently underestimated across all tasks.
arXiv Detail & Related papers (2023-09-16T20:25:34Z) - Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z) - Balancing out Bias: Achieving Fairness Through Training Reweighting [58.201275105195485]
Bias in natural language processing arises from models learning characteristics of the author such as gender and race.
Existing methods for mitigating and measuring bias do not directly account for correlations between author demographics and linguistic variables.
This paper introduces a very simple but highly effective method for countering bias using instance reweighting.
arXiv Detail & Related papers (2021-09-16T23:40:28Z) - Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z) - Do Neural Ranking Models Intensify Gender Bias? [13.37092521347171]
We first provide a bias measurement framework which includes two metrics to quantify the degree of the unbalanced presence of gender-related concepts in a given IR model's ranking list.
Applying these queries to the MS MARCO Passage retrieval collection, we then measure the gender bias of a BM25 model and several recent neural ranking models.
Results show that while all models are strongly biased toward males, the neural models, and in particular the ones based on contextualized embedding models, significantly intensify gender bias.
arXiv Detail & Related papers (2020-05-01T13:31:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.