Decoding Demographic un-fairness from Indian Names
- URL: http://arxiv.org/abs/2209.03089v1
- Date: Wed, 7 Sep 2022 11:54:49 GMT
- Title: Decoding Demographic un-fairness from Indian Names
- Authors: Medidoddi Vahini, Jalend Bantupalli, Souvic Chakraborty, and Animesh
Mukherjee
- Abstract summary: Demographic classification is essential in fairness assessment in recommender systems or in measuring unintended bias in online networks and voting systems.
We collect three publicly available datasets to train state-of-the-art classifiers in the domain of gender and caste classification.
We perform cross-testing (training and testing on different datasets) to understand the efficacy of the above models.
- Score: 4.402336973466853
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Demographic classification is essential in fairness assessment in recommender
systems or in measuring unintended bias in online networks and voting systems.
Important fields like education and politics, which often lay the foundation for
future equality in society, need scrutiny so that policies can be designed to
foster more equal resource distribution despite the country's unbalanced
demographic distribution.
We collect three publicly available datasets to train state-of-the-art
classifiers in the domain of gender and caste classification. We train the
models in the Indian context, where the same name can have different styling
conventions (Jolly Abraham/Kumar Abhishikta in one state may be written as
Abraham Jolly/Abhishikta Kumar in the other). Further, we perform
cross-testing (training and testing on different datasets) to understand the
efficacy of the above models.
We also perform an error analysis of the prediction models. Finally, we
attempt to assess the bias in the existing Indian system as case studies and
find some intriguing patterns manifesting in the complex demographic layout of
the sub-continent across the dimensions of gender and caste.
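To make the classification setup concrete, below is a minimal sketch of a name-based classifier; the abstract does not specify the authors' state-of-the-art models, so this uses a character n-gram baseline, and every name, label, and parameter in it is an illustrative placeholder.

```python
# Hedged sketch of name-based demographic classification, not the authors'
# actual models. Character n-grams capture sub-word cues in names, and the
# reversed-order augmentation addresses the Jolly Abraham / Abraham Jolly
# styling variation described in the abstract. All data below is toy data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def augment_name_order(names, labels):
    """Append each multi-token name with its tokens reversed (keeping the
    label) so the model sees both regional styling conventions."""
    out_names, out_labels = list(names), list(labels)
    for name, label in zip(names, labels):
        tokens = name.split()
        if len(tokens) > 1:
            out_names.append(" ".join(reversed(tokens)))
            out_labels.append(label)
    return out_names, out_labels

model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)

# Toy training data; the paper trains on three public datasets instead.
names = ["Ananya Sharma", "Priya Patel", "Rahul Verma", "Amit Singh"]
labels = ["F", "F", "M", "M"]
X, y = augment_name_order(names, labels)
model.fit(X, y)

# Cross-testing = fit on one dataset, then score on a different one.
print(model.predict(["Sharma Ananya"]))  # order-swapped styling
```

Character-level features are a natural fit here because the demographic signal in names often lives in affixes rather than whole tokens, which also reduces sensitivity to token order.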
Related papers
- VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
VLBiasBench is a benchmark aimed at evaluating biases in Large Vision-Language Models (LVLMs).
We construct a dataset encompassing nine distinct categories of social bias, including age, disability status, gender, nationality, physical appearance, race, religion, profession, and socioeconomic status, plus two intersectional bias categories (race x gender and race x socioeconomic status).
We conduct extensive evaluations on 15 open-source models as well as one advanced closed-source model, providing new insights into the biases these models reveal.
arXiv Detail & Related papers (2024-06-20T10:56:59Z)
- Leveraging Diffusion Perturbations for Measuring Fairness in Computer Vision [25.414154497482162]
We demonstrate that diffusion models can be leveraged to create a benchmark dataset with controlled demographic attributes.
We benchmark several vision-language models on a multi-class occupation classification task.
We find that images generated with non-Caucasian labels have a significantly higher occupation misclassification rate than images generated with Caucasian labels.
arXiv Detail & Related papers (2023-11-25T19:40:13Z)
- Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
- Fairness in AI Systems: Mitigating gender bias from language-vision models [0.913755431537592]
We study the extent of the impact of gender bias in existing datasets.
We propose a methodology to mitigate its impact in caption-based language-vision models.
arXiv Detail & Related papers (2023-05-03T04:33:44Z)
- Fairness meets Cross-Domain Learning: a new perspective on Models and Metrics [80.07271410743806]
We study the relationship between cross-domain learning (CD) and model fairness.
We introduce a benchmark on face and medical images spanning several demographic groups as well as classification and localization tasks.
Our study covers 14 CD approaches alongside three state-of-the-art fairness algorithms and shows how the former can outperform the latter.
arXiv Detail & Related papers (2023-03-25T09:34:05Z)
- Are Models Trained on Indian Legal Data Fair? [20.162205920441895]
We present an initial investigation of fairness from the Indian perspective in the legal domain.
We show that a decision tree model trained for the bail prediction task has an overall fairness disparity of 0.237 between input features associated with Hindus and Muslims (one common way to compute such a disparity is sketched after this list).
arXiv Detail & Related papers (2023-03-13T16:20:33Z)
- Debiasing Vision-Language Models via Biased Prompts [79.04467131711775]
We propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding.
We show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models (a minimal projection sketch follows this list).
arXiv Detail & Related papers (2023-01-31T20:09:33Z)
- D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies a human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z)
- The Birth of Bias: A case study on the evolution of gender bias in an English language model [1.6344851071810076]
We use a relatively small language model with an LSTM architecture, trained on an English Wikipedia corpus.
We find that the representation of gender is dynamic and identify different phases during training.
We show that gender information is represented increasingly locally in the input embeddings of the model.
arXiv Detail & Related papers (2022-07-21T00:59:04Z)
- Assessing Demographic Bias Transfer from Dataset to Model: A Case Study in Facial Expression Recognition [1.5340540198612824]
Three metrics are proposed: two focus on the representational and stereotypical bias of the dataset, and the third on the residual bias of the trained model.
We demonstrate the usefulness of these metrics by applying them to an FER problem based on the popular Affectnet dataset.
arXiv Detail & Related papers (2022-05-20T09:40:42Z)
- Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and imposter sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z)
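For the "Are Models Trained on Indian Legal Data Fair?" entry, the summary does not name the metric behind the 0.237 disparity figure; demographic parity difference, sketched below on hypothetical data, is one standard way such a gap is measured.

```python
# Hedged sketch: demographic parity difference as one possible disparity
# metric; the cited paper's exact metric is not given in the summary above.
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Absolute gap in positive-outcome rates (e.g., bail granted)
    between two demographic groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return abs(rates[0] - rates[1])

# Hypothetical bail-model predictions for two groups A and B.
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(demographic_parity_difference(y_pred, group))  # 0.75 - 0.25 = 0.5
```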
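For the "Debiasing Vision-Language Models via Biased Prompts" entry, here is a minimal sketch of projecting text embeddings onto the orthogonal complement of a bias subspace; the paper's calibrated projection matrix is more elaborate, and the bias direction below is a random stand-in.

```python
# Hedged sketch of projection-based debiasing: remove the component of each
# embedding that lies in the (assumed known) bias subspace.
import numpy as np

def debias_projection(embeddings, bias_directions):
    """Map each embedding x to x @ (I - pinv(B) @ B), the projector onto
    the orthogonal complement of the rows of B."""
    B = np.atleast_2d(bias_directions)                # (k, d) bias basis
    P = np.eye(B.shape[1]) - np.linalg.pinv(B) @ B    # (d, d) projector
    return np.asarray(embeddings) @ P

# Toy stand-ins: in practice the bias direction might come from prompt
# pairs such as embed("a photo of a man") - embed("a photo of a woman").
rng = np.random.default_rng(0)
text_emb = rng.normal(size=(5, 8))     # 5 prompts, 8-dim embeddings
gender_dir = rng.normal(size=(1, 8))   # hypothetical bias direction
debiased = debias_projection(text_emb, gender_dir)
print(np.allclose(debiased @ gender_dir.T, 0))  # True: component removed
```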
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.