Towards Socially Responsible AI: Cognitive Bias-Aware Multi-Objective
Learning
- URL: http://arxiv.org/abs/2005.06618v2
- Date: Tue, 28 Jul 2020 07:20:09 GMT
- Title: Towards Socially Responsible AI: Cognitive Bias-Aware Multi-Objective
Learning
- Authors: Procheta Sen, Debasis Ganguly
- Abstract summary: Human society has a long history of suffering from cognitive biases that lead to social prejudices and mass injustice.
We propose a bias-aware multi-objective learning framework that learns to reduce the frequency of predicting certain combinations of identity attributes and sensitive output classes.
- Score: 24.522730093209262
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human society has a long history of suffering from cognitive biases
that lead to social prejudices and mass injustice. The cognitive biases
prevalent in large volumes of historical data risk being manifested as
unethical and seemingly inhuman predictions by AI systems trained on such
data. To alleviate this problem, we propose a bias-aware multi-objective
learning framework that, given a set of identity attributes (e.g. gender,
ethnicity) and a subset of sensitive categories among the possible prediction
classes, learns to reduce the frequency of predicting certain combinations of
them, e.g. stereotypes such as `most blacks use abusive language' or `fear is
a virtue of women'. Our experiments, conducted on an emotion prediction task
with balanced class priors, show that a set of baseline bias-agnostic models
exhibits cognitive biases with respect to gender, such as associating women
with fear and men with anger. In contrast, our proposed bias-aware
multi-objective learning methodology is shown to reduce such biases in the
predicted emotions.
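The abstract describes the framework only at a high level; as a rough illustration, the sketch below combines an ordinary cross-entropy objective with a penalty on the probability mass a classifier assigns to sensitive classes for identity-bearing inputs. This is an assumption about how such a bias-aware objective might be realised, not the authors' implementation; the names `bias_aware_loss`, `identity_mask`, `sensitive_class_ids`, and the weight `lam` are all hypothetical.

```python
# Minimal sketch (not the authors' code): a two-term objective combining
# standard classification loss with a penalty that discourages predicting
# sensitive classes (e.g. 'fear') for identity-bearing inputs (e.g. texts
# mentioning women), mirroring the "combinations" the abstract refers to.
import torch
import torch.nn.functional as F

def bias_aware_loss(logits, labels, identity_mask, sensitive_class_ids, lam=0.5):
    """
    logits:              (batch, num_classes) raw model outputs
    labels:              (batch,) gold emotion labels
    identity_mask:       (batch,) float, 1.0 where the input carries the
                         identity attribute of interest, else 0.0
    sensitive_class_ids: list of class indices treated as sensitive
    lam:                 hypothetical trade-off weight between the objectives
    """
    # Objective 1: ordinary cross-entropy on the emotion labels.
    ce = F.cross_entropy(logits, labels)

    # Objective 2: average probability mass assigned to sensitive classes
    # on identity-bearing examples; minimising it reduces how often the
    # stereotyped (identity, sensitive-class) combinations are predicted.
    probs = F.softmax(logits, dim=-1)
    sensitive_mass = probs[:, sensitive_class_ids].sum(dim=-1)
    bias_penalty = (identity_mask * sensitive_mass).sum() / identity_mask.sum().clamp(min=1.0)

    return ce + lam * bias_penalty

# Hypothetical usage: 4 examples, 6 emotion classes, class 2 = 'fear'.
logits = torch.randn(4, 6)
labels = torch.tensor([0, 2, 3, 1])
identity_mask = torch.tensor([1.0, 1.0, 0.0, 0.0])
loss = bias_aware_loss(logits, labels, identity_mask, sensitive_class_ids=[2])
```

Under this reading, increasing `lam` pushes the model harder towards suppressing the stereotyped predictions at some cost in raw accuracy, which is the trade-off a multi-objective formulation is meant to expose.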
Related papers
- The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [58.130894823145205]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias.
Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning.
We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z)
- Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs) [82.57490175399693]
We study gender bias in 22 popular image-to-text vision-language assistants (VLAs).
Our results show that VLAs replicate human biases likely present in the data, such as real-world occupational imbalances.
To eliminate the gender bias in these models, we find that finetuning-based debiasing methods achieve the best tradeoff between debiasing and retaining performance on downstream tasks.
arXiv Detail & Related papers (2024-10-25T05:59:44Z)
- Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by certain protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
- Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
- Fairness in AI Systems: Mitigating gender bias from language-vision models [0.913755431537592]
We study the extent of the impact of gender bias in existing datasets.
We propose a methodology to mitigate its impact in caption based language vision models.
arXiv Detail & Related papers (2023-05-03T04:33:44Z)
- Blacks is to Anger as Whites is to Joy? Understanding Latent Affective Bias in Large Pre-trained Neural Language Models [3.5278693565908137]
"Affective Bias" is biased association of emotions towards a particular gender, race, and religion.
We show the existence of statistically significant affective bias in the PLM based emotion detection systems.
arXiv Detail & Related papers (2023-01-21T20:23:09Z)
- Assessing Gender Bias in Predictive Algorithms using eXplainable AI [1.9798034349981162]
Predictive algorithms have a powerful potential to offer benefits in areas as varied as medicine or education.
They can inherit the bias and prejudices present in humans.
The outcomes can systematically repeat errors that create unfair results.
arXiv Detail & Related papers (2022-03-19T07:47:45Z)
- Towards Understanding and Mitigating Social Biases in Language Models [107.82654101403264]
Large-scale pretrained language models (LMs) can be potentially dangerous in manifesting undesirable representational biases.
We propose steps towards mitigating social biases during text generation.
Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information.
arXiv Detail & Related papers (2021-06-24T17:52:43Z)
- Responsible AI: Gender bias assessment in emotion recognition [6.833826997240138]
This research work aims to study a gender bias in deep learning methods for facial expression recognition.
More biased neural networks show a larger accuracy gap in emotion recognition between male and female test sets.
arXiv Detail & Related papers (2021-03-21T17:00:21Z)
- Image Representations Learned With Unsupervised Pre-Training Contain Human-like Biases [3.0349733976070015]
We develop a novel method for quantifying biased associations between representations of social concepts and attributes in images.
We find that state-of-the-art unsupervised models trained on ImageNet, a popular benchmark image dataset, automatically learn racial, gender, and intersectional biases.
arXiv Detail & Related papers (2020-10-28T15:55:49Z)
- Multi-Dimensional Gender Bias Classification [67.65551687580552]
Machine learning models can inadvertently learn socially undesirable patterns when training on gender biased text.
We propose a general framework that decomposes gender bias in text along several pragmatic and semantic dimensions.
Using this fine-grained framework, we automatically annotate eight large scale datasets with gender information.
arXiv Detail & Related papers (2020-05-01T21:23:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.