Uncovering Bias in Foundation Models: Impact, Testing, Harm, and Mitigation
- URL: http://arxiv.org/abs/2501.10453v1
- Date: Tue, 14 Jan 2025 19:06:37 GMT
- Title: Uncovering Bias in Foundation Models: Impact, Testing, Harm, and Mitigation
- Authors: Shuzhou Sun, Li Liu, Yongxiang Liu, Zhen Liu, Shuanghui Zhang, Janne Heikkilä, Xiang Li,
- Abstract summary: Bias in Foundation Models (FMs) poses significant challenges for fairness and equity across fields such as healthcare, education, and finance.
These biases, rooted in the overrepresentation of stereotypes and societal inequalities in training data, exacerbate real-world discrimination, reinforce harmful stereotypes, and erode trust in AI systems.
We introduce Trident Probe Testing (TriProTesting), a systematic testing method that detects explicit and implicit biases using semantically designed probes.
- Score: 26.713973033726464
- License:
- Abstract: Bias in Foundation Models (FMs) - trained on vast datasets spanning societal and historical knowledge - poses significant challenges for fairness and equity across fields such as healthcare, education, and finance. These biases, rooted in the overrepresentation of stereotypes and societal inequalities in training data, exacerbate real-world discrimination, reinforce harmful stereotypes, and erode trust in AI systems. To address this, we introduce Trident Probe Testing (TriProTesting), a systematic testing method that detects explicit and implicit biases using semantically designed probes. Here we show that FMs, including CLIP, ALIGN, BridgeTower, and OWLv2, demonstrate pervasive biases across single and mixed social attributes (gender, race, age, and occupation). Notably, we uncover mixed biases when social attributes are combined, such as gender x race, gender x age, and gender x occupation, revealing deeper layers of discrimination. We further propose Adaptive Logit Adjustment (AdaLogAdjustment), a post-processing technique that dynamically redistributes probability power to mitigate these biases effectively, achieving significant improvements in fairness without retraining models. These findings highlight the urgent need for ethical AI practices and interdisciplinary solutions to address biases not only at the model level but also in societal structures. Our work provides a scalable and interpretable solution that advances fairness in AI systems while offering practical insights for future research on fair AI technologies.
Related papers
- The Pursuit of Fairness in Artificial Intelligence Models: A Survey [2.124791625488617]
This survey offers a synopsis of the different ways researchers have promoted fairness in AI systems.
A thorough study is conducted of the approaches and techniques employed by researchers to mitigate bias in AI models.
We also delve into the impact of biased models on user experience and the ethical considerations to contemplate when developing and deploying such models.
arXiv Detail & Related papers (2024-03-26T02:33:36Z) - Fast Model Debias with Machine Unlearning [54.32026474971696]
Deep neural networks might behave in a biased manner in many real-world scenarios.
Existing debiasing methods suffer from high costs in bias labeling or model re-training.
We propose a fast model debiasing framework (FMD) which offers an efficient approach to identify, evaluate and remove biases.
arXiv Detail & Related papers (2023-10-19T08:10:57Z) - Fairness Explainability using Optimal Transport with Applications in
Image Classification [0.46040036610482665]
We propose a comprehensive approach to uncover the causes of discrimination in Machine Learning applications.
We leverage Wasserstein barycenters to achieve fair predictions and introduce an extension to pinpoint bias-associated regions.
This allows us to derive a cohesive system which uses the enforced fairness to measure each features influence emphon the bias.
arXiv Detail & Related papers (2023-08-22T00:10:23Z) - Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by certain protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z) - D-BIAS: A Causality-Based Human-in-the-Loop System for Tackling
Algorithmic Bias [57.87117733071416]
We propose D-BIAS, a visual interactive tool that embodies human-in-the-loop AI approach for auditing and mitigating social biases.
A user can detect the presence of bias against a group by identifying unfair causal relationships in the causal network.
For each interaction, say weakening/deleting a biased causal edge, the system uses a novel method to simulate a new (debiased) dataset.
arXiv Detail & Related papers (2022-08-10T03:41:48Z) - Optimising Equal Opportunity Fairness in Model Training [60.0947291284978]
Existing debiasing methods, such as adversarial training and removing protected information from representations, have been shown to reduce bias.
We propose two novel training objectives which directly optimise for the widely-used criterion of it equal opportunity, and show that they are effective in reducing bias while maintaining high performance over two classification tasks.
arXiv Detail & Related papers (2022-05-05T01:57:58Z) - Statistical discrimination in learning agents [64.78141757063142]
Statistical discrimination emerges in agent policies as a function of both the bias in the training population and of agent architecture.
We show that less discrimination emerges with agents that use recurrent neural networks, and when their training environment has less bias.
arXiv Detail & Related papers (2021-10-21T18:28:57Z) - Improving Fairness of AI Systems with Lossless De-biasing [15.039284892391565]
Mitigating bias in AI systems to increase overall fairness has emerged as an important challenge.
We present an information-lossless de-biasing technique that targets the scarcity of data in the disadvantaged group.
arXiv Detail & Related papers (2021-05-10T17:38:38Z) - Fair Representation Learning for Heterogeneous Information Networks [35.80367469624887]
We propose a comprehensive set of de-biasing methods for fair HINs representation learning.
We study the behavior of these algorithms, especially their capability in balancing the trade-off between fairness and prediction accuracy.
We evaluate the performance of the proposed methods in an automated career counseling application.
arXiv Detail & Related papers (2021-04-18T08:28:18Z) - Representative & Fair Synthetic Data [68.8204255655161]
We present a framework to incorporate fairness constraints into the self-supervised learning process.
We generate a representative as well as fair version of the UCI Adult census data set.
We consider representative & fair synthetic data a promising future building block to teach algorithms not on historic worlds, but rather on the worlds that we strive to live in.
arXiv Detail & Related papers (2021-04-07T09:19:46Z) - No computation without representation: Avoiding data and algorithm
biases through diversity [11.12971845021808]
We draw connections between the lack of diversity within academic and professional computing fields and the type and breadth of the biases encountered in datasets.
We use these lessons to develop recommendations that provide concrete steps for the computing community to increase diversity.
arXiv Detail & Related papers (2020-02-26T23:07:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.