Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries
- URL: http://arxiv.org/abs/2502.05849v1
- Date: Sun, 09 Feb 2025 10:54:11 GMT
- Title: Fact-or-Fair: A Checklist for Behavioral Testing of AI Models on Fairness-Related Queries
- Authors: Jen-tse Huang, Yuhang Yan, Linqi Liu, Yixin Wan, Wenxuan Wang, Kai-Wei Chang, Michael R. Lyu
- Abstract summary: In this study, we focus on 19 real-world statistics collected from authoritative sources.
We develop a checklist comprising objective and subjective queries to analyze the behavior of large language models.
We propose metrics to assess factuality and fairness, and formally prove the inherent trade-off between these two aspects.
- Score: 85.909363478929
- Abstract: The generation of incorrect images, such as depictions of people of color in Nazi-era uniforms by Gemini, frustrated users and harmed Google's reputation, motivating us to investigate the relationship between accurately reflecting factuality and promoting diversity and equity. In this study, we focus on 19 real-world statistics collected from authoritative sources. Using these statistics, we develop a checklist comprising objective and subjective queries to analyze the behavior of large language models (LLMs) and text-to-image (T2I) models. Objective queries assess the models' ability to provide accurate world knowledge. In contrast, the design of subjective queries follows a key principle: statistical or experiential priors should not be overgeneralized to individuals, ensuring that models uphold diversity. These subjective queries are based on three common human cognitive errors that often result in social biases. We propose metrics to assess factuality and fairness, and formally prove the inherent trade-off between these two aspects. Results show that GPT-4o and DALL-E 3 perform notably well among six LLMs and four T2I models. Our code is publicly available at https://github.com/uclanlp/Fact-or-Fair.
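The trade-off described in the abstract can be illustrated with a small scoring sketch. The snippet below is a hypothetical illustration with assumed names and toy numbers, not the paper's released metrics (those are in the linked repository): it scores answers to objective queries by their distance from a reference statistic, and answers to subjective queries by how evenly the model spreads its choices across groups.

```python
# Hypothetical sketch, not the paper's released code: toy factuality/fairness scoring.
import math
from collections import Counter

# Assumed reference statistic (illustrative numbers), e.g. the share of a
# profession by gender as reported by an authoritative source.
REFERENCE = {"female": 0.75, "male": 0.25}

def factuality_score(objective_answer: dict) -> float:
    """1 minus the total variation distance between the model's stated
    distribution and the reference statistic (1.0 = matches the statistic)."""
    keys = set(REFERENCE) | set(objective_answer)
    tvd = 0.5 * sum(abs(REFERENCE.get(k, 0.0) - objective_answer.get(k, 0.0)) for k in keys)
    return 1.0 - tvd

def fairness_score(subjective_answers: list) -> float:
    """Normalized entropy of the groups the model picks when asked about
    unspecified individuals (1.0 = groups are chosen uniformly)."""
    counts = Counter(subjective_answers)
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return entropy / math.log(len(REFERENCE))

# An answer close to the statistic is highly factual ...
print(factuality_score({"female": 0.73, "male": 0.27}))      # ~0.98
# ... but echoing the statistic on subjective queries lowers fairness,
# while a uniform spread maximizes it.
print(fairness_score(["female"] * 75 + ["male"] * 25))       # ~0.81
print(fairness_score(["female"] * 50 + ["male"] * 50))       # 1.0
```

A model that simply echoes the group statistic when asked about individuals maximizes the first score while lowering the second, which is the tension the paper formalizes.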
Related papers
- Dataset Scale and Societal Consistency Mediate Facial Impression Bias in Vision-Language AI [17.101569078791492]
We study 43 CLIP vision-language models to determine whether they learn human-like facial impression biases.
We show for the first time that the degree to which a bias is shared across a society predicts the degree to which it is reflected in a CLIP model.
arXiv Detail & Related papers (2024-08-04T08:26:58Z)
- "Patriarchy Hurts Men Too." Does Your Model Agree? A Discussion on Fairness Assumptions [3.706222947143855]
In the context of group fairness, this approach often obscures implicit assumptions about how bias is introduced into the data.
We are assuming that the biasing process is a monotonic function of the fair scores, dependent solely on the sensitive attribute.
If the behavior of the biasing process is more complex than mere monotonicity, we need to identify and reject our implicit assumptions.
arXiv Detail & Related papers (2024-08-01T07:06:30Z)
- Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models [10.73340009530019]
This study addresses two such biases within Large Language Models (LLMs): representative bias and affinity bias.
We introduce two novel metrics to measure these biases: the Representative Bias Score (RBS) and the Affinity Bias Score (ABS).
Our analysis uncovers marked representative biases in prominent LLMs, with a preference for identities associated with being white, straight, and men.
Our investigation of affinity bias reveals distinctive evaluative patterns within each model, akin to 'bias fingerprints'.
arXiv Detail & Related papers (2024-05-23T13:35:34Z)
- Quantifying Bias in Text-to-Image Generative Models [49.60774626839712]
Bias in text-to-image (T2I) models can propagate unfair social representations and may be used to aggressively market ideas or push controversial agendas.
Existing T2I model bias evaluation methods only focus on social biases.
We propose an evaluation methodology to quantify general biases in T2I generative models, without any preconceived notions.
arXiv Detail & Related papers (2023-12-20T14:26:54Z)
- Learning for Counterfactual Fairness from Observational Data [62.43249746968616]
Fairness-aware machine learning aims to eliminate biases of learning models against certain subgroups described by certain protected (sensitive) attributes such as race, gender, and age.
A prerequisite for existing methods to achieve counterfactual fairness is the prior human knowledge of the causal model for the data.
In this work, we address the problem of counterfactually fair prediction from observational data without given causal models by proposing a novel framework CLAIRE.
arXiv Detail & Related papers (2023-07-17T04:08:29Z)
- Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
- DualFair: Fair Representation Learning at Both Group and Individual Levels via Contrastive Self-supervision [73.80009454050858]
This work presents a self-supervised model, called DualFair, that can debias sensitive attributes like gender and race from learned representations.
Our model jointly optimizes for two fairness criteria - group fairness and counterfactual fairness.
arXiv Detail & Related papers (2023-03-15T07:13:54Z)
- UnQovering Stereotyping Biases via Underspecified Questions [68.81749777034409]
We present UNQOVER, a framework to probe and quantify biases through underspecified questions.
We show that a naive use of model scores can lead to incorrect bias estimates due to two forms of reasoning errors.
We use this metric to analyze four important classes of stereotypes: gender, nationality, ethnicity, and religion.
arXiv Detail & Related papers (2020-10-06T01:49:52Z)
- Grading video interviews with fairness considerations [1.7403133838762446]
We present a methodology to automatically derive social skills of candidates based on their video response to interview questions.
We develop two machine-learning models to predict social skills.
We analyze fairness by studying the errors of models by race and gender.
arXiv Detail & Related papers (2020-07-02T10:06:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.