BIGbench: A Unified Benchmark for Social Bias in Text-to-Image Generative Models Based on Multi-modal LLM
- URL: http://arxiv.org/abs/2407.15240v3
- Date: Fri, 16 Aug 2024 05:53:16 GMT
- Title: BIGbench: A Unified Benchmark for Social Bias in Text-to-Image Generative Models Based on Multi-modal LLM
- Authors: Hanjun Luo, Haoyu Huang, Ziye Deng, Xuecheng Liu, Ruizhe Chen, Zuozhu Liu
- Abstract summary: We introduce BIGbench, a unified benchmark for Biases of Image Generation.
Unlike existing benchmarks, BIGbench classifies and evaluates biases across four dimensions.
Our study also reveals new research directions on biases, such as the effects of distillation and of irrelevant protected attributes.
- Score: 8.24274551090375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-Image (T2I) generative models are becoming increasingly crucial due to their ability to generate high-quality images, which also raises concerns about the social biases in their outputs, especially in human generation. Sociological research has established systematic classifications of bias. However, existing bias research on T2I models conflates different types of bias, impeding methodological progress. In this paper, we introduce BIGbench, a unified benchmark for Biases of Image Generation, featuring a meticulously designed dataset. Unlike existing benchmarks, BIGbench classifies and evaluates biases across four dimensions: manifestation of bias, visibility of bias, acquired attributes, and protected attributes, which ensures exceptional accuracy of analysis. Furthermore, BIGbench applies advanced multi-modal large language models to achieve fully automated and highly accurate evaluations. We apply BIGbench to evaluate eight representative general T2I models and three debiasing methods. Our human evaluation results underscore BIGbench's effectiveness in aligning images and identifying various biases. Besides, our study also reveals new research directions on biases, such as the effects of distillation and of irrelevant protected attributes. Our benchmark is openly accessible at https://github.com/BIGbench2024/BIGbench2024/ to ensure reproducibility.
Related papers
- GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models [75.04426753720553]
We propose a framework to identify, quantify, and explain biases in an open-set setting.
This pipeline leverages a Large Language Model (LLM) to propose biases starting from a set of captions.
We show two variations of this framework: OpenBias and GradBias.
arXiv Detail & Related papers (2024-08-29T16:51:07Z) - VLBiasBench: A Comprehensive Benchmark for Evaluating Bias in Large Vision-Language Model [72.13121434085116]
VLBiasBench is a benchmark aimed at evaluating biases in Large Vision-Language Models (LVLMs).
We construct a dataset encompassing nine distinct categories of social bias, including age, disability status, gender, nationality, physical appearance, race, religion, profession, and socioeconomic status, plus two intersectional bias categories (race x gender and race x socioeconomic status).
We conduct extensive evaluations on 15 open-source models as well as one advanced closed-source model, providing new insights into the biases revealed by these models.
arXiv Detail & Related papers (2024-06-20T10:56:59Z) - FAIntbench: A Holistic and Precise Benchmark for Bias Evaluation in Text-to-Image Models [7.30796695035169]
FAIntbench is a holistic and precise benchmark for biases in Text-to-Image (T2I) models.
We applied FAIntbench to evaluate seven recent large-scale T2I models and conducted human evaluation.
Results demonstrated the effectiveness of FAIntbench in identifying various biases.
arXiv Detail & Related papers (2024-05-28T04:18:00Z) - Survey of Bias In Text-to-Image Generation: Definition, Evaluation, and Mitigation [47.770531682802314]
Even simple prompts can cause T2I models to exhibit conspicuous social bias in generated images.
We present the first extensive survey on bias in T2I generative models.
We discuss how these works define, evaluate, and mitigate different aspects of bias.
arXiv Detail & Related papers (2024-04-01T10:19:05Z) - Classes Are Not Equal: An Empirical Study on Image Recognition Fairness [100.36114135663836]
We experimentally demonstrate that classes are not equal and the fairness issue is prevalent for image classification models across various datasets.
Our findings reveal that models tend to exhibit greater prediction biases for classes that are more challenging to recognize.
Data augmentation and representation learning algorithms improve overall performance by promoting fairness to some degree in image classification.
arXiv Detail & Related papers (2024-02-28T07:54:50Z) - Quantifying Bias in Text-to-Image Generative Models [49.60774626839712]
Bias in text-to-image (T2I) models can propagate unfair social representations and may be used to aggressively market ideas or push controversial agendas.
Existing T2I model bias evaluation methods only focus on social biases.
We propose an evaluation methodology to quantify general biases in T2I generative models, without any preconceived notions.
arXiv Detail & Related papers (2023-12-20T14:26:54Z) - TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models [22.076898042211305]
We propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt.
Our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases.
We show that our method is uniquely capable of explaining complex multi-dimensional biases through semantic concepts.
arXiv Detail & Related papers (2023-12-03T02:31:37Z) - Unravelling the Effect of Image Distortions for Biased Prediction of Pre-trained Face Recognition Models [86.79402670904338]
We evaluate the performance of four state-of-the-art deep face recognition models in the presence of image distortions.
We observe that image distortions are related to the model's performance gap across different subgroups.
arXiv Detail & Related papers (2021-08-14T16:49:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.