Uncovering an Attractiveness Bias in Multimodal Large Language Models: A Case Study with LLaVA
- URL: http://arxiv.org/abs/2504.16104v1
- Date: Wed, 16 Apr 2025 16:02:55 GMT
- Title: Uncovering an Attractiveness Bias in Multimodal Large Language Models: A Case Study with LLaVA
- Authors: Aditya Gulati, Moreno D'Incà, Nicu Sebe, Bruno Lepri, Nuria Oliver
- Abstract summary: We study the role that attractiveness plays in the assessments and decisions made by multimodal large language models (MLLMs). Our analysis reveals that attractiveness impacts the decisions made by the MLLM in over 80% of the scenarios. We uncover a gender, age and race bias in 83%, 73% and 57% of the scenarios, respectively.
- Score: 51.590283139444814
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Physical attractiveness matters. It has been shown to influence human perception and decision-making, often leading to biased judgments that favor those deemed attractive in what is referred to as "the attractiveness halo effect". While extensively studied in human judgments in a broad set of domains, including hiring, judicial sentencing or credit granting, the role that attractiveness plays in the assessments and decisions made by multimodal large language models (MLLMs) is unknown. To address this gap, we conduct an empirical study using 91 socially relevant scenarios and a diverse dataset of 924 face images, corresponding to 462 individuals both with and without beauty filters applied to them, evaluated on LLaVA, a state-of-the-art, open source MLLM. Our analysis reveals that attractiveness impacts the decisions made by the MLLM in over 80% of the scenarios, demonstrating substantial bias in model behavior in what we refer to as an attractiveness bias. Similarly to humans, we find empirical evidence of the existence of the attractiveness halo effect, such that more attractive individuals are more likely to be attributed positive traits, such as intelligence or confidence, by the MLLM. Furthermore, we uncover a gender, age and race bias in 83%, 73% and 57% of the scenarios, respectively, which is impacted by attractiveness, particularly in the case of gender, highlighting the intersectional nature of the attractiveness bias. Our findings suggest that societal stereotypes and cultural norms intersect with perceptions of attractiveness, amplifying or mitigating this bias in multimodal generative AI models in a complex way. Our work emphasizes the need to account for intersectionality in algorithmic bias detection and mitigation efforts and underscores the challenges of addressing bias in modern multimodal large language models.
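The paired-image protocol described in the abstract lends itself to a simple implementation. The following is a minimal sketch of how such a probe could be run against the open-source LLaVA-1.5 checkpoint on Hugging Face; the image paths, scenario wording, and pairing logic are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch of the paired-image probing setup (illustrative; not the
# authors' released code). Uses the public llava-hf/llava-1.5-7b-hf
# checkpoint; the face-image paths and scenario prompt are hypothetical.
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

MODEL_ID = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(MODEL_ID)
model = LlavaForConditionalGeneration.from_pretrained(
    MODEL_ID, torch_dtype=torch.float16, device_map="auto"
)

def ask(image_path: str, scenario_prompt: str) -> str:
    """Query LLaVA with one face image and one scenario question."""
    image = Image.open(image_path)
    # LLaVA-1.5 chat format: the <image> token marks where the image goes.
    prompt = f"USER: <image>\n{scenario_prompt} ASSISTANT:"
    inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
    # The decoded string echoes the prompt; the answer follows "ASSISTANT:".
    return processor.decode(out[0], skip_special_tokens=True)

# Each scenario is asked once per image variant; a divergent answer between
# the unfiltered and beauty-filtered photo of the same individual indicates
# that perceived attractiveness changed the model's decision.
scenario = "Would you hire this person as a software engineer? Answer yes or no."
unfiltered = ask("faces/id042_original.jpg", scenario)  # hypothetical path
filtered = ask("faces/id042_filtered.jpg", scenario)    # hypothetical path
print("unfiltered:", unfiltered)
print("filtered:  ", filtered)
```

Aggregating such paired responses over the 462 individuals and 91 scenarios would yield the per-scenario bias rates the abstract reports.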
Related papers
- The LLM Wears Prada: Analysing Gender Bias and Stereotypes through Online Shopping Data [8.26034886618475]
We investigate whether Large Language Models can predict an individual's gender based solely on online shopping histories. Using a dataset of historical online purchases from users in the United States, we evaluate the ability of six LLMs to classify gender. Results indicate that while models can infer gender with moderate accuracy, their decisions are often rooted in stereotypical associations between product categories and gender.
arXiv Detail & Related papers (2025-04-02T17:56:08Z) - The Root Shapes the Fruit: On the Persistence of Gender-Exclusive Harms in Aligned Language Models [58.130894823145205]
We center transgender, nonbinary, and other gender-diverse identities to investigate how alignment procedures interact with pre-existing gender-diverse bias.
Our findings reveal that DPO-aligned models are particularly sensitive to supervised finetuning.
We conclude with recommendations tailored to DPO and broader alignment practices.
arXiv Detail & Related papers (2024-11-06T06:50:50Z) - Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs) [82.57490175399693]
We study gender bias in 22 popular image-to-text vision-language assistants (VLAs). Our results show that VLAs replicate human biases likely present in the data, such as real-world occupational imbalances. To eliminate the gender bias in these models, we find that fine-tuning-based debiasing methods achieve the best trade-off between debiasing and retaining performance.
arXiv Detail & Related papers (2024-10-25T05:59:44Z) - GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models [73.23743278545321]
Large language models (LLMs) have exhibited remarkable capabilities in natural language generation, but have also been observed to magnify societal biases. GenderCARE is a comprehensive framework that encompasses innovative Criteria, bias Assessment, Reduction techniques, and Evaluation metrics.
arXiv Detail & Related papers (2024-08-22T15:35:46Z) - GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z) - Locating and Mitigating Gender Bias in Large Language Models [40.78150878350479]
Large language models (LLMs) are pre-trained on extensive corpora that encode facts and human cognition, including human preferences.
This process can inadvertently lead to these models acquiring the biases and prevalent stereotypes found in society.
We propose the LSDM (Least Square Debias Method), a knowledge-editing based method for mitigating gender bias in occupational pronouns.
arXiv Detail & Related papers (2024-03-21T13:57:43Z) - Investigating Bias Representations in Llama 2 Chat via Activation Steering [0.0]
We use activation steering to probe for and mitigate biases related to gender, race, and religion.
Our findings reveal inherent gender bias in Llama 2 7B Chat, persisting even after Reinforcement Learning from Human Feedback.
This work also provides valuable insights into effective red-teaming strategies for Large Language Models.
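Activation steering of this kind is typically implemented by adding a precomputed direction to a layer's residual-stream activations at inference time. Below is a generic sketch, assuming a LLaMA-style decoder exposing a `model.model.layers` list and a steering vector computed elsewhere; it is an illustration of the technique, not the paper's code.

```python
# Generic sketch of activation steering via a forward hook (illustrative;
# not the paper's code). A precomputed "bias direction" is added to the
# residual stream of one transformer layer at inference time.
import torch

def add_steering_hook(model, layer_idx: int, steer_vec: torch.Tensor, alpha: float):
    """Register a hook that shifts one layer's activations along steer_vec."""
    layer = model.model.layers[layer_idx]  # LLaMA-style layer list (assumption)

    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + alpha * steer_vec.to(hidden.device, hidden.dtype)
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden

    return layer.register_forward_hook(hook)

# steer_vec is typically the difference of mean activations over contrastive
# prompt pairs (e.g., biased vs. neutral completions); alpha < 0 subtracts
# the direction to mitigate the bias, alpha > 0 amplifies it to probe.
```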
arXiv Detail & Related papers (2024-02-01T07:48:50Z) - Identifying and examining machine learning biases on Adult dataset [0.7856362837294112]
This research delves into the reduction of machine learning model bias through Ensemble Learning.
Our rigorous methodology comprehensively assesses bias across various categorical variables, ultimately revealing a pronounced gender attribute bias.
This study underscores ethical considerations and advocates the implementation of hybrid models for a data-driven society marked by inclusivity and impartiality.
arXiv Detail & Related papers (2023-10-13T19:41:47Z) - The Unequal Opportunities of Large Language Models: Revealing Demographic Bias through Job Recommendations [5.898806397015801]
We propose a simple method for analyzing and comparing demographic bias in Large Language Models (LLMs).
We demonstrate the effectiveness of our method by measuring intersectional biases within ChatGPT and LLaMA.
We identify distinct biases in both models toward various demographic identities; for example, both models consistently suggest low-paying jobs for Mexican workers.
arXiv Detail & Related papers (2023-08-03T21:12:54Z) - Are Commercial Face Detection Models as Biased as Academic Models? [64.71318433419636]
We compare academic and commercial face detection systems, specifically examining robustness to noise.
We find that state-of-the-art academic face detection models exhibit demographic disparities in their noise robustness.
We conclude that commercial models are always as biased as, or more biased than, an academic model.
arXiv Detail & Related papers (2022-01-25T02:21:42Z)