Refusal as Silence: Gendered Disparities in Vision-Language Model Responses
- URL: http://arxiv.org/abs/2406.08222v3
- Date: Mon, 27 Oct 2025 02:13:48 GMT
- Title: Refusal as Silence: Gendered Disparities in Vision-Language Model Responses
- Authors: Sha Luo, Sang Jung Kim, Zening Duan, Kaiping Chen
- Abstract summary: This study investigates refusal as a sociotechnical outcome through a counterfactual persona design. We find that transgender and non-binary personas experience significantly higher refusal rates, even in non-harmful contexts.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Refusal behavior by Large Language Models is increasingly visible in content moderation, yet little is known about how refusals vary by the identity of the user making the request. This study investigates refusal as a sociotechnical outcome through a counterfactual persona design that varies gender identity--including male, female, non-binary, and transgender personas--while keeping the classification task and visual input constant. Focusing on a vision-language model (GPT-4V), we examine how identity-based language cues influence refusal in binary gender classification tasks. We find that transgender and non-binary personas experience significantly higher refusal rates, even in non-harmful contexts. These findings carry methodological implications for equity audits and content analysis using LLMs; they underscore the importance of modeling identity-driven disparities and caution against uncritical use of AI systems for content coding. This study advances algorithmic fairness by reframing refusal as a communicative act that may unevenly regulate epistemic access and participation.
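For readers who want to replicate the design, the sketch below shows one way to operationalize the counterfactual persona audit: only the persona statement varies while the classification task and image stay fixed, and refusal rates are tallied per persona. The `query_vlm` helper, the persona wordings, and the keyword-based refusal detector are illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch of a counterfactual persona refusal audit (illustrative only).
from collections import defaultdict

PERSONAS = {
    "male": "I am a man.",
    "female": "I am a woman.",
    "non-binary": "I am non-binary.",
    "transgender": "I am a transgender person.",
}

REFUSAL_MARKERS = ("i'm sorry", "i cannot", "i can't assist", "unable to")

def query_vlm(prompt: str, image_url: str) -> str:
    """Hypothetical stand-in for a vision-language model call; replace with a real API."""
    raise NotImplementedError

def is_refusal(response: str) -> bool:
    # Crude keyword heuristic; the paper's refusal coding scheme is richer.
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

def refusal_rates(image_urls: list[str]) -> dict[str, float]:
    counts = defaultdict(lambda: [0, 0])  # persona -> [refusals, total]
    for persona, statement in PERSONAS.items():
        for url in image_urls:
            # Only the persona statement varies; task and image stay constant.
            prompt = f"{statement} Please classify the gender of the person in this image."
            counts[persona][0] += is_refusal(query_vlm(prompt, url))
            counts[persona][1] += 1
    return {p: r / n for p, (r, n) in counts.items()}
```

Comparing the per-persona rates returned by `refusal_rates` is the basic disparity measure; a full audit would add significance testing across personas.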
Related papers
- Equal Access, Unequal Interaction: A Counterfactual Audit of LLM Fairness [0.8699280339422538]
We examine how large language models differ in tone, uncertainty, and linguistic framing across demographic identities after access is granted. We observe systematic, model-specific disparities in interaction quality. These results show that fairness disparities can persist at the interaction level even when access is equal.
arXiv Detail & Related papers (2026-02-03T00:05:38Z) - Measuring Social Bias in Vision-Language Models with Face-Only Counterfactuals from Real Photos [79.03150233804458]
Real-world images entangle race and gender with correlated factors such as background and clothing, obscuring attribution. We propose a face-only counterfactual evaluation paradigm: we generate counterfactual variants by editing only facial attributes related to race and gender, keeping all other visual factors fixed.
arXiv Detail & Related papers (2026-01-11T14:35:06Z) - Mind the Inclusivity Gap: Multilingual Gender-Neutral Translation Evaluation with mGeNTE [34.11872938329087]
Gender-neutral translation (GNT) represents a linguistic strategy towards fairer communication across languages. We introduce mGeNTE, an expert-curated resource, and use it to conduct the first systematic multilingual evaluation of inclusive translation. Experiments on en-es/de/it/el reveal that while models can recognize when neutrality is appropriate, they cannot consistently produce neutral translations.
arXiv Detail & Related papers (2025-01-16T09:35:15Z) - FaceSaliencyAug: Mitigating Geographic, Gender and Stereotypical Biases via Saliency-Based Data Augmentation [46.74201905814679]
We present FaceSaliencyAug, an approach aimed at addressing gender bias in computer vision models.
We quantify dataset diversity using an Image Similarity Score (ISS) across six datasets: Flickr Faces HQ (FFHQ), WIKI, IMDB, Labelled Faces in the Wild (LFW), UTK Faces, and the Diverse dataset.
Our experiments reveal a reduction in gender bias for both CNNs and ViTs, indicating the efficacy of our method in promoting fairness and inclusivity in computer vision models.
arXiv Detail & Related papers (2024-10-17T22:36:52Z) - Gender Bias Evaluation in Text-to-image Generation: A Survey [25.702257177921048]
We review recent work on gender bias evaluation in text-to-image generation.
We focus on the evaluation of recent popular models such as Stable Diffusion and DALL-E 2.
arXiv Detail & Related papers (2024-08-21T06:01:23Z) - Beyond Binary Gender: Evaluating Gender-Inclusive Machine Translation with Ambiguous Attitude Words [85.48043537327258]
Existing machine translation gender bias evaluations are primarily focused on male and female genders.
This study presents AmbGIMT (Gender-Inclusive Machine Translation with Ambiguous attitude words), a new benchmark.
We propose a novel process to evaluate gender bias based on the Emotional Attitude Score (EAS), which is used to quantify ambiguous attitude words.
arXiv Detail & Related papers (2024-07-23T08:13:51Z) - GenderBias-VL: Benchmarking Gender Bias in Vision Language Models via Counterfactual Probing [72.0343083866144]
This paper introduces the GenderBias-VL benchmark to evaluate occupation-related gender bias in Large Vision-Language Models.
Using our benchmark, we extensively evaluate 15 commonly used open-source LVLMs and state-of-the-art commercial APIs.
Our findings reveal widespread gender biases in existing LVLMs.
arXiv Detail & Related papers (2024-06-30T05:55:15Z) - Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias [8.168722337906148]
This study investigates the presence of bias in harmful speech classification of gender-queer dialect online.
We introduce a novel dataset, QueerLex, based on 109 curated templates exemplifying non-derogatory uses of LGBTQ+ slurs.
We systematically evaluate the performance of five off-the-shelf language models in assessing the harm of these texts.
arXiv Detail & Related papers (2024-05-23T18:07:28Z) - Tokenization Matters: Navigating Data-Scarce Tokenization for Gender Inclusive Language Technologies [75.85462924188076]
Gender-inclusive NLP research has documented the harmful limitations of gender-binary-centric large language models (LLMs).
We find that misgendering is significantly influenced by Byte-Pair Encoding (BPE) tokenization.
We propose two techniques: (1) pronoun tokenization parity, a method to enforce consistent tokenization across gendered pronouns, and (2) utilizing pre-existing LLM pronoun knowledge to improve neopronoun proficiency.
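As a quick illustration of the tokenization disparity this entry describes, the snippet below counts how many subword tokens a BPE vocabulary assigns to binary pronouns versus neopronouns. The use of the tiktoken library and its cl100k_base vocabulary is an assumption for illustration; the paper's experiments may rely on a different tokenizer.

```python
# Inspect how a BPE vocabulary segments pronouns (illustrative only).
# tiktoken / cl100k_base are assumptions; substitute the tokenizer under study.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Binary pronouns typically map to single tokens; data-scarce neopronouns
# may fragment into several subword pieces, which the paper links to misgendering.
for pronoun in ["she", "her", "he", "him", "they", "xe", "xem", "ey", "em"]:
    tokens = enc.encode(pronoun)
    print(f"{pronoun:>4} -> {len(tokens)} token(s): {tokens}")
```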
arXiv Detail & Related papers (2023-12-19T01:28:46Z) - Sociodemographic Prompting is Not Yet an Effective Approach for Simulating Subjective Judgments with LLMs [13.744746481528711]
Large Language Models (LLMs) are widely used to simulate human responses across diverse contexts. We evaluate nine popular LLMs on their ability to understand demographic differences in two subjective judgment tasks: politeness and offensiveness. We find that in zero-shot settings, most models' predictions for both tasks align more closely with labels from White participants than those from Asian or Black participants.
arXiv Detail & Related papers (2023-11-16T10:02:24Z) - Sensitivity, Performance, Robustness: Deconstructing the Effect of Sociodemographic Prompting [64.80538055623842]
Sociodemographic prompting is a technique that steers the output of prompt-based models towards answers that humans with specific sociodemographic profiles would give.
We show that sociodemographic information affects model predictions and can be beneficial for improving zero-shot learning in subjective NLP tasks.
arXiv Detail & Related papers (2023-09-13T15:42:06Z) - VisoGender: A dataset for benchmarking gender bias in image-text pronoun resolution [80.57383975987676]
VisoGender is a novel dataset for benchmarking gender bias in vision-language models.
We focus on occupation-related biases within a hegemonic system of binary gender, inspired by Winograd and Winogender schemas.
We benchmark several state-of-the-art vision-language models and find that they demonstrate bias in resolving binary gender in complex scenes.
arXiv Detail & Related papers (2023-06-21T17:59:51Z) - Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z) - "I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation [69.25368160338043]
Transgender and non-binary (TGNB) individuals disproportionately experience discrimination and exclusion from daily life.
We assess how the social reality surrounding experienced marginalization of TGNB persons contributes to and persists within Open Language Generation.
We introduce TANGO, a dataset of template-based real-world text curated from a TGNB-oriented community.
arXiv Detail & Related papers (2023-05-17T04:21:45Z) - Stable Bias: Analyzing Societal Representations in Diffusion Models [72.27121528451528]
We propose a new method for exploring the social biases in Text-to-Image (TTI) systems.
Our approach relies on characterizing the variation in generated images triggered by enumerating gender and ethnicity markers in the prompts.
We leverage this method to analyze images generated by 3 popular TTI systems and find that while all of their outputs show correlations with US labor demographics, they also consistently under-represent marginalized identities to different extents.
arXiv Detail & Related papers (2023-03-20T19:32:49Z) - Facial Expression Recognition using Squeeze and Excitation-powered Swin Transformers [0.0]
We propose a framework that employs Swin Vision Transformers (SwinT) and squeeze-and-excitation (SE) blocks to address vision tasks.
Our focus was to create an efficient FER model based on SwinT architecture that can recognize facial emotions using minimal data.
We trained our model on a hybrid dataset and evaluated its performance on the AffectNet dataset, achieving an F1-score of 0.5420.
arXiv Detail & Related papers (2023-01-26T02:29:17Z) - Social Biases in Automatic Evaluation Metrics for NLG [53.76118154594404]
We propose an evaluation method based on the Word Embeddings Association Test (WEAT) and the Sentence Embeddings Association Test (SEAT) to quantify social biases in evaluation metrics.
We construct gender-swapped meta-evaluation datasets to explore the potential impact of gender bias in image caption and text summarization tasks.
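For context, the snippet below is a minimal NumPy sketch of the WEAT effect size that this line of work builds on (Caliskan et al., 2017). The random vectors are placeholders for real word or sentence embeddings, and the full test also includes a permutation-based p-value omitted here.

```python
# Sketch of the WEAT effect size; random vectors stand in for real embeddings.
import numpy as np

def cosine(u: np.ndarray, v: np.ndarray) -> float:
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B) -> float:
    """s(w, A, B): mean cosine similarity to attribute set A minus mean similarity to B."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B) -> float:
    # Difference in mean association between target sets X and Y,
    # normalized by the pooled standard deviation over X union Y.
    sx = [association(x, A, B) for x in X]
    sy = [association(y, A, B) for y in Y]
    return (np.mean(sx) - np.mean(sy)) / np.std(sx + sy)

# Toy usage with random 50-d vectors standing in for word embeddings.
rng = np.random.default_rng(0)
X, Y, A, B = (list(rng.normal(size=(4, 50))) for _ in range(4))
print(weat_effect_size(X, Y, A, B))
```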
arXiv Detail & Related papers (2022-10-17T08:55:26Z) - Are Commercial Face Detection Models as Biased as Academic Models? [64.71318433419636]
We compare academic and commercial face detection systems, specifically examining robustness to noise.
We find that state-of-the-art academic face detection models exhibit demographic disparities in their noise robustness.
We conclude that commercial models are always at least as biased as academic models.
arXiv Detail & Related papers (2022-01-25T02:21:42Z) - Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies [30.096268927587214]
We explain the complexity of gender and language around it.
We survey non-binary persons to understand harms associated with the treatment of gender as binary.
arXiv Detail & Related papers (2021-08-27T01:58:58Z) - Automatic Expansion of Domain-Specific Affective Models for Web Intelligence Applications [3.0012517171007755]
Sentic computing relies on well-defined affective models of different complexity.
Even the most granular affective model, combined with sophisticated machine learning approaches, may not fully capture an organisation's strategic positioning goals.
This article introduces expansion techniques for affective models, combining common and commonsense knowledge available in knowledge graphs with language models and affective reasoning.
arXiv Detail & Related papers (2021-02-01T13:32:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.