Evidence for Hypodescent in Visual Semantic AI
- URL: http://arxiv.org/abs/2205.10764v1
- Date: Sun, 22 May 2022 06:46:39 GMT
- Title: Evidence for Hypodescent in Visual Semantic AI
- Authors: Robert Wolfe, Mahzarin R. Banaji, Aylin Caliskan
- Abstract summary: Multiracial people are more likely to be assigned a racial or ethnic label corresponding to a minority or disadvantaged racial or ethnic group than to the equivalent majority or advantaged group.
A face morphing experiment grounded in psychological research demonstrates hypodescent.
We show that the stereotype-congruent pleasantness association of an image correlates with association with the Black text label in CLIP.
- Score: 4.1804353242318255
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We examine the state-of-the-art multimodal "visual semantic" model CLIP
("Contrastive Language Image Pretraining") for the rule of hypodescent, or
one-drop rule, whereby multiracial people are more likely to be assigned a
racial or ethnic label corresponding to a minority or disadvantaged racial or
ethnic group than to the equivalent majority or advantaged group. A face
morphing experiment grounded in psychological research demonstrating
hypodescent indicates that, at the midway point of 1,000 series of morphed
images, CLIP associates 69.7% of Black-White female images with a Black text
label over a White text label, and similarly prefers Latina (75.8%) and Asian
(89.1%) text labels at the midway point for Latina-White female and Asian-White
female morphs, reflecting hypodescent. Additionally, assessment of the
underlying cosine similarities in the model reveals that association with White
is correlated with association with "person," with Pearson's rho as high as
0.82 over a 21,000-image morph series, indicating that a White person
corresponds to the default representation of a person in CLIP. Finally, we show
that the stereotype-congruent pleasantness association of an image correlates
with association with the Black text label in CLIP, with Pearson's rho = 0.48
for 21,000 Black-White multiracial male images, and rho = 0.41 for Black-White
multiracial female images. CLIP is trained on English-language text gathered
using data collected from an American website (Wikipedia), and our findings
demonstrate that CLIP embeds the values of American racial hierarchy,
reflecting the implicit and explicit beliefs that are present in human minds.
We contextualize these findings within the history and psychology of
hypodescent. Overall, the data suggests that AI supervised using natural
language will, unless checked, learn biases that reflect racial hierarchies.
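
To make the measurement concrete, the sketch below shows how an image-to-label association can be computed with a publicly available CLIP implementation. This is a minimal illustration, not the authors' code: it assumes the Hugging Face `transformers` port of CLIP, and the prompt wordings and file name are hypothetical placeholders rather than the paper's stimuli. Comparing the two label similarities at the midpoint of a morph series is the hypodescent test described above; collecting the similarities across a full series gives the association scores that enter the reported Pearson correlations.

```python
# Minimal sketch (assumption: Hugging Face "transformers" port of CLIP; the
# prompts and file name below are illustrative placeholders, not the paper's stimuli).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def label_association(image: Image.Image, labels: list[str]) -> dict[str, float]:
    """Cosine similarity between a CLIP image embedding and each text label."""
    inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    sims = (img @ txt.T).squeeze(0)  # one cosine similarity per label
    return dict(zip(labels, sims.tolist()))

# Hypodescent check at the midpoint of a (hypothetical) morph series:
labels = ["a photo of a Black person", "a photo of a White person"]
sims = label_association(Image.open("morph_midpoint.jpg"), labels)
assigned = max(sims, key=sims.get)  # label CLIP associates more strongly with the image

# Repeating this over every image in a morph series and correlating, e.g., the
# White-label similarity with the similarity to "a photo of a person"
# (via scipy.stats.pearsonr) mirrors the paper's correlation analyses.
```
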
Related papers
- AI-generated faces influence gender stereotypes and racial homogenization [1.6647208383676708]
We document significant biases in Stable Diffusion across six races, two genders, 32 professions, and eight attributes.
This analysis reveals significant racial homogenization: for example, nearly all Middle Eastern men are depicted as bearded, brown-skinned, and wearing traditional attire.
We propose debiasing solutions that allow users to specify the desired distributions of race and gender when generating images.
arXiv Detail & Related papers (2024-02-01T20:32:14Z)
- What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations [62.91799637259657]
Do large language models (LLMs) exhibit sociodemographic biases, even when they decline to respond?
We study this research question by probing contextualized embeddings and exploring whether this bias is encoded in their latent representations.
We propose a logistic Bradley-Terry probe which predicts word pair preferences of LLMs from the words' hidden vectors.
arXiv Detail & Related papers (2023-11-30T18:53:13Z)
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions? [67.97752431429865]
We study the effect of adding ethical interventions on the diversity of the generated images.
Preliminary studies indicate that a large change in model predictions is triggered by certain phrases, such as 'irrespective of gender'.
arXiv Detail & Related papers (2022-10-27T07:32:39Z)
- Studying Bias in GANs through the Lens of Race [91.95264864405493]
We study how the performance and evaluation of generative image models are impacted by the racial composition of their training datasets.
Our results show that the racial composition of generated images successfully preserves that of the training data.
However, we observe that truncation, a technique used to generate higher quality images during inference, exacerbates racial imbalances in the data.
arXiv Detail & Related papers (2022-09-06T22:25:56Z)
- American == White in Multimodal Language-and-Image AI [3.4157048274143316]
Three state-of-the-art language-and-image AI models are evaluated.
We show that White individuals are more associated with collective in-group words than are Asian, Black, or Latina/o individuals.
The results indicate that biases equating American identity with being White are learned by language-and-image AI.
arXiv Detail & Related papers (2022-07-01T23:45:56Z)
- Markedness in Visual Semantic AI [3.4157048274143316]
We evaluate the state-of-the-art multimodal "visual semantic" model CLIP for biases related to the marking of age, gender, and race or ethnicity.
Female individuals under the age of 20 are more likely than Male individuals to be marked with a gender label, but less likely to be marked with an age label.
As age increases, the self-similarity of representations of Female individuals increases at a higher rate than for Male individuals.
arXiv Detail & Related papers (2022-05-23T15:14:41Z)
- Regional Negative Bias in Word Embeddings Predicts Racial Animus--but only via Name Frequency [2.247786323899963]
We show that anti-Black WEAT estimates from geo-tagged social media data strongly correlate with several measures of racial animus.
We also show that every one of these correlations is explained by the frequency of Black names in the underlying corpora relative to White names.
arXiv Detail & Related papers (2022-01-20T20:52:12Z)
- Black or White but never neutral: How readers perceive identity from yellow or skin-toned emoji [90.14874935843544]
Recent work established a connection between expression of identity and emoji usage on social media.
This work asks if, as with language, readers are sensitive to such acts of self-expression and use them to understand the identity of authors.
arXiv Detail & Related papers (2021-05-12T18:23:51Z)
- One Label, One Billion Faces: Usage and Consistency of Racial Categories in Computer Vision [75.82110684355979]
We study the racial system encoded by computer vision datasets supplying categorical race labels for face images.
We find that each dataset encodes a substantially unique racial system, despite nominally equivalent racial categories.
We find evidence that racial categories encode stereotypes, and exclude ethnic groups from categories on the basis of nonconformity to stereotypes.
arXiv Detail & Related papers (2021-02-03T22:50:04Z)
- Unequal Representations: Analyzing Intersectional Biases in Word Embeddings Using Representational Similarity Analysis [3.8580784887142774]
We probe contextualized and non-contextualized embeddings for evidence of intersectional biases against Black women.
We show that these embeddings represent Black women as simultaneously less feminine than White women, and less Black than Black men.
arXiv Detail & Related papers (2020-11-24T13:45:14Z)
- It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations [68.16751625956243]
Training on only perfect Standard English corpora predisposes neural networks to discriminate against minorities from non-standard linguistic backgrounds.
We perturb the inflectional morphology of words to craft plausible and semantically similar adversarial examples.
arXiv Detail & Related papers (2020-05-09T04:01:43Z)