Reinforcing Stereotypes of Anger: Emotion AI on African American Vernacular English
- URL: http://arxiv.org/abs/2511.10846v1
- Date: Thu, 13 Nov 2025 23:13:08 GMT
- Title: Reinforcing Stereotypes of Anger: Emotion AI on African American Vernacular English
- Authors: Rebecca Dorn, Christina Chance, Casandra Rusti, Charles Bickham, Kai-Wei Chang, Fred Morstatter, Kristina Lerman
- Abstract summary: This study examines emotion recognition model performance on African American Vernacular English (AAVE) compared to General American English (GAE). We analyze 2.7 million tweets geo-tagged within Los Angeles. We observe that neighborhoods with higher proportions of African American residents are associated with higher predictions of anger.
- Score: 46.47177439553625
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automated emotion detection is widely used in applications ranging from well-being monitoring to high-stakes domains like mental health and hiring. However, models often rely on annotations that reflect dominant cultural norms, limiting their ability to recognize emotional expression in dialects often excluded from training data distributions, such as African American Vernacular English (AAVE). This study examines emotion recognition model performance on AAVE compared to General American English (GAE). We analyze 2.7 million tweets geo-tagged within Los Angeles. Texts are scored for strength of AAVE using computational approximations of dialect features. Annotations of emotion presence and intensity are collected on a dataset of 875 tweets with both high and low AAVE densities. To assess model accuracy on a task as subjective as emotion perception, we calculate community-informed "silver" labels where AAVE-dense tweets are labeled by African American, AAVE-fluent (ingroup) annotators. On our labeled sample, GPT and BERT-based models exhibit false positive prediction rates for anger on AAVE more than double those on GAE. SpanEmo, a popular text-based emotion model, increases its false positive rate for anger from 25 percent on GAE to 60 percent on AAVE. Additionally, a series of linear regressions reveals that model predictions and non-ingroup annotations are significantly more correlated with profanity-based AAVE features than ingroup annotations are. Linking Census tract demographics, we observe that neighborhoods with higher proportions of African American residents are associated with higher predictions of anger (Pearson's correlation r = 0.27) and lower joy (r = -0.10). These results reveal an emergent safety issue: emotion AI reinforcing racial stereotypes through biased emotion classification. We emphasize the need for culturally and dialect-informed affective computing systems.
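To make the evaluation concrete, the Python sketch below shows how the two headline measurements might be computed: anger false positive rates split by dialect density, and tract-level Pearson correlations between demographics and predicted emotion. The input file, the DataFrame columns (`pred_anger`, `pred_joy`, `silver_anger`, `aave_score`, `census_tract`, `pct_black_residents`), and the density thresholds are hypothetical stand-ins, not the authors' released pipeline.

```python
import pandas as pd
from scipy.stats import pearsonr

# Hypothetical schema: one row per tweet with a model's binary emotion
# predictions, community-informed silver labels, an AAVE-density score,
# and the tweet's Census tract with its share of Black residents.
df = pd.read_csv("labeled_tweets.csv")  # assumed file, not a released artifact

def anger_false_positive_rate(sub: pd.DataFrame) -> float:
    """FPR: share of anger predictions among tweets whose silver label is not anger."""
    negatives = sub[sub["silver_anger"] == 0]
    return (negatives["pred_anger"] == 1).mean()

# Split by dialect density; the decile thresholds are an assumption.
aave = df[df["aave_score"] >= df["aave_score"].quantile(0.9)]
gae = df[df["aave_score"] <= df["aave_score"].quantile(0.1)]
print(f"anger FPR  AAVE: {anger_false_positive_rate(aave):.2f}  "
      f"GAE: {anger_false_positive_rate(gae):.2f}")

# Tract-level association between demographics and predicted emotion,
# mirroring the reported r = 0.27 (anger) and r = -0.10 (joy).
tracts = df.groupby("census_tract").agg(
    pct_black=("pct_black_residents", "first"),
    anger_rate=("pred_anger", "mean"),
    joy_rate=("pred_joy", "mean"),
)
r_anger, _ = pearsonr(tracts["pct_black"], tracts["anger_rate"])
r_joy, _ = pearsonr(tracts["pct_black"], tracts["joy_rate"])
print(f"Pearson r: anger = {r_anger:.2f}, joy = {r_joy:.2f}")
```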
Related papers
- Evaluating the Usage of African-American Vernacular English in Large Language Models [5.242425502046959]
We investigate how accurately large language models (LLMs) represent African American Vernacular English (AAVE).
We compare their usage of AAVE to that of native AAVE speakers.
We find that, in many cases, there are substantial differences between AAVE usage in LLMs and humans.
arXiv Detail & Related papers (2026-02-25T01:28:01Z)
- Emo Pillars: Knowledge Distillation to Support Fine-Grained Context-Aware and Context-Less Emotion Classification [56.974545305472304]
Most datasets for sentiment analysis lack the context in which an opinion was expressed, which is often crucial for emotion understanding, and are mainly limited to a few emotion categories.
We design an LLM-based data synthesis pipeline and leverage a large model, Mistral-7b, to generate training examples for more accessible, lightweight BERT-type encoder models.
We show that Emo Pillars models are highly adaptive to new domains when tuned to specific tasks such as GoEmotions, ISEAR, IEMOCAP, and EmoContext, reaching SOTA performance on the first three.
arXiv Detail & Related papers (2025-04-23T16:23:17Z)
- A Study of Nationality Bias in Names and Perplexity using Off-the-Shelf Affect-related Tweet Classifiers [0.0]
We create counterfactual examples with small perturbations on target-domain data instead of relying on templates or specific datasets for bias detection (a minimal sketch appears after this list).
On widely used classifiers for subjectivity analysis, including sentiment, emotion, and hate speech, our results demonstrate positive biases related to the language spoken in a country.
arXiv Detail & Related papers (2024-07-01T22:17:17Z)
- The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models [78.69526166193236]
Pre-trained Language Models (PLMs) have been acknowledged to contain harmful information, such as social biases.
We propose Social Bias Neurons to accurately pinpoint units (i.e., neurons) in a language model that can be attributed to undesirable behavior, such as social bias.
As measured by prior metrics from StereoSet, our model achieves a higher degree of fairness while maintaining language modeling ability at low cost.
arXiv Detail & Related papers (2024-06-14T15:41:06Z)
- A Comprehensive View of the Biases of Toxicity and Sentiment Analysis Methods Towards Utterances with African American English Expressions [5.472714002128254]
We study bias on two Web-based (YouTube and Twitter) datasets and two spoken English datasets.
We isolate the impact of AAE expression usage via linguistic control features from the Linguistic Inquiry and Word Count software.
We present consistent evidence that heavy usage of AAE expressions can cause a speaker to be rated substantially more toxic, even when discussing nearly the same subject.
arXiv Detail & Related papers (2024-01-23T12:41:03Z)
- Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs [67.51906565969227]
We study the unintended side-effects of persona assignment on the ability of LLMs to perform basic reasoning tasks.
Our study covers 24 reasoning datasets, 4 LLMs, and 19 diverse personas (e.g. an Asian person) spanning 5 socio-demographic groups.
arXiv Detail & Related papers (2023-11-08T18:52:17Z)
- Regional Negative Bias in Word Embeddings Predicts Racial Animus--but only via Name Frequency [2.247786323899963]
We show that anti-black WEAT estimates from geo-tagged social media data strongly correlate with several measures of racial animus.
We also show that every one of these correlations is explained by the frequency of Black names in the underlying corpora relative to White names.
arXiv Detail & Related papers (2022-01-20T20:52:12Z)
- Annotators with Attitudes: How Annotator Beliefs and Identities Bias Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- Investigating African-American Vernacular English in Transformer-Based Text Generation [55.53547556060537]
Social media has encouraged the written use of African American Vernacular English (AAVE).
We investigate the performance of GPT-2 on AAVE text by creating a dataset of intent-equivalent parallel AAVE/SAE tweet pairs.
We find that while AAVE text results in more classifications of negative sentiment than SAE, the use of GPT-2 generally increases occurrences of positive sentiment for both.
arXiv Detail & Related papers (2020-10-06T06:27:02Z)
- Intersectional Bias in Hate Speech and Abusive Language Datasets [0.3149883354098941]
African American tweets were up to 3.7 times more likely to be labeled as abusive.
African American male tweets were up to 77% more likely to be labeled as hateful.
This study provides the first systematic evidence on intersectional bias in datasets of hate speech and abusive language.
arXiv Detail & Related papers (2020-05-12T16:58:48Z)
- It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations [68.16751625956243]
Training on only perfect Standard English corpora predisposes neural networks to discriminate against minorities from non-standard linguistic backgrounds.
We perturb the inflectional morphology of words to craft plausible and semantically similar adversarial examples (a toy illustration appears after this list).
arXiv Detail & Related papers (2020-05-09T04:01:43Z)
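The counterfactual-perturbation setup of "A Study of Nationality Bias in Names and Perplexity" above can be sketched in a few lines: make one small edit to otherwise identical target-domain text (here, a personal name swap) and measure the shift in classifier output. This is a minimal illustration assuming the default transformers sentiment pipeline; the template and names are invented for the example.

```python
from transformers import pipeline

sentiment = pipeline("sentiment-analysis")  # any off-the-shelf affect classifier

def signed_score(text: str) -> float:
    """Map the pipeline output to a signed sentiment score in [-1, 1]."""
    out = sentiment(text)[0]
    return out["score"] if out["label"] == "POSITIVE" else -out["score"]

def name_swap_gap(template: str, name_a: str, name_b: str) -> float:
    """Sentiment shift induced by swapping only the personal name."""
    return signed_score(template.format(name=name_a)) - \
           signed_score(template.format(name=name_b))

# A systematically nonzero gap across many templates indicates name-driven bias.
print(name_swap_gap("{name} is on their way to the meeting.", "Connor", "Jamal"))
```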
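Similarly, the inflectional perturbations of "It's Morphin' Time!" above can be illustrated with a toy rule that re-inflects a verb into a plausible non-standard form and checks whether the classifier under test changes its prediction on the semantically equivalent pair. The single hand-written regex rule is a stand-in; the actual method searches over many candidate inflections.

```python
import re

def drop_third_person_s(sentence: str, verb_stem: str) -> str:
    """Toy rule: 'she goes' -> 'she go', a common non-standard inflection."""
    return re.sub(rf"\b{verb_stem}e?s\b", verb_stem, sentence)

original = "She goes to work every day and deserves respect."
perturbed = drop_third_person_s(original, "go")
print(original)
print(perturbed)
# Feed both sentences through the classifier under test; a label change on
# this semantically equivalent pair flags inflectional discrimination.
```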