Intersectional Bias in Hate Speech and Abusive Language Datasets
- URL: http://arxiv.org/abs/2005.05921v3
- Date: Thu, 28 May 2020 05:49:19 GMT
- Title: Intersectional Bias in Hate Speech and Abusive Language Datasets
- Authors: Jae Yeon Kim, Carlos Ortiz, Sarah Nam, Sarah Santiago, Vivek Datta
- Abstract summary: African American tweets were up to 3.7 times more likely to be labeled as abusive.
African American male tweets were up to 77% more likely to be labeled as hateful.
This study provides the first systematic evidence on intersectional bias in datasets of hate speech and abusive language.
- Score: 0.3149883354098941
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Algorithms are widely applied to detect hate speech and abusive language in
social media. We investigated whether the human-annotated data used to train
these algorithms are biased. We utilized a publicly available annotated Twitter
dataset (Founta et al. 2018) and classified the racial, gender, and party
identification dimensions of 99,996 tweets. The results showed that African
American tweets were up to 3.7 times more likely to be labeled as abusive, and
African American male tweets were up to 77% more likely to be labeled as
hateful compared to the others. These patterns were statistically significant
and robust even when party identification was added as a control variable. This
study provides the first systematic evidence on intersectional bias in datasets
of hate speech and abusive language.
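The abstract's "3.7 times more likely" figure is a relative labeling rate between demographic groups. Below is a minimal sketch of that comparison, using entirely made-up counts; the study itself establishes the patterns with regression models that include party identification as a control variable, which this sketch does not attempt.

```python
# Sketch of a relative labeling rate: how many times more likely one
# group's tweets are to receive a label (e.g. "abusive") than tweets
# from all other groups. All counts are hypothetical.

def label_rate(labeled: int, total: int) -> float:
    """Fraction of a group's tweets that received the label."""
    return labeled / total

def relative_likelihood(group_labeled: int, group_total: int,
                        other_labeled: int, other_total: int) -> float:
    """Ratio of the group's labeling rate to everyone else's."""
    return (label_rate(group_labeled, group_total)
            / label_rate(other_labeled, other_total))

# Counts chosen only so the ratio lands near the reported 3.7x.
ratio = relative_likelihood(group_labeled=370, group_total=1000,
                            other_labeled=100, other_total=1000)
print(round(ratio, 1))  # 3.7
```

A ratio of 1.0 would mean both groups are labeled at the same rate; values above 1.0 indicate the disparity the paper reports.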
Related papers
- On the Use of Proxies in Political Ad Targeting [49.61009579554272]
We show that major political advertisers circumvented mitigations by targeting proxy attributes.
Our findings have crucial implications for the ongoing discussion on the regulation of political advertising.
arXiv Detail & Related papers (2024-10-18T17:15:13Z)
- Analysis and Detection of Multilingual Hate Speech Using Transformer Based Deep Learning [7.332311991395427]
As the prevalence of hate speech increases online, the demand for automated detection as an NLP task is increasing.
In this work, the proposed method uses a transformer-based model to detect hate speech on social media platforms such as Twitter, Facebook, WhatsApp, and Instagram.
The gold-standard datasets were collected from the renowned researchers Zeerak Talat, Sara Tonelli, Melanie Siegel, and Rezaul Karim.
The proposed model outperforms the existing baseline and state-of-the-art models, with an accuracy of 89% on the Bengali dataset, 91% on English, and in German
arXiv Detail & Related papers (2024-01-19T20:40:23Z)
- Overview of Abusive and Threatening Language Detection in Urdu at FIRE 2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, an m-BERT-based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z)
- Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes [1.5039745292757671]
We investigate bias in hate speech datasets along racial, gender and intersectional axes.
We identify strong bias against African American English (AAE), masculine and AAE+Masculine tweets.
arXiv Detail & Related papers (2022-05-13T13:13:46Z)
- Manipulating Twitter Through Deletions [64.33261764633504]
Research into influence campaigns on Twitter has mostly relied on identifying malicious activities from tweets obtained via public APIs.
Here, we provide the first exhaustive, large-scale analysis of anomalous deletion patterns involving more than a billion deletions by over 11 million accounts.
We find that a small fraction of accounts delete a large number of tweets daily, enabling two forms of abuse.
First, limits on tweet volume are circumvented, allowing certain accounts to flood the network with over 26 thousand daily tweets.
Second, coordinated networks of accounts engage in repetitive likes and unlikes of content that is eventually deleted, which can manipulate ranking algorithms.
arXiv Detail & Related papers (2022-03-25T20:07:08Z)
- Emojis as Anchors to Detect Arabic Offensive Language and Hate Speech [6.1875341699258595]
We introduce a generic, language-independent method to collect a large percentage of offensive and hate tweets.
We harness the extralinguistic information embedded in the emojis to collect a large number of offensive tweets.
arXiv Detail & Related papers (2022-01-18T03:56:57Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Probabilistic Impact Score Generation using Ktrain-BERT to Identify Hate Words from Twitter Discussions [0.5735035463793008]
This paper presents experimentation with a Keras-wrapped lightweight BERT model to identify hate speech.
The dataset used for this task is the Hate Speech and Offensive Content Detection (HASOC 2021) data from FIRE 2021 in English.
Our system obtained a validation accuracy of 82.60%, with a maximum F1-Score of 82.68%.
arXiv Detail & Related papers (2021-11-25T06:35:49Z)
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- Detecting White Supremacist Hate Speech using Domain Specific Word Embedding with Deep Learning and BERT [0.0]
White supremacist hate speech is one of the most recently observed forms of harmful content on social media.
This research investigates the viability of automatically detecting white supremacist hate speech on Twitter by using deep learning and natural language processing techniques.
arXiv Detail & Related papers (2020-10-01T12:44:24Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.