Detecting White Supremacist Hate Speech using Domain Specific Word
Embedding with Deep Learning and BERT
- URL: http://arxiv.org/abs/2010.00357v1
- Date: Thu, 1 Oct 2020 12:44:24 GMT
- Title: Detecting White Supremacist Hate Speech using Domain Specific Word
Embedding with Deep Learning and BERT
- Authors: Hind Saleh Alatawi, Areej Maatog Alhothali and Kawthar Mustafa Moria
- Abstract summary: White supremacist hate speech is among the most recently observed forms of harmful content on social media.
This research investigates the viability of automatically detecting white supremacist hate speech on Twitter by using deep learning and natural language processing techniques.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: White supremacists embrace a radical ideology that considers white people superior to people of other races. The critical influence of these groups is no longer limited to social media; they also affect society in many ways by promoting racial hatred and violence. White supremacist hate speech is among the most recently observed forms of harmful content on social media. Traditional channels for reporting hate speech have proved inadequate given the tremendous explosion of information, so such speech needs to be detected automatically and in a timely manner. This research investigates the viability of automatically detecting white supremacist hate speech on Twitter using deep learning and natural language processing techniques. We experimented with two approaches. The first uses domain-specific embeddings, extracted from a white supremacist corpus to capture the meaning of white supremacist slang, with a bidirectional Long Short-Term Memory (LSTM) deep learning model; it reached a 0.74890 F1-score. The second uses BERT, one of the most recent language models, which achieves state-of-the-art results on most NLP tasks; it reached a 0.79605 F1-score. Both approaches were tested on a balanced dataset, and our experiments were based on textual data only. The dataset combines a dataset created from Twitter with a Stormfront dataset compiled from that white supremacist forum.
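To make the first approach concrete, here is a minimal sketch of domain-specific word embeddings feeding a bidirectional LSTM classifier. It assumes a gensim/Keras stack; the toy corpus, vocabulary handling, layer sizes, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch of approach 1: domain-specific embeddings + BiLSTM.
# All names, sizes, and hyperparameters below are illustrative assumptions.
import numpy as np
from gensim.models import Word2Vec
from tensorflow.keras import layers, models, initializers

EMBED_DIM = 100  # assumed embedding size

# Toy stand-in for a tokenized white-supremacist corpus (one list per post).
corpus = [
    ["some", "tokenized", "forum", "post"],
    ["another", "tokenized", "tweet"],
]

# 1. Learn domain-specific vectors so in-group slang gets usable embeddings.
w2v = Word2Vec(sentences=corpus, vector_size=EMBED_DIM, window=5, min_count=1)

# 2. Build a vocabulary index (0 reserved for padding) and an embedding matrix.
vocab = {w: i + 1 for i, w in enumerate(w2v.wv.index_to_key)}
emb_matrix = np.zeros((len(vocab) + 1, EMBED_DIM))
for word, idx in vocab.items():
    emb_matrix[idx] = w2v.wv[word]

# 3. Bidirectional LSTM binary classifier on the frozen domain embeddings.
model = models.Sequential([
    layers.Embedding(len(vocab) + 1, EMBED_DIM,
                     embeddings_initializer=initializers.Constant(emb_matrix),
                     trainable=False),
    layers.Bidirectional(layers.LSTM(64)),  # assumed hidden size
    layers.Dense(1, activation="sigmoid"),  # hate vs. not hate
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(padded_id_sequences, labels, ...) would then train the classifier.
```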
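And a minimal sketch of the second approach, fine-tuning BERT as a binary classifier with Hugging Face Transformers. The checkpoint, sequence length, and training settings are assumptions rather than the paper's reported setup.

```python
# Sketch of approach 2: fine-tuning BERT for binary hate-speech detection.
import torch
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = not hate, 1 = hate

# Placeholder data; the paper's balanced Twitter + Stormfront set goes here.
texts = ["an example tweet", "another example tweet"]
labels = [0, 1]
enc = tokenizer(texts, truncation=True, padding=True, max_length=64,
                return_tensors="pt")  # max_length is an assumption

class TweetDataset(torch.utils.data.Dataset):
    """Wraps tokenized tweets and labels for the Trainer."""
    def __init__(self, encodings, labels):
        self.encodings, self.labels = encodings, labels
    def __len__(self):
        return len(self.labels)
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

args = TrainingArguments(output_dir="bert-hate-speech",  # assumed settings
                         num_train_epochs=3, per_device_train_batch_size=16)
Trainer(model=model, args=args,
        train_dataset=TweetDataset(enc, labels)).train()
```

In the paper's setting, the fine-tuned model's predictions would then be evaluated with the F1-score on the held-out portion of the balanced dataset.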
Related papers
- Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles [47.61526125774749]
A dog whistle is a form of coded communication that carries a secondary meaning to specific audiences and is often weaponized for racial and socioeconomic discrimination.
We present an approach for word-sense disambiguation of dog whistles from standard speech using Large Language Models (LLMs).
We leverage this technique to create a dataset of 16,550 high-confidence coded examples of dog whistles used in formal and informal communication.
arXiv Detail & Related papers (2024-06-10T23:09:19Z)
- Analysis and Detection of Multilingual Hate Speech Using Transformer Based Deep Learning [7.332311991395427]
As the prevalence of hate speech increases online, the demand for automated detection as an NLP task is increasing.
In this work, the proposed method uses a transformer-based model to detect hate speech on social media platforms such as Twitter, Facebook, WhatsApp, and Instagram.
Gold-standard datasets were collected from the researchers Zeerak Talat, Sara Tonelli, Melanie Siegel, and Rezaul Karim.
The proposed model achieves higher success rates for hate speech detection than existing baseline and state-of-the-art models, with an accuracy of 89% on the Bengali dataset, 91% on English, and in German
arXiv Detail & Related papers (2024-01-19T20:40:23Z)
- An Investigation of Large Language Models for Real-World Hate Speech Detection [46.15140831710683]
A major limitation of existing methods is that hate speech detection is a highly contextual problem.
Recently, large language models (LLMs) have demonstrated state-of-the-art performance in several natural language tasks.
Our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech.
arXiv Detail & Related papers (2024-01-07T00:39:33Z)
- A Weakly Supervised Classifier and Dataset of White Supremacist Language [6.893512627479197]
We present a dataset and classifier for detecting the language of white supremacist extremism.
Our weakly supervised classifier is trained on large datasets of text from explicitly white supremacist domains paired with neutral and anti-racist data.
arXiv Detail & Related papers (2023-06-27T18:19:32Z)
- From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models [73.25963871034858]
We present the first large-scale computational investigation of dogwhistles.
We develop a typology of dogwhistles, curate the largest-to-date glossary of over 300 dogwhistles, and analyze their usage in historical U.S. politicians' speeches.
We show that harmful content containing dogwhistles avoids toxicity detection, highlighting online risks of such coded language.
arXiv Detail & Related papers (2023-05-26T18:00:57Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance in hate speech datasets, since the high ratio of non-hate to hate examples often leads to low model performance (a minimal class-weighting sketch appears after this list).
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Detection of Hate Speech using BERT and Hate Speech Word Embedding with Deep Model [0.5801044612920815]
This paper investigates the feasibility of leveraging domain-specific word embeddings in a Bidirectional LSTM-based deep model to automatically detect/classify hate speech.
The experiments showed that domain-specific word embeddings with the Bidirectional LSTM-based deep model achieved a 93% F1-score, while BERT achieved up to a 96% F1-score on a balanced dataset combined from available hate speech datasets.
arXiv Detail & Related papers (2021-11-02T11:42:54Z)
- Hate speech detection using static BERT embeddings [0.9176056742068814]
Hate speech is emerging as a major concern: abusive speech that targets specific group characteristics.
In this paper, we analyze the performance of hate speech detection when the usual word embeddings are replaced with, or integrated with, static BERT embeddings (see the static-embedding sketch after this list).
In comparison to fine-tuned BERT, one metric that significantly improved is specificity.
arXiv Detail & Related papers (2021-06-29T16:17:10Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
- Intersectional Bias in Hate Speech and Abusive Language Datasets [0.3149883354098941]
African American tweets were up to 3.7 times more likely to be labeled as abusive.
African American male tweets were up to 77% more likely to be labeled as hateful.
This study provides the first systematic evidence on intersectional bias in datasets of hate speech and abusive language.
arXiv Detail & Related papers (2020-05-12T16:58:48Z)
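The label-imbalance issue raised in "Addressing the Challenges of Cross-Lingual Hate Speech Detection" above is commonly mitigated by weighting the loss inversely to class frequency. A minimal sketch, assuming a PyTorch classifier and scikit-learn's weight helper; the toy labels are illustrative:

```python
# Class-weighted loss for imbalanced hate-speech data (illustrative labels).
import numpy as np
import torch
import torch.nn as nn
from sklearn.utils.class_weight import compute_class_weight

# Toy 80/20 split standing in for a real non-hate-heavy dataset.
labels = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
weights = compute_class_weight(class_weight="balanced",
                               classes=np.array([0, 1]), y=labels)
# -> [0.625, 2.5]: each rare hate example contributes 4x more to the loss.

loss_fn = nn.CrossEntropyLoss(weight=torch.tensor(weights,
                                                  dtype=torch.float32))
logits = torch.randn(4, 2)            # stand-in classifier outputs
targets = torch.tensor([0, 1, 0, 1])  # gold labels for this batch
loss = loss_fn(logits, targets)       # imbalance-aware training signal
```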
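For "Hate speech detection using static BERT embeddings" above, one simple way to obtain static (context-free) BERT vectors is to read them from the model's input embedding table, averaging over WordPiece pieces. The summary does not specify that paper's exact extraction method, so this is an assumption:

```python
# Deriving static word vectors from BERT's input embedding table.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
emb_table = bert.get_input_embeddings().weight  # (vocab_size, hidden_size)

def static_vector(word: str) -> torch.Tensor:
    """Average the input-embedding rows of the word's WordPiece tokens."""
    ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    return emb_table[ids].mean(dim=0).detach()

vec = static_vector("hate")  # fixed 768-d vector, usable like Word2Vec output
```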