Understanding and Detecting Hateful Content using Contrastive Learning
- URL: http://arxiv.org/abs/2201.08387v2
- Date: Tue, 17 May 2022 01:34:45 GMT
- Title: Understanding and Detecting Hateful Content using Contrastive Learning
- Authors: Felipe González-Pizarro, Savvas Zannettou
- Abstract summary: This work contributes to research efforts to detect and understand hateful content on the Web.
We devise a methodology to identify a set of Antisemitic and Islamophobic hateful textual phrases.
We then use OpenAI's CLIP to identify images that are highly similar to our Antisemitic/Islamophobic textual phrases.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The spread of hate speech and hateful imagery on the Web is a significant
problem that needs to be mitigated to improve our Web experience. This work
contributes to research efforts to detect and understand hateful content on the
Web by undertaking a multimodal analysis of Antisemitism and Islamophobia on
4chan's /pol/ using OpenAI's CLIP. This large pre-trained model uses the
Contrastive Learning paradigm. We devise a methodology to identify a set of
Antisemitic and Islamophobic hateful textual phrases using Google's Perspective
API and manual annotations. Then, we use OpenAI's CLIP to identify images that
are highly similar to our Antisemitic/Islamophobic textual phrases. By running
our methodology on a dataset that includes 66M posts and 5.8M images shared on
4chan's /pol/ for 18 months, we detect 173K posts containing 21K
Antisemitic/Islamophobic images and 246K posts that include 420 hateful
phrases. Among other things, we find that we can use OpenAI's CLIP model to
detect hateful content with an accuracy score of 0.81 (F1 score = 0.54). By
comparing CLIP with two baselines from the literature, we find that CLIP
outperforms them in accuracy, precision, and F1 score when detecting
Antisemitic/Islamophobic images. Also, we find that Antisemitic/Islamophobic
imagery is shared in a similar number of posts on 4chan's /pol/ compared to
Antisemitic/Islamophobic textual phrases, highlighting the need to design more
tools for detecting hateful imagery. Finally, we make available (upon request)
a dataset of 246K posts containing 420 Antisemitic/Islamophobic phrases and 21K
likely Antisemitic/Islamophobic images (automatically detected by CLIP) that
can assist researchers in further understanding Antisemitism and Islamophobia.
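The detection step described in the abstract — embedding hateful phrases and candidate images with CLIP, then flagging images whose similarity to any phrase is high — can be sketched as follows. This is a minimal illustration of the cosine-similarity matching, not the paper's implementation: the embeddings below are toy stand-ins (real ones would come from CLIP's text and image encoders), and the threshold value is hypothetical.

```python
import numpy as np

def normalize(v):
    # Scale each row to unit length so that dot products equal cosine similarity.
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def flag_images(image_embs, phrase_embs, threshold=0.3):
    """For each image, report whether its best-matching hateful phrase exceeds
    the similarity threshold, along with the index of that phrase."""
    sims = normalize(image_embs) @ normalize(phrase_embs).T  # (n_images, n_phrases)
    return sims.max(axis=1) >= threshold, sims.argmax(axis=1)

# Toy stand-in embeddings; in the paper these would be CLIP encoder outputs.
phrases = np.array([[1.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0]])
images = np.array([[0.9, 0.1, 0.0],   # very close to phrase 0 -> flagged
                   [0.0, 0.0, 1.0]])  # unrelated to both phrases -> not flagged
flags, matches = flag_images(images, phrases)
```

Because CLIP places images and text in a shared embedding space, this single dot product suffices to compare an image against every phrase at once, which is what makes the approach tractable over 5.8M images.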
Related papers
- Analyzing Islamophobic Discourse Using Semi-Coded Terms and LLMs [2.5081530863229307]
This paper performs a large-scale analysis of specialized, semi-coded Islamophobic terms (such as muzrat, pislam, mudslime, mohammedan, and muzzies) circulated on extremist social platforms.
Many of these terms appear lexically neutral or ambiguous outside of specific contexts, making them difficult for both human moderators and automated systems to reliably identify as hate speech.
arXiv Detail & Related papers (2025-03-24T01:41:24Z)
- MIMIC: Multimodal Islamophobic Meme Identification and Classification [1.2647816797166167]
Anti-Muslim hate speech has emerged within memes, characterized by context-dependent and rhetorical messages.
This work presents a novel dataset and proposes a classifier based on the Vision-and-Language Transformer (ViLT) specifically tailored to identify anti-Muslim hate within memes.
arXiv Detail & Related papers (2024-12-01T05:44:01Z)
- Exploiting Hatred by Targets for Hate Speech Detection on Vietnamese Social Media Texts [0.0]
We first introduce the ViTHSD - a targeted hate speech detection dataset for Vietnamese Social Media Texts.
The dataset contains 10K comments, each labeled with specific targets at three levels: clean, offensive, and hate.
The inter-annotator agreement on the dataset is 0.45 by Cohen's Kappa, which indicates a moderate level of agreement.
arXiv Detail & Related papers (2024-04-30T04:16:55Z) - Overview of the HASOC Subtrack at FIRE 2023: Identification of Tokens
Contributing to Explicit Hate in English by Span Detection [40.10513344092731]
Reactively, using black-box models to identify hateful content can perplex users as to why their posts were automatically flagged as hateful.
Proactive mitigation can be achieved by suggesting rephrasing before a post is made public.
arXiv Detail & Related papers (2023-11-16T12:01:19Z) - How toxic is antisemitism? Potentials and limitations of automated
toxicity scoring for antisemitic online content [0.0]
Perspective API is a text toxicity assessment service by Google and Jigsaw.
We show how toxic antisemitic texts are rated and how the toxicity scores differ regarding different subforms of antisemitism.
We show that, on a basic level, Perspective API recognizes antisemitic content as toxic, but shows critical weaknesses with respect to non-explicit forms of antisemitism.
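Both this paper and the main work above rely on Perspective API's toxicity scores. A sketch of how such a request and response are shaped is below; the `TOXICITY` attribute and response structure follow the API's documented schema, but the API key, comment text, and score value are stand-ins, and no network call is made here.

```python
# Hypothetical helper for Google's Perspective API (comments:analyze endpoint).
API_URL = ("https://commentanalyzer.googleapis.com/v1alpha1/"
           "comments:analyze?key=YOUR_API_KEY")

def build_request(text):
    # Request a TOXICITY score for a single comment.
    return {
        "comment": {"text": text},
        "requestedAttributes": {"TOXICITY": {}},
    }

def extract_score(response):
    # Pull the summary toxicity probability (0..1) out of the API response.
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]

payload = build_request("example comment")
# A real call would POST `payload` as JSON to API_URL; here we parse a sample
# response of the documented shape instead.
sample = {"attributeScores": {"TOXICITY": {"summaryScore": {"value": 0.12}}}}
score = extract_score(sample)
```

The paper's finding suggests that thresholding this single summary score is a reasonable first-pass filter for explicit hate but will miss coded or implicit antisemitism.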
arXiv Detail & Related papers (2023-10-05T15:23:04Z)
- Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved with Text [130.89493542553151]
In-context vision and language models like Flamingo support arbitrarily interleaved sequences of images and text as input.
To support this interface, pretraining occurs over web corpora that similarly contain interleaved images+text.
We release Multimodal C4, an augmentation of the popular text-only C4 corpus with images interleaved.
arXiv Detail & Related papers (2023-04-14T06:17:46Z)
- On the Evolution of (Hateful) Memes by Means of Multimodal Contrastive Learning [18.794226796466962]
We study how hateful memes are created by combining visual elements from multiple images or fusing textual information with a hateful image.
Using our framework on a dataset extracted from 4chan, we find 3.3K variants of the Happy Merchant meme.
We envision that our framework can be used to aid human moderators by flagging new variants of hateful memes.
arXiv Detail & Related papers (2022-12-13T13:38:04Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Twitter-COMMs: Detecting Climate, COVID, and Military Multimodal Misinformation [83.2079454464572]
This paper describes our approach to the Image-Text Inconsistency Detection challenge of the DARPA Semantic Forensics (SemaFor) Program.
We collect Twitter-COMMs, a large-scale multimodal dataset with 884k tweets relevant to the topics of Climate Change, COVID-19, and Military Vehicles.
We train our approach, based on the state-of-the-art CLIP model, leveraging automatically generated random and hard negatives.
arXiv Detail & Related papers (2021-12-16T03:37:20Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
- Keystroke Biometrics in Response to Fake News Propagation in a Global Pandemic [77.79066811371978]
This work proposes and analyzes the use of keystroke biometrics for content de-anonymization.
Fake news has become a powerful tool to manipulate public opinion, especially during major events.
arXiv Detail & Related papers (2020-05-15T17:56:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.