Codes, Patterns and Shapes of Contemporary Online Antisemitism and
Conspiracy Narratives -- an Annotation Guide and Labeled German-Language
Dataset in the Context of COVID-19
- URL: http://arxiv.org/abs/2210.07934v1
- Date: Thu, 13 Oct 2022 10:32:39 GMT
- Authors: Elisabeth Steffen, Helena Mihaljević, Milena Pustet, Nyco Bischoff,
María do Mar Castro Varela, Yener Bayramoğlu, Bahar Oghalai
- Abstract summary: The sheer volume of antisemitic and conspiracy theory content on the Internet makes data-driven algorithmic approaches essential.
We develop an annotation guide for antisemitic and conspiracy theory online content in the context of the COVID-19 pandemic.
We provide working definitions, including specific forms of antisemitism such as encoded and post-Holocaust antisemitism.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Over the course of the COVID-19 pandemic, existing conspiracy theories were
refreshed and new ones were created, often interwoven with antisemitic
narratives, stereotypes and codes. The sheer volume of antisemitic and
conspiracy theory content on the Internet makes data-driven algorithmic
approaches essential for anti-discrimination organizations and researchers
alike. However, the manifestation and dissemination of these two interrelated
phenomena remain under-researched in empirical studies of
large text corpora. Algorithmic approaches for the detection and classification
of specific contents usually require labeled datasets, annotated based on
conceptually sound guidelines. While there is a growing number of datasets for
the more general phenomenon of hate speech, the development of corpora and
annotation guidelines for antisemitic and conspiracy content is still in its
infancy, especially for languages other than English. We contribute to closing
this gap by developing an annotation guide for antisemitic and conspiracy
theory online content in the context of the COVID-19 pandemic. We provide
working definitions, including specific forms of antisemitism such as encoded
and post-Holocaust antisemitism. We use these to annotate a German-language
dataset consisting of ~3,700 Telegram messages sent between 03/2020 and
12/2021.
Related papers
- What distinguishes conspiracy from critical narratives? A computational analysis of oppositional discourse [42.0918839418817]
We propose a novel topic-agnostic annotation scheme that distinguishes between conspiracies and critical texts.
We also contribute the multilingual XAI-DisInfodemics corpus (English and Spanish), which contains high-quality annotations of Telegram messages.
arXiv Detail & Related papers (2024-07-15T14:18:47Z)
- Wav2Gloss: Generating Interlinear Glossed Text from Speech [78.64412090339044]
We propose Wav2Gloss, a task in which four linguistic annotation components are extracted automatically from speech.
We provide various baselines to lay the groundwork for future research on Interlinear Glossed Text generation from speech.
arXiv Detail & Related papers (2024-03-19T21:45:29Z)
- Monitoring the evolution of antisemitic discourse on extremist social media using BERT [3.3037858066178662]
Racism and intolerance on social media contribute to a toxic online environment which may spill offline to foster hatred.
Tracking antisemitic themes and their associated terminology over time in online discussions could help monitor the sentiments of their participants.
arXiv Detail & Related papers (2024-02-06T20:34:49Z)
- Using LLMs to discover emerging coded antisemitic hate-speech in extremist social media [4.104047892870216]
This paper proposes a methodology for detecting emerging coded hate-laden terminology.
The methodology is tested in the context of online antisemitic discourse.
arXiv Detail & Related papers (2024-01-19T17:40:50Z)
- Into the LAIONs Den: Investigating Hate in Multimodal Datasets [67.21783778038645]
This paper investigates the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B.
We found that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively.
We also found that filtering dataset contents based on Not Safe For Work (NSFW) values calculated based on images alone does not exclude all the harmful content in alt-text.
arXiv Detail & Related papers (2023-11-06T19:00:05Z)
- How toxic is antisemitism? Potentials and limitations of automated toxicity scoring for antisemitic online content [0.0]
Perspective API is a text toxicity assessment service by Google and Jigsaw.
We show how toxic antisemitic texts are rated and how the toxicity scores differ regarding different subforms of antisemitism.
We show that, on a basic level, Perspective API recognizes antisemitic content as toxic, but shows critical weaknesses with respect to non-explicit forms of antisemitism.
arXiv Detail & Related papers (2023-10-05T15:23:04Z)
- From Dogwhistles to Bullhorns: Unveiling Coded Rhetoric with Language Models [73.25963871034858]
We present the first large-scale computational investigation of dogwhistles.
We develop a typology of dogwhistles, curate the largest-to-date glossary of over 300 dogwhistles, and analyze their usage in historical U.S. politicians' speeches.
We show that harmful content containing dogwhistles avoids toxicity detection, highlighting online risks of such coded language.
arXiv Detail & Related papers (2023-05-26T18:00:57Z)
- Antisemitic Messages? A Guide to High-Quality Annotation and a Labeled Dataset of Tweets [0.0]
We create a labeled dataset of 6,941 tweets that cover a wide range of topics common in conversations about Jews, Israel, and antisemitism.
The dataset includes 1,250 tweets (18%) that are antisemitic according to the International Holocaust Remembrance Alliance (IHRA) definition of antisemitism.
arXiv Detail & Related papers (2023-04-28T02:52:38Z)
- O-Dang! The Ontology of Dangerous Speech Messages [53.15616413153125]
We present O-Dang!: The Ontology of Dangerous Speech Messages, a systematic and interoperable Knowledge Graph (KG).
O-Dang! is designed to gather and organize Italian datasets into a structured KG, according to the principles shared within the Linguistic Linked Open Data community.
It provides a model for encoding both gold standard and single-annotator labels in the KG.
arXiv Detail & Related papers (2022-07-13T11:50:05Z)
- Latent Topology Induction for Understanding Contextualized Representations [84.7918739062235]
We study the representation space of contextualized embeddings and gain insight into the hidden topology of large language models.
We show there exists a network of latent states that summarize linguistic properties of contextualized representations.
arXiv Detail & Related papers (2022-06-03T11:22:48Z)
- "Subverting the Jewtocracy": Online Antisemitism Detection Using Multimodal Deep Learning [23.048101866010445]
We present the first work in the direction of automated multimodal detection of online antisemitism.
We label two datasets with 3,102 and 3,509 social media posts from Twitter and Gab respectively.
We present a multimodal deep learning system that detects the presence of antisemitic content and its specific antisemitism category using text and images from posts.
arXiv Detail & Related papers (2021-04-13T05:22:55Z)