How toxic is antisemitism? Potentials and limitations of automated
toxicity scoring for antisemitic online content
- URL: http://arxiv.org/abs/2310.04465v1
- Date: Thu, 5 Oct 2023 15:23:04 GMT
- Title: How toxic is antisemitism? Potentials and limitations of automated
toxicity scoring for antisemitic online content
- Authors: Helena Mihaljević and Elisabeth Steffen
- Abstract summary: Perspective API is a text toxicity assessment service by Google and Jigsaw.
We show how toxic antisemitic texts are rated and how the toxicity scores differ across subforms of antisemitism.
We show that, on a basic level, Perspective API recognizes antisemitic content as toxic, but shows critical weaknesses with respect to non-explicit forms of antisemitism.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Perspective API, a popular text toxicity assessment service by Google and
Jigsaw, has found wide adoption in several application areas, notably content
moderation, monitoring, and social media research. We examine its potentials
and limitations for the detection of antisemitic online content that, by
definition, falls under the toxicity umbrella term. Using a manually annotated
German-language dataset comprising around 3,600 posts from Telegram and
Twitter, we explore how toxic antisemitic texts are rated and how the
toxicity scores differ across subforms of antisemitism and the
stance expressed in the texts. We show that, on a basic level, Perspective API
recognizes antisemitic content as toxic, but shows critical weaknesses with
respect to non-explicit forms of antisemitism and texts taking a critical
stance towards it. Furthermore, using simple text manipulations, we demonstrate
that the use of widespread antisemitic codes can substantially reduce API
scores, making it rather easy to bypass content moderation based on the
service's results.
Related papers
- Monitoring the evolution of antisemitic discourse on extremist social media using BERT [3.3037858066178662]
Racism and intolerance on social media contribute to a toxic online environment which may spill offline to foster hatred.
Tracking antisemitic themes and their associated terminology over time in online discussions could help monitor the sentiments of their participants.
arXiv Detail & Related papers (2024-02-06T20:34:49Z)
- Using LLMs to discover emerging coded antisemitic hate-speech in extremist social media [4.104047892870216]
This paper proposes a methodology for detecting emerging coded hate-laden terminology.
The methodology is tested in the context of online antisemitic discourse.
arXiv Detail & Related papers (2024-01-19T17:40:50Z)
- Comprehensive Assessment of Toxicity in ChatGPT [49.71090497696024]
We evaluate the toxicity in ChatGPT by utilizing instruction-tuning datasets.
Prompts in creative writing tasks can be 2x more likely to elicit toxic responses.
Certain deliberately toxic prompts, designed in earlier studies, no longer yield harmful responses.
arXiv Detail & Related papers (2023-11-03T14:37:53Z)
- Antisemitic Messages? A Guide to High-Quality Annotation and a Labeled Dataset of Tweets [0.0]
We create a labeled dataset of 6,941 tweets that cover a wide range of topics common in conversations about Jews, Israel, and antisemitism.
The dataset includes 1,250 tweets (18%) that are antisemitic according to the International Holocaust Remembrance Alliance (IHRA) definition of antisemitism.
arXiv Detail & Related papers (2023-04-28T02:52:38Z)
- Detoxifying Text with MaRCo: Controllable Revision with Experts and Anti-Experts [57.38912708076231]
We introduce MaRCo, a detoxification algorithm that combines controllable generation and text rewriting methods.
MaRCo uses likelihoods under a non-toxic LM and a toxic LM to find candidate words to mask and potentially replace.
We evaluate our method on several subtle toxicity and microaggression datasets and show that it not only outperforms baselines on automatic metrics, but that MaRCo's rewrites are preferred 2.1× more often in human evaluation.
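The masking step described for MaRCo (comparing likelihoods under a toxic and a non-toxic LM) can be illustrated with a toy sketch. The hand-specified unigram probabilities below merely stand in for the two language models and are not from the paper.

```python
def mask_candidates(tokens, p_nontoxic, p_toxic, threshold=2.0):
    """Mask tokens that the toxic LM scores much higher than the non-toxic LM."""
    masked = []
    for t in tokens:
        # Likelihood ratio; tiny floor avoids division by zero for unseen tokens.
        ratio = p_toxic.get(t, 1e-9) / p_nontoxic.get(t, 1e-9)
        masked.append("<mask>" if ratio >= threshold else t)
    return masked


# Hand-specified probabilities standing in for the two LMs (illustrative only).
p_nontoxic = {"you": 0.20, "are": 0.20, "a": 0.15, "genius": 0.05, "idiot": 0.001}
p_toxic = {"you": 0.20, "are": 0.20, "a": 0.15, "genius": 0.001, "idiot": 0.05}

print(mask_candidates(["you", "are", "a", "idiot"], p_nontoxic, p_toxic))
# → ['you', 'are', 'a', '<mask>']
```

In MaRCo proper, the masked positions are then rewritten under the non-toxic model; this sketch covers only the candidate-selection heuristic.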
arXiv Detail & Related papers (2022-12-20T18:50:00Z)
- Codes, Patterns and Shapes of Contemporary Online Antisemitism and Conspiracy Narratives -- an Annotation Guide and Labeled German-Language Dataset in the Context of COVID-19 [0.0]
The volume of antisemitic and conspiracy theory content on the Internet makes data-driven algorithmic approaches essential.
We develop an annotation guide for antisemitic and conspiracy theory online content in the context of the COVID-19 pandemic.
We provide working definitions, including specific forms of antisemitism such as encoded and post-Holocaust antisemitism.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
- Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topics.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-02-22T20:55:31Z)
- A New Generation of Perspective API: Efficient Multilingual Character-level Transformers [66.9176610388952]
We present the fundamentals behind the next version of the Perspective API from Google Jigsaw.
At the heart of the approach is a single multilingual token-free Charformer model.
We demonstrate that by forgoing static vocabularies, we gain flexibility across a variety of settings.
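The token-free idea behind this Charformer-based Perspective version can be shown minimally: when inputs are raw UTF-8 bytes, no static vocabulary is needed and no string in any language is out-of-vocabulary. A sketch of the input encoding only, not the paper's model:

```python
def byte_ids(text: str) -> list[int]:
    """Map any string to UTF-8 byte IDs (0-255): no vocabulary, no OOV tokens."""
    return list(text.encode("utf-8"))


# Non-ASCII characters simply expand to several byte IDs.
print(byte_ids("ü"))  # two bytes for one character
```

A subword tokenizer would need "ü"-containing words in its vocabulary (or fall back to an unknown token); the byte view sidesteps that, which is the flexibility gain the abstract refers to.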
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-04-13T05:22:55Z)
- "Subverting the Jewtocracy": Online Antisemitism Detection Using Multimodal Deep Learning [23.048101866010445]
We present the first work in the direction of automated multimodal detection of online antisemitism.
We label two datasets with 3,102 and 3,509 social media posts from Twitter and Gab respectively.
We present a multimodal deep learning system that detects the presence of antisemitic content and its specific antisemitism category using text and images from posts.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.