Hateful Messages: A Conversational Data Set of Hate Speech produced by
Adolescents on Discord
- URL: http://arxiv.org/abs/2309.01413v1
- Date: Mon, 4 Sep 2023 07:48:52 GMT
- Authors: Jan Fillies, Silvio Peikert, Adrian Paschke
- Abstract summary: This research addresses the bias of youth language within hate speech classification.
The data set consists of publicly available online messages from the chat platform Discord.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the rise of social media, a rise of hateful content can be observed.
Even though the understanding and definitions of hate speech vary, platforms,
communities, and legislatures all acknowledge the problem. Meanwhile,
adolescents are a new and active group of social media users. The majority of
adolescents experience or witness online hate speech. Research in the field of
automated hate speech classification has been on the rise and focuses on
aspects such as bias, generalizability, and performance. To increase
generalizability and performance, it is important to understand biases within
the data. This research addresses the bias of youth language within hate speech
classification and contributes by providing a modern and anonymized hate speech
youth language data set consisting of 88,395 annotated chat messages. The data
set consists of publicly available online messages from the chat platform
Discord. About 6.42% of the messages were classified as hate speech using a
self-developed annotation schema. For 35,553 messages, the user profiles provided
age annotations, putting the average author age at under 20 years.
Related papers
- ProvocationProbe: Instigating Hate Speech Dataset from Twitter [0.39052860539161904]
ProvocationProbe is a dataset designed to explore what distinguishes instigating hate speech from general hate speech.
For this study, we collected around twenty thousand tweets from Twitter, encompassing a total of nine global controversies.
arXiv Detail & Related papers (2024-10-25T16:57:59Z)
- Analyzing User Characteristics of Hate Speech Spreaders on Social Media [20.57872238271025]
We analyze the role of user characteristics in hate speech resharing across different types of hate speech.
We find that users with little social influence tend to share more hate speech.
Political anti-Trump and anti-right-wing hate is reshared by users with larger social influence.
arXiv Detail & Related papers (2023-10-24T12:17:48Z)
- Analyzing Norm Violations in Live-Stream Chat [49.120561596550395]
We present the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms.
We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch.
Our results show that appropriate contextual information can boost moderation performance by 35%.
arXiv Detail & Related papers (2023-05-18T05:58:27Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topic.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Hate Speech Detection in Clubhouse [6.942237543984334]
We analyze the collected instances from a statistical point of view using the Google Perspective Scores.
Our experiments show that the Perspective Scores can outperform Bag of Words and Word2Vec as high-level text features.
arXiv Detail & Related papers (2021-06-24T11:00:19Z)
- Towards generalisable hate speech detection: a review on obstacles and solutions [6.531659195805749]
This survey paper attempts to summarise how generalisable existing hate speech detection models are.
It sums up existing attempts at addressing the main obstacles, and then proposes directions of future research to improve generalisation in hate speech detection.
arXiv Detail & Related papers (2021-02-17T17:27:48Z)
- Speaker De-identification System using Autoencoders and Adversarial Training [58.720142291102135]
We propose a speaker de-identification system based on adversarial training and autoencoders.
Experimental results show that combining adversarial learning and autoencoders increases the equal error rate of a speaker verification system.
arXiv Detail & Related papers (2020-11-09T19:22:05Z)
- Investigating Deep Learning Approaches for Hate Speech Detection in Social Media [20.974715256618754]
The misuse of freedom of expression has led to the increase of various cyber crimes and anti-social activities.
Hate speech is one such issue that needs to be addressed very seriously as otherwise, this could pose threats to the integrity of the social fabrics.
In this paper, we propose deep learning approaches utilizing various embeddings for detecting various types of hate speech in social media.
arXiv Detail & Related papers (2020-05-29T17:28:46Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.