IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language
- URL: http://arxiv.org/abs/2406.19349v1
- Date: Thu, 27 Jun 2024 17:26:38 GMT
- Title: IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language
- Authors: Lucky Susanto, Musa Izzanardi Wijanarko, Prasetia Anugrah Pratama, Traci Hong, Ika Idris, Alham Fikri Aji, Derry Wijaya
- Abstract summary: We introduce IndoToxic2024, a comprehensive Indonesian hate speech and toxicity classification dataset.
Comprising 43,692 entries annotated by 19 diverse individuals, the dataset focuses on texts targeting vulnerable groups.
We establish baselines for seven binary classification tasks, achieving a macro-F1 score of 0.78 with a BERT model fine-tuned for hate speech classification.
- Score: 11.463652750122398
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Hate speech poses a significant threat to social harmony. Over the past two years, Indonesia has seen a ten-fold increase in the online hate speech ratio, underscoring the urgent need for effective detection mechanisms. However, progress is hindered by the limited availability of labeled data for Indonesian texts. The condition is even worse for marginalized minorities, such as Shia, LGBTQ, and other ethnic minorities because hate speech is underreported and less understood by detection tools. Furthermore, the lack of accommodation for subjectivity in current datasets compounds this issue. To address this, we introduce IndoToxic2024, a comprehensive Indonesian hate speech and toxicity classification dataset. Comprising 43,692 entries annotated by 19 diverse individuals, the dataset focuses on texts targeting vulnerable groups in Indonesia, specifically during the hottest political event in the country: the presidential election. We establish baselines for seven binary classification tasks, achieving a macro-F1 score of 0.78 with a BERT model (IndoBERTweet) fine-tuned for hate speech classification. Furthermore, we demonstrate how incorporating demographic information can enhance the zero-shot performance of the large language model, gpt-3.5-turbo. However, we also caution that an overemphasis on demographic information can negatively impact the fine-tuned model performance due to data fragmentation.
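The macro-F1 metric reported in the abstract averages the per-class F1 scores with equal weight, so the rarer class counts as much as the majority class. The following is a minimal sketch of that computation for a single binary task. The function names (`f1_for_class`, `macro_f1`) and the toy labels are illustrative assumptions, not taken from IndoToxic2024, and the authors' exact evaluation setup may differ.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                 positive: Hashable) -> float:
    """F1 score when `positive` is treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0 when the class never appears
    # in either the gold labels or the predictions.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy, imbalanced example (hypothetical labels): 1 = hate speech, 0 = not.
y_true = [1, 0, 0, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.3f} macro-F1={score:.3f}")
```

On imbalanced labels such as these, accuracy is dominated by the majority class, whereas macro-F1 is pulled down by weak performance on the minority class. This is one reason it is a common choice for hate speech benchmarks.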
Related papers
- Hate Speech Detection and Classification in Amharic Text with Deep Learning [4.834669033093363]
We develop an Amharic hate speech dataset and an SBi-LSTM deep learning model that can detect and classify text into four categories of hate speech.
We have annotated 5k Amharic social media posts and comments into four categories.
The model achieves an F1-score of 94.8.
arXiv Detail & Related papers (2024-08-07T15:46:45Z)
- Understanding writing style in social media with a supervised contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 x 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z)
- Uncovering Political Hate Speech During Indian Election Campaign: A New Low-Resource Dataset and Baselines [3.3228144010758593]
IEHate dataset contains 11,457 manually annotated Hindi tweets related to the Indian Assembly Election Campaign from November 1, 2021, to March 9, 2022.
We benchmark the dataset using a range of machine learning, deep learning, and transformer-based algorithms.
In particular, the relatively higher score of human evaluation over algorithms emphasizes the importance of utilizing both human and automated approaches for effective hate speech moderation.
arXiv Detail & Related papers (2023-06-26T15:17:54Z)
- Overview of Abusive and Threatening Language Detection in Urdu at FIRE 2021 [50.591267188664666]
We present two shared tasks of abusive and threatening language detection for the Urdu language.
We present two manually annotated datasets containing tweets labelled as (i) Abusive and Non-Abusive, and (ii) Threatening and Non-Threatening.
For both subtasks, the m-BERT-based transformer model showed the best performance.
arXiv Detail & Related papers (2022-07-14T07:38:13Z)
- Korean Online Hate Speech Dataset for Multilabel Classification: How Can Social Science Improve Dataset on Hate Speech? [0.4893345190925178]
We suggest a multilabel Korean online hate speech dataset that covers seven categories of hate speech.
Our 35K dataset consists of 24K online comments with Krippendorff's Alpha label.
Unlike the conventional binary hate and non-hate dichotomy approach, we designed a dataset considering both the cultural and linguistic context.
arXiv Detail & Related papers (2022-04-07T07:29:06Z)
- Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments [1.1417805445492082]
We present XTREMESPEECH, a new hate speech dataset containing 20,297 social media passages from Brazil, Germany, India and Kenya.
The key novelty is that we directly involve the affected communities in collecting and annotating the data.
This inclusive approach results in datasets more representative of actually occurring online speech.
arXiv Detail & Related papers (2022-03-22T14:24:56Z)
- COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Reducing Target Group Bias in Hate Speech Detectors [56.94616390740415]
We show that text classification models trained on large publicly available datasets may significantly under-perform on several protected groups.
We propose to perform token-level hate sense disambiguation, and utilize tokens' hate sense representations for detection.
arXiv Detail & Related papers (2021-12-07T17:49:34Z)
- "Stop Asian Hate!" : Refining Detection of Anti-Asian Hate Speech During the COVID-19 Pandemic [2.5227595609842206]
The COVID-19 pandemic has fueled a surge in anti-Asian xenophobia and prejudice.
We create and annotate a corpus of Twitter tweets using 2 experimental approaches to explore anti-Asian abusive and hate speech.
arXiv Detail & Related papers (2021-12-04T06:55:19Z)
- Racism is a Virus: Anti-Asian Hate and Counterspeech in Social Media during the COVID-19 Crisis [51.39895377836919]
COVID-19 has sparked racism and hate on social media targeted towards Asian communities.
We study the evolution and spread of anti-Asian hate speech through the lens of Twitter.
We create COVID-HATE, the largest dataset of anti-Asian hate and counterspeech spanning 14 months.
arXiv Detail & Related papers (2020-05-25T21:58:09Z)