DeL-haTE: A Deep Learning Tunable Ensemble for Hate Speech Detection
- URL: http://arxiv.org/abs/2011.01861v1
- Date: Tue, 3 Nov 2020 17:32:50 GMT
- Title: DeL-haTE: A Deep Learning Tunable Ensemble for Hate Speech Detection
- Authors: Joshua Melton, Arunkumar Bagavathi, Siddharth Krishnan
- Abstract summary: Online hate speech on social media has become a fast-growing problem in recent times.
Three key challenges in automated detection and classification of hateful content are the lack of clearly labeled data, evolving vocabulary and lexicon, and the lack of baseline models for fringe outlets such as Gab.
In this work, we propose a novel framework with three major contributions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online hate speech on social media has become a fast-growing problem in
recent times. Nefarious groups have developed large content delivery networks
across several mainstream (Twitter and Facebook) and fringe (Gab, 4chan,
8chan, etc.) outlets to deliver cascades of hate messages directed both at
individuals and communities. Thus addressing these issues has become a top
priority for large-scale social media outlets. Three key challenges in
automated detection and classification of hateful content are the lack of
clearly labeled data, an evolving vocabulary and lexicon (hashtags, emojis,
etc.), and the lack of baseline models for fringe outlets such as Gab. In this work,
we propose a novel framework with three major contributions. (a) We engineer an
ensemble of deep learning models that combines the strengths of
state-of-the-art approaches, (b) we incorporate a tuning factor into this
framework that leverages transfer learning to conduct automated hate speech
classification on unlabeled datasets, like Gab, and (c) we develop a weak
supervised learning methodology that allows our framework to train on unlabeled
data. Our ensemble models achieve an 83% hate recall on the HON dataset,
surpassing the performance of the state-of-the-art deep models. We demonstrate
that weak supervised training in combination with classifier tuning
significantly increases model performance on unlabeled data from Gab, achieving
a hate recall of 67%.
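As a rough illustration of contributions (a) and (b), soft-voting over ensemble members plus a per-class tuning factor that can be adjusted for an unlabeled target domain such as Gab, consider this minimal sketch. The function names and probabilities are illustrative, not taken from the DeL-haTE implementation:

```python
import numpy as np

def ensemble_predict(member_probs, tuning=None):
    """member_probs: list of (n_samples, n_classes) probability arrays."""
    avg = np.mean(member_probs, axis=0)   # soft voting: average member probabilities
    if tuning is not None:
        avg = avg * tuning                # per-class tuning factor for a new domain
    return avg.argmax(axis=1)

# Three mock members scoring 2 samples over 3 classes: (hate, offensive, neither).
p1 = np.array([[0.50, 0.30, 0.20], [0.35, 0.45, 0.20]])
p2 = np.array([[0.40, 0.40, 0.20], [0.40, 0.40, 0.20]])
p3 = np.array([[0.60, 0.20, 0.20], [0.30, 0.50, 0.20]])

labels = ensemble_predict([p1, p2, p3])                                      # [0, 1]
# Boosting the hate class (index 0) trades precision for hate recall:
boosted = ensemble_predict([p1, p2, p3], tuning=np.array([1.5, 1.0, 1.0]))   # [0, 0]
```

The tuning vector plays the role of a knob that can be calibrated per target platform without retraining the ensemble members.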
Related papers
- A Target-Aware Analysis of Data Augmentation for Hate Speech Detection [3.858155067958448]
Hate speech is one of the main threats posed by the widespread use of social networks.
We investigate the possibility of augmenting existing data with generative language models, reducing target imbalance.
For some hate categories such as origin, religion, and disability, hate speech classification using augmented data for training improves by more than 10% F1 over the no augmentation baseline.
arXiv Detail & Related papers (2024-10-10T15:46:27Z)
- MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection [2.433983268807517]
Hate speech poses significant social, psychological, and occasionally physical threats to targeted individuals and communities.
Current computational linguistic approaches for tackling this phenomenon rely on labelled social media datasets for training.
We scrutinized over 60 datasets, selectively integrating those pertinent into MetaHate.
Our findings contribute to a deeper understanding of the existing datasets, paving the way for training more robust and adaptable models.
arXiv Detail & Related papers (2024-01-12T11:54:53Z)
- Into the LAIONs Den: Investigating Hate in Multimodal Datasets [67.21783778038645]
This paper investigates the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B.
We found that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively.
We also found that filtering dataset contents based on Not Safe For Work (NSFW) values calculated based on images alone does not exclude all the harmful content in alt-text.
arXiv Detail & Related papers (2023-11-06T19:00:05Z)
- Understanding writing style in social media with a supervised contrastively pre-trained transformer [57.48690310135374]
Online Social Networks serve as fertile ground for harmful behavior, ranging from hate speech to the dissemination of disinformation.
We introduce the Style Transformer for Authorship Representations (STAR), trained on a large corpus derived from public sources of 4.5 × 10^6 authored texts.
Using a support base of 8 documents of 512 tokens, we can discern authors from sets of up to 1616 authors with at least 80% accuracy.
arXiv Detail & Related papers (2023-10-17T09:01:17Z)
- Meta-Learning Online Adaptation of Language Models [88.8947656843812]
Large language models encode impressively broad world knowledge in their parameters.
However, the knowledge in static language models falls out of date, limiting the model's effective "shelf life".
arXiv Detail & Related papers (2023-05-24T11:56:20Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Revisiting Self-Training for Few-Shot Learning of Language Model [61.173976954360334]
Unlabeled data carry rich task-relevant information and have proven useful for few-shot learning of language models.
In this work, we revisit the self-training technique for language model fine-tuning and present a state-of-the-art prompt-based few-shot learner, SFLM.
arXiv Detail & Related papers (2021-10-04T08:51:36Z)
- Identifying and Categorizing Offensive Language in Social Media [0.0]
This study provides a description of a classification system built for SemEval 2019 Task 6: OffensEval.
We trained machine learning and deep learning models, combined with data preprocessing and sampling techniques, to achieve the best results.
arXiv Detail & Related papers (2021-04-10T22:53:43Z)
- Leveraging Multi-domain, Heterogeneous Data using Deep Multitask Learning for Hate Speech Detection [21.410160004193916]
We propose Convolutional Neural Network based multi-task learning models (MTLs) to leverage information from multiple sources.
Empirical analysis performed on three benchmark datasets shows the efficacy of the proposed approach.
arXiv Detail & Related papers (2021-03-23T09:31:01Z)
- Leveraging cross-platform data to improve automated hate speech detection [0.0]
Most existing approaches for hate speech detection focus on a single social media platform in isolation.
Here we propose a new cross-platform approach to detect hate speech which leverages multiple datasets and classification models from different platforms.
We demonstrate how this approach outperforms existing models, and achieves good performance when tested on messages from novel social media platforms.
arXiv Detail & Related papers (2021-02-09T15:52:34Z)
- Automatically Discovering and Learning New Visual Categories with Ranking Statistics [145.89790963544314]
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
arXiv Detail & Related papers (2020-02-13T18:53:32Z)
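Several entries above (the weak supervision in DeL-haTE, the self-training in SFLM) share a common pseudo-labeling pattern: a model's confident predictions on unlabeled data are recycled as training labels. The following is a minimal, hypothetical sketch of that pattern, with a toy function standing in for a real classifier:

```python
def self_train(labeled, unlabeled, predict_proba, threshold=0.9):
    """Add confidently pseudo-labeled examples to the training pool.

    predict_proba(x) -> (label, confidence); a toy stand-in is used below.
    """
    expanded = list(labeled)
    for x in unlabeled:
        label, conf = predict_proba(x)
        if conf >= threshold:             # keep only high-confidence pseudo-labels
            expanded.append((x, label))
    return expanded

# Toy classifier: flags texts containing a trigger token with high confidence.
def toy_model(text):
    return ("hate", 0.95) if "trigger" in text else ("non-hate", 0.60)

labeled = [("seed trigger example", "hate")]
pool = ["new trigger post", "benign message"]
expanded = self_train(labeled, pool, toy_model)
# Only the confident example is added; the low-confidence message is skipped.
```

In practice the loop runs for several rounds, retraining the classifier on the expanded pool between rounds; the confidence threshold controls how much label noise is admitted.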
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.