Character-level HyperNetworks for Hate Speech Detection
- URL: http://arxiv.org/abs/2111.06336v1
- Date: Thu, 11 Nov 2021 17:48:31 GMT
- Title: Character-level HyperNetworks for Hate Speech Detection
- Authors: Tomer Wullach, Amir Adler, Einat Minkov
- Abstract summary: Automated methods for hate speech detection typically employ state-of-the-art deep learning (DL)-based text classifiers.
We present HyperNetworks for hate speech detection, a special class of DL networks whose weights are regulated by a small-scale auxiliary network.
We achieve performance that is comparable or better than state-of-the-art language models, which are pre-trained and orders of magnitude larger.
- Score: 3.50640918825436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The massive spread of hate speech, hateful content targeted at specific
subpopulations, is a problem of critical social importance. Automated methods
for hate speech detection typically employ state-of-the-art deep learning
(DL)-based text classifiers-very large pre-trained neural language models of
over 100 million parameters, adapting these models to the task of hate speech
detection using relevant labeled datasets. Unfortunately, there are only
numerous labeled datasets of limited size that are available for this purpose.
We make several contributions with high potential for advancing this state of
affairs. We present HyperNetworks for hate speech detection, a special class of
DL networks whose weights are regulated by a small-scale auxiliary network.
These architectures operate at character-level, as opposed to word-level, and
are several magnitudes of order smaller compared to the popular DL classifiers.
We further show that training hate detection classifiers using large amounts of
automatically generated examples in a procedure named as it data augmentation
is beneficial in general, yet this practice especially boosts the performance
of the proposed HyperNetworks. In fact, we achieve performance that is
comparable or better than state-of-the-art language models, which are
pre-trained and orders of magnitude larger, using this approach, as evaluated
using five public hate speech datasets.
Related papers
- Hate Speech Detection in Limited Data Contexts using Synthetic Data
Generation [1.9506923346234724]
We propose a data augmentation approach that addresses the problem of lack of data for online hate speech detection in limited data contexts.
We present three methods to synthesize new examples of hate speech data in a target language that retains the hate sentiment in the original examples but transfers the hate targets.
Our findings show that a model trained on synthetic data performs comparably to, and in some cases outperforms, a model trained only on the samples available in the target domain.
arXiv Detail & Related papers (2023-10-04T15:10:06Z) - Efficient Spoken Language Recognition via Multilabel Classification [53.662747523872305]
We show that our models obtain competitive results while being orders of magnitude smaller and faster than current state-of-the-art methods.
Our multilabel strategy is more robust to unseen non-target languages compared to multiclass classification.
arXiv Detail & Related papers (2023-06-02T23:04:19Z) - SLICER: Learning universal audio representations using low-resource
self-supervised pre-training [53.06337011259031]
We present a new Self-Supervised Learning approach to pre-train encoders on unlabeled audio data.
Our primary aim is to learn audio representations that can generalize across a large variety of speech and non-speech tasks.
arXiv Detail & Related papers (2022-11-02T23:45:33Z) - Deep Learning for Hate Speech Detection: A Comparative Study [54.42226495344908]
We present here a large-scale empirical comparison of deep and shallow hate-speech detection methods.
Our goal is to illuminate progress in the area, and identify strengths and weaknesses in the current state-of-the-art.
In doing so we aim to provide guidance as to the use of hate-speech detection in practice, quantify the state-of-the-art, and identify future research directions.
arXiv Detail & Related papers (2022-02-19T03:48:20Z) - On Guiding Visual Attention with Language Specification [76.08326100891571]
We use high-level language specification as advice for constraining the classification evidence to task-relevant features, instead of distractors.
We show that supervising spatial attention in this way improves performance on classification tasks with biased and noisy data.
arXiv Detail & Related papers (2022-02-17T22:40:19Z) - Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z) - Speaker-Conditioned Hierarchical Modeling for Automated Speech Scoring [60.55025339250815]
We propose a novel deep learning technique for non-native ASS, called speaker-conditioned hierarchical modeling.
We take advantage of the fact that oral proficiency tests rate multiple responses for a candidate. In our technique, we take advantage of the fact that oral proficiency tests rate multiple responses for a candidate. We extract context from these responses and feed them as additional speaker-specific context to our network to score a particular response.
arXiv Detail & Related papers (2021-08-30T07:00:28Z) - Leveraging Multi-domain, Heterogeneous Data using Deep Multitask
Learning for Hate Speech Detection [21.410160004193916]
We propose a Convolution Neural Network based multi-task learning models (MTLs)footnotecode to leverage information from multiple sources.
Empirical analysis performed on three benchmark datasets shows the efficacy of the proposed approach.
arXiv Detail & Related papers (2021-03-23T09:31:01Z) - A study of text representations in Hate Speech Detection [0.0]
Current EU and US legislation against hateful language has led to automatic tools being a necessary component of the Hate Speech detection task and pipeline.
In this study, we examine the performance of several, diverse text representation techniques paired with multiple classification algorithms, on the automatic Hate Speech detection task.
arXiv Detail & Related papers (2021-02-08T20:39:17Z) - Towards Hate Speech Detection at Large via Deep Generative Modeling [4.080068044420974]
Hate speech detection is a critical problem in social media platforms.
We present a dataset of 1 million realistic hate and non-hate sequences, produced by a deep generative language model.
We demonstrate consistent and significant performance improvements across five public hate speech datasets.
arXiv Detail & Related papers (2020-05-13T15:25:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.