"It's Not Just Hate": A Multi-Dimensional Perspective on Detecting
Harmful Speech Online
- URL: http://arxiv.org/abs/2210.15870v1
- Date: Fri, 28 Oct 2022 03:34:50 GMT
- Title: "It's Not Just Hate": A Multi-Dimensional Perspective on Detecting
Harmful Speech Online
- Authors: Federico Bianchi, Stefanie Anja Hills, Patricia Rossini, Dirk Hovy,
Rebekah Tromble, Nava Tintarev
- Abstract summary: We show that a more fine-grained multi-label approach to predicting incivility and hateful or intolerant content addresses both conceptual and performance issues.
We release a novel dataset of over 40,000 tweets about immigration from the US and UK, annotated with six labels for different aspects of incivility and intolerance.
- Score: 26.10949184015077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Well-annotated data is a prerequisite for good Natural Language Processing
models. Too often, though, annotation decisions are governed by optimizing time
or annotator agreement. We make a case for nuanced efforts in an
interdisciplinary setting for annotating offensive online speech. Detecting
offensive content is rapidly becoming one of the most important real-world NLP
tasks. However, most datasets use a single binary label, e.g., for hate or
incivility, even though each concept is multi-faceted. This modeling choice
not only severely limits nuanced insights but also hurts performance. We show that a more
fine-grained multi-label approach to predicting incivility and hateful or
intolerant content addresses both conceptual and performance issues. We release
a novel dataset of over 40,000 tweets about immigration from the US and UK,
annotated with six labels for different aspects of incivility and intolerance.
Our dataset not only allows for a more nuanced understanding of harmful speech
online, but models trained on it also match or outperform performance on benchmark
datasets.
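The shift from a single binary label to six aspect labels can be sketched as follows. The label names below are hypothetical placeholders (the summary does not list the paper's actual six labels); this is a minimal illustration, not the authors' code:

```python
# Hypothetical aspect labels; the paper's actual six labels may differ.
LABELS = ["incivility", "intolerance", "hate", "profanity",
          "threat", "discrimination"]

def to_multilabel_vector(annotations):
    """Encode a tweet's annotated aspects as a binary vector over LABELS."""
    return [1 if label in annotations else 0 for label in LABELS]

def collapse_to_binary(vector):
    """The coarse single-label view: 'harmful' if any aspect is present."""
    return int(any(vector))

tweet_annotations = {"incivility", "profanity"}
vec = to_multilabel_vector(tweet_annotations)
print(vec)                      # [1, 0, 0, 1, 0, 0]
print(collapse_to_binary(vec))  # 1
```

Under the binary view, tweets with very different aspect profiles collapse into the same positive class, which is exactly the loss of nuance the abstract argues against.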
Related papers
- HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model
for online comments [2.162419921663162]
We propose a novel end-to-end model, HCDIR, for Hate Context Detection, and Hate Intensity Reduction in social media posts.
We fine-tuned several pre-trained language models to detect hateful comments to ascertain the best-performing hateful comments detection model.
arXiv Detail & Related papers (2023-12-20T17:05:46Z)
- Into the LAIONs Den: Investigating Hate in Multimodal Datasets [67.21783778038645]
This paper investigates the effect of scaling datasets on hateful content through a comparative audit of two datasets: LAION-400M and LAION-2B.
We found that hate content increased by nearly 12% with dataset scale, measured both qualitatively and quantitatively.
We also found that filtering dataset contents by Not Safe For Work (NSFW) values computed from images alone does not exclude all the harmful content in alt-text.
arXiv Detail & Related papers (2023-11-06T19:00:05Z)
- Revisiting Hate Speech Benchmarks: From Data Curation to System Deployment [26.504056750529124]
We present GOTHate, a large-scale code-mixed crowdsourced dataset of around 51k posts for hate speech detection from Twitter.
We benchmark it with 10 recent baselines and investigate how adding endogenous signals enhances the hate speech detection task.
Our solution HEN-mBERT is a modular, multilingual, mixture-of-experts model that enriches the linguistic subspace with latent endogenous signals.
arXiv Detail & Related papers (2023-06-01T19:36:52Z)
- How to Solve Few-Shot Abusive Content Detection Using the Data We Actually Have [58.23138483086277]
In this work we leverage datasets we already have, covering a wide range of tasks related to abusive language detection.
Our goal is to build models cheaply for a new target label set and/or language, using only a few training examples of the target domain.
Our experiments show that using already existing datasets and only a few shots of the target task improves model performance, both monolingually and across languages.
arXiv Detail & Related papers (2023-05-23T14:04:12Z)
- BERT-based Ensemble Approaches for Hate Speech Detection [1.8734449181723825]
This paper focuses on classifying hate speech in social media using multiple deep models.
We evaluated with several ensemble techniques, including soft voting, maximum value, hard voting and stacking.
Experiments showed good results, especially for the ensemble models: stacking achieved an F1 score of 97% on the Davidson dataset, and aggregating ensembles achieved 77% on the DHO dataset.
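The aggregation rules named above (soft voting, maximum value, hard voting) can be sketched over per-model probability scores. The probabilities below are invented for illustration and do not come from the paper:

```python
# Sketch of three ensemble aggregation rules over per-model P(hate) scores.
def soft_vote(probs):
    """Average the predicted probabilities, then threshold at 0.5."""
    avg = sum(probs) / len(probs)
    return int(avg >= 0.5)

def hard_vote(probs):
    """Each model casts a binary vote; strict majority wins."""
    votes = [int(p >= 0.5) for p in probs]
    return int(sum(votes) > len(votes) / 2)

def max_value(probs):
    """Trust the most confident model's probability, then threshold."""
    return int(max(probs) >= 0.5)

model_probs = [0.62, 0.48, 0.55]  # made-up scores from three BERT variants
print(soft_vote(model_probs), hard_vote(model_probs), max_value(model_probs))
# 1 1 1
```

Stacking differs from these fixed rules in that a meta-classifier is trained on the base models' outputs, which is likely why it performs best when enough training data is available.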
arXiv Detail & Related papers (2022-09-14T09:08:24Z)
- On Guiding Visual Attention with Language Specification [76.08326100891571]
We use high-level language specification as advice for constraining the classification evidence to task-relevant features, instead of distractors.
We show that supervising spatial attention in this way improves performance on classification tasks with biased and noisy data.
arXiv Detail & Related papers (2022-02-17T22:40:19Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
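One common remedy for the label imbalance described above is inverse-frequency class weighting, so that errors on the rare hate class cost more during training. The summary does not specify the paper's exact mitigation, so this is only an illustrative sketch:

```python
from collections import Counter

def class_weights(labels):
    """Weight each class inversely to its frequency, normalized so the
    weights average to 1.0 across all examples (sklearn's 'balanced' rule)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# 90 non-hate vs 10 hate examples: hate errors get 9x the weight.
labels = ["non-hate"] * 90 + ["hate"] * 10
w = class_weights(labels)
print(round(w["hate"], 2), round(w["non-hate"], 2))  # 5.0 0.56
```

These weights would typically be passed into the loss function (e.g., a weighted cross-entropy) when training the detection model.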
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Meta-Learning with Variational Semantic Memory for Word Sense Disambiguation [56.830395467247016]
We propose a model of semantic memory for WSD in a meta-learning setting.
Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork.
We show that our model advances the state of the art in few-shot WSD and supports effective learning in extremely data-scarce scenarios.
arXiv Detail & Related papers (2021-06-05T20:40:01Z)
- Cross-lingual hate speech detection based on multilingual domain-specific word embeddings [4.769747792846004]
We propose to address the problem of multilingual hate speech detection from the perspective of transfer learning.
Our goal is to determine whether knowledge from one particular language can be used to classify other languages.
We show that the use of our simple yet specific multilingual hate representations improves classification results.
arXiv Detail & Related papers (2021-04-30T02:24:50Z)
- Trawling for Trolling: A Dataset [56.1778095945542]
We present a dataset that models trolling as a subcategory of offensive content.
The dataset has 12,490 samples, split across 5 classes: Normal, Profanity, Trolling, Derogatory, and Hate Speech.
arXiv Detail & Related papers (2020-08-02T17:23:55Z)
- Transfer Learning for Hate Speech Detection in Social Media [14.759208309842178]
This paper uses a transfer learning technique to leverage two independent datasets jointly.
We build an interpretable two-dimensional visualization tool of the constructed hate speech representation -- dubbed the Map of Hate.
We show that the joint representation boosts prediction performances when only a limited amount of supervision is available.
arXiv Detail & Related papers (2019-06-10T08:00:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.