Hate Speech Criteria: A Modular Approach to Task-Specific Hate Speech
Definitions
- URL: http://arxiv.org/abs/2206.15455v1
- Date: Thu, 30 Jun 2022 17:50:16 GMT
- Title: Hate Speech Criteria: A Modular Approach to Task-Specific Hate Speech
Definitions
- Authors: Urja Khurana, Ivar Vermeulen, Eric Nalisnick, Marloes van Noorloos and
Antske Fokkens
- Abstract summary: We present \textit{hate speech} criteria, developed with perspectives from law and social science.
We argue that the goal and exact task developers have in mind should determine how the scope of \textit{hate speech} is defined.
- Score: 1.3274508420845537
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: \textbf{Offensive Content Warning}: This paper contains offensive language
only for providing examples that clarify this research and do not reflect the
authors' opinions. Please be aware that these examples are offensive and may
cause you distress.
The subjectivity of recognizing \textit{hate speech} makes it a complex task.
This is also reflected by different and incomplete definitions in NLP. We
present \textit{hate speech} criteria, developed with perspectives from law and
social science, with the aim of helping researchers create more precise
definitions and annotation guidelines on five aspects: (1) target groups, (2)
dominance, (3) perpetrator characteristics, (4) type of negative group
reference, and the (5) type of potential consequences/effects. Definitions can
be structured so that they cover a more broad or more narrow phenomenon. As
such, conscious choices can be made on specifying criteria or leaving them
open. We argue that the goal and exact task developers have in mind should
determine how the scope of \textit{hate speech} is defined. We provide an
overview of the properties of English datasets from \url{hatespeechdata.com}
that may help select the most suitable dataset for a specific scenario.
Related papers
- Subjective $\textit{Isms}$? On the Danger of Conflating Hate and Offence
in Abusive Language Detection [5.351398116822836]
We argue that the conflation of hate and offence can invalidate findings on hate speech.
We call for future work to be situated in theory, disentangling hate from its orthogonal concept, offence.
arXiv Detail & Related papers (2024-03-04T17:56:28Z)
- Towards Legally Enforceable Hate Speech Detection for Public Forums [29.225955299645978]
This research introduces a new perspective and task for enforceable hate speech detection.
We use a dataset annotated on violations of eleven possible definitions by legal experts.
Given the challenge of identifying clear, legally enforceable instances of hate speech, we augment the dataset with expert-generated samples and an automatically mined challenge set.
arXiv Detail & Related papers (2023-05-23T04:34:41Z)
- PropSegmEnt: A Large-Scale Corpus for Proposition-Level Segmentation and Entailment Recognition [63.51569687229681]
We argue for the need to recognize the textual entailment relation of each proposition in a sentence individually.
We propose PropSegmEnt, a corpus of over 45K propositions annotated by expert human raters.
Our dataset structure resembles the tasks of (1) segmenting sentences within a document into a set of propositions, and (2) classifying the entailment relation of each proposition with respect to a different yet topically-aligned document.
arXiv Detail & Related papers (2022-12-21T04:03:33Z)
- Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topic.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Fine-Grained Opinion Summarization with Minimal Supervision [48.43506393052212]
FineSum aims to profile a target by extracting opinions from multiple documents.
FineSum automatically identifies opinion phrases from the raw corpus, classifies them into different aspects and sentiments, and constructs multiple fine-grained opinion clusters under each aspect/sentiment.
Both automatic evaluation on the benchmark and quantitative human evaluation validate the effectiveness of our approach.
arXiv Detail & Related papers (2021-10-17T15:16:34Z)
- Latent Hatred: A Benchmark for Understanding Implicit Hate Speech [22.420275418616242]
This work introduces a theoretically-justified taxonomy of implicit hate speech and a benchmark corpus with fine-grained labels for each message.
We present systematic analyses of our dataset using contemporary baselines to detect and explain implicit hate speech.
arXiv Detail & Related papers (2021-09-11T16:52:56Z)
- An Information Retrieval Approach to Building Datasets for Hate Speech Detection [3.587367153279349]
A common practice is to only annotate tweets containing known "hate words".
A second challenge is that definitions of hate speech tend to be highly variable and subjective.
Our key insight is that the rarity and subjectivity of hate speech are akin to that of relevance in information retrieval (IR).
arXiv Detail & Related papers (2021-06-17T19:25:39Z)
- Identifying Offensive Expressions of Opinion in Context [0.0]
It is still a challenge for subjective information extraction systems to identify opinions and feelings in context.
In sentiment-based NLP tasks, there are few resources for information extraction, above all for offensive or hateful opinions in context.
This paper provides a new cross-lingual and contextual offensive lexicon, which consists of explicit and implicit offensive and swearing expressions of opinion.
arXiv Detail & Related papers (2021-04-25T18:35:39Z)
- Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions [87.33156149634392]
We critically examine RefCOCOg, a standard benchmark for visual referring expression recognition.
We show that 83.7% of test instances do not require reasoning on linguistic structure.
We propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT.
arXiv Detail & Related papers (2020-05-04T17:09:15Z)
- A Deep Neural Framework for Contextual Affect Detection [51.378225388679425]
A short and simple text carrying no emotion can represent some strong emotions when read along with its context.
We propose a Contextual Affect Detection framework which learns the inter-dependence of words in a sentence.
arXiv Detail & Related papers (2020-01-28T05:03:15Z)