Related papers: Towards Legally Enforceable Hate Speech Detection for Public Forums

Towards Legally Enforceable Hate Speech Detection for Public Forums

URL: http://arxiv.org/abs/2305.13677v2
Date: Wed, 1 Nov 2023 18:30:21 GMT
Title: Towards Legally Enforceable Hate Speech Detection for Public Forums
Authors: Chu Fei Luo, Rohan Bhambhoria, Xiaodan Zhu, Samuel Dahan
Abstract summary: This research introduces a new perspective and task for enforceable hate speech detection. We use a dataset annotated on violations of eleven possible definitions by legal experts. Given the challenge of identifying clear, legally enforceable instances of hate speech, we augment the dataset with expert-generated samples and an automatically mined challenge set.
Score: 29.225955299645978
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hate speech causes widespread and deep-seated societal issues. Proper enforcement of hate speech laws is key for protecting groups of people against harmful and discriminatory language. However, determining what constitutes hate speech is a complex task that is highly open to subjective interpretations. Existing works do not align their systems with enforceable definitions of hate speech, which can make their outputs inconsistent with the goals of regulators. This research introduces a new perspective and task for enforceable hate speech detection centred around legal definitions, and a dataset annotated on violations of eleven possible definitions by legal experts. Given the challenge of identifying clear, legally enforceable instances of hate speech, we augment the dataset with expert-generated samples and an automatically mined challenge set. We experiment with grounding the model decision in these definitions using zero-shot and few-shot prompting. We then report results on several large language models (LLMs). With this task definition, automatic hate speech detection can be more closely aligned to enforceable laws, and hence assist in more rigorous enforcement of legal protections against harmful speech in public forums.

Related papers

HatePRISM: Policies, Platforms, and Research Integration. Advancing NLP for Hate Speech Proactive Mitigation [67.69631485036665]
We conduct a comprehensive examination of hate speech regulations and strategies from three perspectives.<n>Our findings reveal significant inconsistencies in hate speech definitions and moderation practices across jurisdictions.<n>We suggest ideas and research direction for further exploration of a unified framework for automated hate speech moderation.
arXiv Detail & Related papers (2025-07-06T11:25:23Z)
Hate Speech According to the Law: An Analysis for Effective Detection [6.802357986439992]
This study highlights the importance of amplifying research on prosecutable hate speech. It provides insights into effective strategies for combating hate speech within the parameters of legal frameworks.
arXiv Detail & Related papers (2024-12-09T01:58:10Z)
A Hate Speech Moderated Chat Application: Use Case for GDPR and DSA Compliance [0.0]
This research presents a novel application capable of implementing legal and ethical reasoning into the content moderation process. Two use cases fundamental to online communication are presented and implemented using technologies such as GPT-3.5, Solid Pods, and the rule language Prova. The work proposes a novel approach to reason within different legal and ethical definitions of hate speech and plan the fitting counter hate speech.
arXiv Detail & Related papers (2024-10-10T08:28:38Z)
Demarked: A Strategy for Enhanced Abusive Speech Moderation through Counterspeech, Detoxification, and Message Management [71.99446449877038]
We propose a more comprehensive approach called Demarcation scoring abusive speech based on four aspect -- (i) severity scale; (ii) presence of a target; (iii) context scale; (iv) legal scale. Our work aims to inform future strategies for effectively addressing abusive speech online.
arXiv Detail & Related papers (2024-06-27T21:45:33Z)
An Investigation of Large Language Models for Real-World Hate Speech Detection [46.15140831710683]
A major limitation of existing methods is that hate speech detection is a highly contextual problem. Recently, large language models (LLMs) have demonstrated state-of-the-art performance in several natural language tasks. Our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech.
arXiv Detail & Related papers (2024-01-07T00:39:33Z)
Hate Speech Detection via Dual Contrastive Learning [25.878271501274245]
We propose a novel dual contrastive learning framework for hate speech detection. Our framework jointly optimize the self-supervised and the supervised contrastive learning loss for capturing span-level information. We conduct experiments on two publicly available English datasets, and experimental results show that the proposed model outperforms the state-of-the-art models.
arXiv Detail & Related papers (2023-07-10T13:23:36Z)
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations. We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
Leveraging World Knowledge in Implicit Hate Speech Detection [5.5536024561229205]
We show that real world knowledge about entity mentions in a text does help models better detect hate speech. We also discuss cases where real world knowledge does not add value to hate speech detection.
arXiv Detail & Related papers (2022-12-28T21:23:55Z)
Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to binary notion of inapropriateness and a multinomial notion of sensitive topic. To objectivise the notion of inappropriateness, we define it in a data-driven way though crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages. We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language. We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
Latent Hatred: A Benchmark for Understanding Implicit Hate Speech [22.420275418616242]
This work introduces a theoretically-justified taxonomy of implicit hate speech and a benchmark corpus with fine-grained labels for each message. We present systematic analyses of our dataset using contemporary baselines to detect and explain implicit hate speech.
arXiv Detail & Related papers (2021-09-11T16:52:56Z)
Unsupervised Domain Adaptation for Hate Speech Detection Using a Data Augmentation Approach [6.497816402045099]
We propose an unsupervised domain adaptation approach to augment labeled data for hate speech detection. We show our approach improves Area under the Precision/Recall curve by as much as 42% and recall by as much as 278%.
arXiv Detail & Related papers (2021-07-27T15:01:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.