A Group-Specific Approach to NLP for Hate Speech Detection
- URL: http://arxiv.org/abs/2304.11223v1
- Date: Fri, 21 Apr 2023 19:08:49 GMT
- Title: A Group-Specific Approach to NLP for Hate Speech Detection
- Authors: Karina Halevy
- Abstract summary: We propose a group-specific approach to NLP for online hate speech detection.
We analyze historical data about discrimination against a protected group to better predict spikes in hate speech against that group.
We demonstrate this approach through a case study on NLP for detection of antisemitic hate speech.
- Score: 2.538209532048867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic hate speech detection is an important yet complex task, requiring
knowledge of common sense, stereotypes of protected groups, and histories of
discrimination, each of which may constantly evolve. In this paper, we propose
a group-specific approach to NLP for online hate speech detection. The approach
consists of creating and infusing historical and linguistic knowledge about a
particular protected group into hate speech detection models, analyzing
historical data about discrimination against a protected group to better
predict spikes in hate speech against that group, and critically evaluating
hate speech detection models through lenses of intersectionality and ethics. We
demonstrate this approach through a case study on NLP for detection of
antisemitic hate speech. The case study synthesizes the current
English-language literature on NLP for antisemitism detection, introduces a
novel knowledge graph of antisemitic history and language from the 20th century
to the present, infuses information from the knowledge graph into a set of
tweets over Logistic Regression and uncased DistilBERT baselines, and suggests
that incorporating context from the knowledge graph can help models pick up
subtle stereotypes.
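
To make the knowledge-infusion step more concrete, here is a minimal sketch, assuming a simple text-concatenation strategy and a hypothetical lookup_kg_context helper (the paper's actual infusion method and knowledge-graph interface may differ), of how retrieved knowledge-graph context could be paired with a tweet before classification with uncased DistilBERT:

    # Minimal sketch: pair knowledge-graph context with each tweet and classify
    # with uncased DistilBERT. lookup_kg_context() is a hypothetical helper that
    # stands in for retrieval from the antisemitism knowledge graph.
    import torch
    from transformers import DistilBertTokenizerFast, DistilBertForSequenceClassification

    tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
    model = DistilBertForSequenceClassification.from_pretrained(
        "distilbert-base-uncased", num_labels=2)  # 0 = not hateful, 1 = hateful

    def lookup_kg_context(tweet: str) -> str:
        # Placeholder: match tweet tokens against knowledge-graph entities and
        # return their descriptions (e.g., the history of a coded term).
        return "context: 'globalist' has been used as a coded antisemitic term."

    def classify(tweet: str) -> int:
        # Infuse the retrieved context by encoding it together with the tweet
        # as a sentence pair, so the classifier sees both.
        inputs = tokenizer(lookup_kg_context(tweet), tweet,
                           truncation=True, return_tensors="pt")
        with torch.no_grad():
            logits = model(**inputs).logits
        return int(logits.argmax(dim=-1))

Before fine-tuning on labeled tweets the prediction head is randomly initialized, so the sketch only shows where the knowledge-graph context enters the input; a Logistic Regression baseline would receive the same concatenated text as bag-of-words or TF-IDF features.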
Related papers
- An Investigation of Large Language Models for Real-World Hate Speech Detection [46.15140831710683]
A major limitation of existing methods is that they do not account for context, even though hate speech detection is a highly contextual problem.
Recently, large language models (LLMs) have demonstrated state-of-the-art performance in several natural language tasks.
Our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech.
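
As a rough illustration (the cited paper's actual prompts are not reproduced here, and the wording below is an assumption), a reasoning-style prompt for LLM-based hate speech detection could be structured as follows:

    # Illustrative reasoning-style prompt builder; send the returned string to an
    # LLM of your choice. The step structure and wording are assumptions, not the
    # exact prompt used in the cited paper.
    def build_reasoning_prompt(message: str, target_group: str = "a protected group") -> str:
        return (
            "You are a content-moderation assistant.\n"
            f'Message: "{message}"\n'
            "Step 1: Identify who, if anyone, the message targets.\n"
            f"Step 2: Note any slurs, stereotypes, or coded language about {target_group}.\n"
            "Step 3: Consider the surrounding context and likely intent.\n"
            "Step 4: Answer with exactly one label, HATE or NOT_HATE, then a one-sentence justification."
        )

    print(build_reasoning_prompt("Go back to where you came from."))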
arXiv Detail & Related papers (2024-01-07T00:39:33Z)
- On the Challenges of Building Datasets for Hate Speech Detection [0.0]
We first analyze the issues surrounding hate speech detection through a data-centric lens.
We then outline a holistic framework to encapsulate the data creation pipeline across seven broad dimensions.
arXiv Detail & Related papers (2023-09-06T11:15:47Z)
- Hate Speech Detection via Dual Contrastive Learning [25.878271501274245]
We propose a novel dual contrastive learning framework for hate speech detection.
Our framework jointly optimizes the self-supervised and the supervised contrastive learning losses to capture span-level information.
We conduct experiments on two publicly available English datasets, and experimental results show that the proposed model outperforms the state-of-the-art models.
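
As a minimal sketch of what jointly optimizing the two objectives can look like, assuming NT-Xent-style losses over L2-normalized sentence embeddings and a hypothetical weighting factor lam (the cited framework is more elaborate):

    # Sketch: combine a self-supervised contrastive loss over two augmented views
    # with a supervised contrastive loss over labels. Not the cited paper's exact
    # formulation; lam is a hypothetical weighting hyperparameter.
    import torch
    import torch.nn.functional as F

    def self_supervised_contrastive(z1, z2, temp=0.1):
        # Positives are the matching rows of the two views (diagonal targets).
        logits = z1 @ z2.T / temp
        targets = torch.arange(z1.size(0), device=z1.device)
        return F.cross_entropy(logits, targets)

    def supervised_contrastive(z, labels, temp=0.1):
        # Pull together examples that share a label, push apart the rest.
        sim = z @ z.T / temp
        pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)).float().fill_diagonal_(0)
        logits_mask = torch.ones_like(sim).fill_diagonal_(0)
        log_prob = sim - torch.log((logits_mask * sim.exp()).sum(1, keepdim=True))
        return -(pos_mask * log_prob).sum(1).div(pos_mask.sum(1).clamp(min=1)).mean()

    def dual_contrastive_loss(z1, z2, labels, lam=0.5):
        return self_supervised_contrastive(z1, z2) + lam * supervised_contrastive(z1, labels)

    # Toy usage with random normalized embeddings and binary hate/non-hate labels.
    z1 = F.normalize(torch.randn(8, 128), dim=1)
    z2 = F.normalize(torch.randn(8, 128), dim=1)
    labels = torch.randint(0, 2, (8,))
    print(dual_contrastive_loss(z1, z2, labels))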
arXiv Detail & Related papers (2023-07-10T13:23:36Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Assessing the impact of contextual information in hate speech detection [0.48369513656026514]
We provide a novel corpus for contextualized hate speech detection based on user responses to news posts from media outlets on Twitter.
This corpus was collected in the Rioplatense dialectal variety of Spanish and focuses on hate speech associated with the COVID-19 pandemic.
arXiv Detail & Related papers (2022-10-02T09:04:47Z)
- Deep Learning for Hate Speech Detection: A Comparative Study [54.42226495344908]
We present here a large-scale empirical comparison of deep and shallow hate-speech detection methods.
Our goal is to illuminate progress in the area, and identify strengths and weaknesses in the current state-of-the-art.
In doing so we aim to provide guidance as to the use of hate-speech detection in practice, quantify the state-of-the-art, and identify future research directions.
arXiv Detail & Related papers (2022-02-19T03:48:20Z)
- COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study the output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
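
The label-imbalance issue is often addressed with inverse-frequency class weights in the loss; a minimal sketch of that common remedy (not necessarily the cited paper's exact approach) is:

    # Sketch: weight cross-entropy by inverse class frequency so the rare "hate"
    # class is not drowned out by non-hate examples. A common remedy, not
    # necessarily the cited paper's exact method.
    import torch
    import torch.nn as nn

    labels = torch.tensor([0] * 900 + [1] * 100)       # 0 = non-hate, 1 = hate
    counts = torch.bincount(labels).float()             # [900., 100.]
    weights = counts.sum() / (len(counts) * counts)     # higher weight for the rare class

    criterion = nn.CrossEntropyLoss(weight=weights)
    logits = torch.randn(len(labels), 2)                 # stand-in for model outputs
    print(criterion(logits, labels))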
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Characterizing the adversarial vulnerability of speech self-supervised learning [95.03389072594243]
We make the first attempt to investigate the adversarial vulnerability of such a paradigm under attacks from both zero-knowledge and limited-knowledge adversaries.
The experimental results illustrate that the paradigm proposed by SUPERB is seriously vulnerable to limited-knowledge adversaries.
arXiv Detail & Related papers (2021-11-08T08:44:04Z) - Latent Hatred: A Benchmark for Understanding Implicit Hate Speech [22.420275418616242]
This work introduces a theoretically-justified taxonomy of implicit hate speech and a benchmark corpus with fine-grained labels for each message.
We present systematic analyses of our dataset using contemporary baselines to detect and explain implicit hate speech.
arXiv Detail & Related papers (2021-09-11T16:52:56Z) - Towards generalisable hate speech detection: a review on obstacles and
solutions [6.531659195805749]
This survey paper attempts to summarise how generalisable existing hate speech detection models are.
It sums up existing attempts at addressing the main obstacles, and then proposes directions of future research to improve generalisation in hate speech detection.
arXiv Detail & Related papers (2021-02-17T17:27:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.