Related papers: An Information Retrieval Approach to Building Datasets for Hate Speech Detection

An Information Retrieval Approach to Building Datasets for Hate Speech Detection

URL: http://arxiv.org/abs/2106.09775v2
Date: Mon, 21 Jun 2021 00:45:55 GMT
Title: An Information Retrieval Approach to Building Datasets for Hate Speech Detection
Authors: Md Mustafizur Rahman, Dinesh Balakrishnan, Dhiraj Murthy, Mucahid Kutlu, Matthew Lease
Abstract summary: A common practice is to only annotate tweets containing known hate words'' A second challenge is that definitions of hate speech tend to be highly variable and subjective. Our key insight is that the rarity and subjectivity of hate speech are akin to that of relevance in information retrieval (IR)
Score: 3.587367153279349
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Building a benchmark dataset for hate speech detection presents several challenges. Firstly, because hate speech is relatively rare -- e.g., less than 3\% of Twitter posts are hateful \citep{founta2018large} -- random sampling of tweets to annotate is inefficient in capturing hate speech. A common practice is to only annotate tweets containing known ``hate words'', but this risks yielding a biased benchmark that only partially captures the real-world phenomenon of interest. A second challenge is that definitions of hate speech tend to be highly variable and subjective. Annotators having diverse prior notions of hate speech may not only disagree with one another but also struggle to conform to specified labeling guidelines. Our key insight is that the rarity and subjectivity of hate speech are akin to that of relevance in information retrieval (IR). This connection suggests that well-established methodologies for creating IR test collections might also be usefully applied to create better benchmark datasets for hate speech detection. Firstly, to intelligently and efficiently select which tweets to annotate, we apply established IR techniques of {\em pooling} and {\em active learning}. Secondly, to improve both consistency and value of annotations, we apply {\em task decomposition} \cite{Zhang-sigir14} and {\em annotator rationale} \cite{mcdonnell16-hcomp} techniques. Using the above techniques, we create and share a new benchmark dataset\footnote{We will release the dataset upon publication.} for hate speech detection with broader coverage than prior datasets. We also show a dramatic drop in accuracy of existing detection models when tested on these broader forms of hate. Collected annotator rationales not only provide documented support for labeling decisions but also create exciting future work opportunities for dual-supervision and/or explanation generation in modeling.

Related papers

Towards Generalizable Generic Harmful Speech Datasets for Implicit Hate Speech Detection [7.762212551172391]
Implicit hate speech has emerged as a critical challenge for social media platforms.<n>We propose an approach to address the detection of implicit hate speech and enhance generalizability across diverse datasets.
arXiv Detail & Related papers (2025-06-19T17:23:08Z)
Dealing with Annotator Disagreement in Hate Speech Classification [0.0]
This paper examines strategies for addressing annotator disagreement, an issue that has been largely overlooked. We evaluate different approaches to deal with annotator disagreement regarding hate speech classification in Turkish tweets, based on a fine-tuned BERT model. Our work highlights the importance of the problem and provides state-of-art benchmark results for detection and understanding of hate speech in online discourse.
arXiv Detail & Related papers (2025-02-12T10:19:50Z)
Detecting Document-level Paraphrased Machine Generated Content: Mimicking Human Writing Style and Involving Discourse Features [57.34477506004105]
Machine-generated content poses challenges such as academic plagiarism and the spread of misinformation. We introduce novel methodologies and datasets to overcome these challenges. We propose MhBART, an encoder-decoder model designed to emulate human writing style. We also propose DTransformer, a model that integrates discourse analysis through PDTB preprocessing to encode structural features.
arXiv Detail & Related papers (2024-12-17T08:47:41Z)
Hierarchical Sentiment Analysis Framework for Hate Speech Detection: Implementing Binary and Multiclass Classification Strategy [0.0]
We propose a new multitask model integrated with shared emotional representations to detect hate speech across the English language. We conclude that utilizing sentiment analysis and a Transformer-based trained model considerably improves hate speech detection across multiple datasets.
arXiv Detail & Related papers (2024-11-03T04:11:33Z)
HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning [29.519687405350304]
We introduce a hate speech detection framework, HARE, which harnesses the reasoning capabilities of large language models (LLMs) to fill gaps in explanations of hate speech. Experiments on SBIC and Implicit Hate benchmarks show that our method, using model-generated data, consistently outperforms baselines. Our method enhances the explanation quality of trained models and improves generalization to unseen datasets.
arXiv Detail & Related papers (2023-11-01T06:09:54Z)
On the Challenges of Building Datasets for Hate Speech Detection [0.0]
We first analyze the issues surrounding hate speech detection through a data-centric lens. We then outline a holistic framework to encapsulate the data creation pipeline across seven broad dimensions.
arXiv Detail & Related papers (2023-09-06T11:15:47Z)
CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations. We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
ToKen: Task Decomposition and Knowledge Infusion for Few-Shot Hate Speech Detection [85.68684067031909]
We frame this problem as a few-shot learning task, and show significant gains with decomposing the task into its "constituent" parts. In addition, we see that infusing knowledge from reasoning datasets (e.g. Atomic 2020) improves the performance even further.
arXiv Detail & Related papers (2022-05-25T05:10:08Z)
Anti-Asian Hate Speech Detection via Data Augmented Semantic Relation Inference [4.885207279350052]
We propose a novel approach to leverage sentiment hashtags to enhance hate speech detection in a natural language inference framework. We design a novel framework SRIC that simultaneously performs two tasks: (1) semantic relation inference between online posts and sentiment hashtags, and (2) sentiment classification on these posts.
arXiv Detail & Related papers (2022-04-14T15:03:35Z)
Deep Learning for Hate Speech Detection: A Comparative Study [54.42226495344908]
We present here a large-scale empirical comparison of deep and shallow hate-speech detection methods. Our goal is to illuminate progress in the area, and identify strengths and weaknesses in the current state-of-the-art. In doing so we aim to provide guidance as to the use of hate-speech detection in practice, quantify the state-of-the-art, and identify future research directions.
arXiv Detail & Related papers (2022-02-19T03:48:20Z)
Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages. We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language. We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
Detection of Hate Speech using BERT and Hate Speech Word Embedding with Deep Model [0.5801044612920815]
This paper investigates the feasibility of leveraging domain-specific word embedding in Bidirectional LSTM based deep model to automatically detect/classify hate speech. The experiments showed that domainspecific word embedding with the Bidirectional LSTM based deep model achieved a 93% f1-score while BERT achieved up to 96% f1-score on a combined balanced dataset from available hate speech datasets.
arXiv Detail & Related papers (2021-11-02T11:42:54Z)
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation [50.55448707570669]
We propose a novel token-level, reference-free hallucination detection task and an associated annotated dataset named HaDes. To create this dataset, we first perturb a large number of text segments extracted from English language Wikipedia, and then verify these with crowd-sourced annotations.
arXiv Detail & Related papers (2021-04-18T04:09:48Z)
Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application [63.10266319378212]
We propose a method for measuring complex variables on a continuous, interval spectrum by combining supervised deep learning with the Constructing Measures approach to faceted Rasch item response theory (IRT) We demonstrate this new method on a dataset of 50,000 social media comments sourced from YouTube, Twitter, and Reddit and labeled by 11,000 U.S.-based Amazon Mechanical Turk workers.
arXiv Detail & Related papers (2020-09-22T02:15:05Z)
Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions [87.33156149634392]
We critically examine RefCOg, a standard benchmark for visual referring expression recognition. We show that 83.7% of test instances do not require reasoning on linguistic structure. We propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT.
arXiv Detail & Related papers (2020-05-04T17:09:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.