HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model
for online comments
- URL: http://arxiv.org/abs/2312.13193v1
- Date: Wed, 20 Dec 2023 17:05:46 GMT
- Title: HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model
for online comments
- Authors: Neeraj Kumar Singh, Koyel Ghosh, Joy Mahapatra, Utpal Garain,
Apurbalal Senapati
- Abstract summary: We propose a novel end-to-end model, HCDIR, for Hate Context Detection, and Hate Intensity Reduction in social media posts.
We fine-tuned several pre-trained language models to detect hateful comments to ascertain the best-performing hateful comments detection model.
- Score: 2.162419921663162
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Warning: This paper contains examples of the language that some people may
find offensive.
Detecting and reducing hateful, abusive, and offensive comments on social media
is a critical and challenging task. Moreover, few studies aim to mitigate the
intensity of hate speech. While studies have shown that context-level semantics
are crucial for detecting hateful comments, most of this research focuses on
English due to the ample datasets available. In contrast, low-resource
languages, like Indian languages, remain under-researched because of limited
datasets. In contrast to hate speech detection, hate intensity reduction remains
unexplored in both high-resource and low-resource languages. In this paper, we
propose a novel end-to-end model, HCDIR, for Hate Context Detection, and Hate
Intensity Reduction in social media posts. First, we fine-tuned several
pre-trained language models to detect hateful comments to ascertain the
best-performing hateful comments detection model. Then, we identified the
contextual hateful words. The identification of these words is justified
using a state-of-the-art explainability method, Integrated Gradients (IG).
Lastly, a Masked Language Modeling (MLM) model is employed to capture
domain-specific nuances and reduce hate intensity. We masked 50% of the
hateful words in the comments identified as hateful and predicted alternative
words for the masked terms to generate convincing sentences. The optimal
replacement for the original hateful comment is then chosen from the feasible
sentences. Extensive experiments have been conducted on several recent
datasets using automatic metric-based evaluation (BERTScore) and thorough human
evaluation. To make the human evaluation more reliable, we recruited
three human annotators with varied expertise.
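The masking step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how hateful words are thresholded or which half of them is masked, so the function name `mask_hateful_words`, the `threshold` parameter, and the rule of masking the highest-attribution words first (rounding up) are all assumptions made for illustration, not the authors' implementation. The attribution scores are taken as given (for example, from Integrated Gradients); the MLM prediction and BERTScore-based selection steps are not shown.

```python
import math


def mask_hateful_words(tokens, attributions, threshold=0.0, fraction=0.5,
                       mask_token="[MASK]"):
    """Replace a fraction of the words flagged as hateful with a mask token.

    tokens:       list of word strings for one comment
    attributions: per-word attribution scores, same length as tokens
    threshold:    hypothetical cut-off above which a word counts as hateful
    fraction:     share of the hateful words to mask (0.5 in the abstract)
    """
    if len(tokens) != len(attributions):
        raise ValueError("tokens and attributions must have the same length")

    # Assumption: a word is "hateful" when its attribution exceeds the threshold.
    hateful = [i for i, score in enumerate(attributions) if score > threshold]

    # Assumption: mask the highest-attribution words first, rounding up so that
    # at least one word is masked whenever any hateful word is present.
    n_mask = math.ceil(fraction * len(hateful))
    ranked = sorted(hateful, key=lambda i: attributions[i], reverse=True)
    to_mask = set(ranked[:n_mask])

    return [mask_token if i in to_mask else tok for i, tok in enumerate(tokens)]


if __name__ == "__main__":
    words = ["this", "comment", "is", "stupid", "and", "useless"]
    scores = [0.01, 0.05, 0.02, 0.62, 0.0, 0.48]
    print(" ".join(mask_hateful_words(words, scores, threshold=0.1)))
```

The masked sentence would then be passed to a masked language model to propose replacement words; that step depends on the specific model used and is left out here.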
Related papers
- K-HATERS: A Hate Speech Detection Corpus in Korean with Target-Specific
Ratings [6.902524826065157]
K-HATERS is a new corpus for hate speech detection in Korean, comprising approximately 192K news comments with target-specific offensiveness ratings.
This study contributes to the NLP research on hate speech detection and resource construction.
arXiv Detail & Related papers (2023-10-24T01:20:05Z)
- HateRephrase: Zero- and Few-Shot Reduction of Hate Intensity in Online
Posts using Large Language Models [4.9711707739781215]
This paper investigates an approach of suggesting a rephrasing of potential hate speech content even before the post is made.
We develop 4 different prompts based on task description, hate definition, few-shot demonstrations and chain-of-thoughts.
We find that GPT-3.5 outperforms the baseline and open-source models for all the different kinds of prompts.
arXiv Detail & Related papers (2023-10-21T12:18:29Z)
- Model-Agnostic Meta-Learning for Multilingual Hate Speech Detection [23.97444551607624]
Hate speech in social media is a growing phenomenon, and detecting such toxic content has gained significant traction.
HateMAML is a model-agnostic meta-learning-based framework that effectively performs hate speech detection in low-resource languages.
Extensive experiments are conducted on five datasets across eight different low-resource languages.
arXiv Detail & Related papers (2023-03-04T22:28:29Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a
Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Assessing the impact of contextual information in hate speech detection [0.48369513656026514]
We provide a novel corpus for contextualized hate speech detection based on user responses to news posts from media outlets on Twitter.
This corpus was collected in the Rioplatense dialectal variety of Spanish and focuses on hate speech associated with the COVID-19 pandemic.
arXiv Detail & Related papers (2022-10-02T09:04:47Z)
- Improved two-stage hate speech classification for twitter based on Deep
Neural Networks [0.0]
Hate speech is a form of online harassment that involves the use of abusive language.
The model we propose in this work is an extension of an existing approach based on LSTM neural network architectures.
Our study includes a performance comparison of several proposed alternative methods for the second stage evaluated on a public corpus of 16k tweets.
arXiv Detail & Related papers (2022-06-08T20:57:41Z)
- Deep Learning for Hate Speech Detection: A Comparative Study [54.42226495344908]
We present here a large-scale empirical comparison of deep and shallow hate-speech detection methods.
Our goal is to illuminate progress in the area, and identify strengths and weaknesses in the current state-of-the-art.
In doing so we aim to provide guidance as to the use of hate-speech detection in practice, quantify the state-of-the-art, and identify future research directions.
arXiv Detail & Related papers (2022-02-19T03:48:20Z)
- COLD: A Benchmark for Chinese Offensive Language Detection [54.60909500459201]
We use COLDataset, a Chinese offensive language dataset with 37k annotated sentences.
We also propose COLDetector to study output offensiveness of popular Chinese language models.
Our resources and analyses are intended to help detoxify the Chinese online communities and evaluate the safety performance of generative language models.
arXiv Detail & Related papers (2022-01-16T11:47:23Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias
Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of
Language Models [86.02610674750345]
Adversarial GLUE (AdvGLUE) is a new multi-task benchmark to explore and evaluate the vulnerabilities of modern large-scale language models under various types of adversarial attacks.
We apply 14 adversarial attack methods to GLUE tasks to construct AdvGLUE, which is further validated by humans for reliable annotations.
All the language models and robust training methods we tested perform poorly on AdvGLUE, with scores lagging far behind the benign accuracy.
arXiv Detail & Related papers (2021-11-04T12:59:55Z)