Beyond Denouncing Hate: Strategies for Countering Implied Biases and
Stereotypes in Language
- URL: http://arxiv.org/abs/2311.00161v1
- Date: Tue, 31 Oct 2023 21:33:46 GMT
- Title: Beyond Denouncing Hate: Strategies for Countering Implied Biases and
Stereotypes in Language
- Authors: Jimin Mun, Emily Allaway, Akhila Yerukola, Laura Vianna, Sarah-Jane
Leslie, Maarten Sap
- Abstract summary: We draw from psychology and philosophy literature to craft six psychologically inspired strategies to challenge the underlying stereotypical implications of hateful language.
We show that human-written counterspeech uses strategies that are more specific to the implied stereotype, whereas machine-generated counterspeech uses less specific strategies.
Our findings point to the importance of accounting for the underlying stereotypical implications of speech when generating counterspeech.
- Score: 18.560379338032558
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Counterspeech, i.e., responses to counteract potential harms of hateful
speech, has become an increasingly popular solution to address online hate
speech without censorship. However, properly countering hateful language
requires countering and dispelling the underlying inaccurate stereotypes
implied by such language. In this work, we draw from psychology and philosophy
literature to craft six psychologically inspired strategies to challenge the
underlying stereotypical implications of hateful language. We first examine the
convincingness of each of these strategies through a user study, and then
compare their usages in both human- and machine-generated counterspeech
datasets. Our results show that human-written counterspeech uses countering
strategies that are more specific to the implied stereotype (e.g., counter
examples to the stereotype, external factors about the stereotype's origins),
whereas machine-generated counterspeech uses less specific strategies (e.g.,
generally denouncing the hatefulness of speech). Furthermore, machine-generated
counterspeech often employs strategies that humans deem less convincing
compared to human-produced counterspeech. Our findings point to the importance
of accounting for the underlying stereotypical implications of speech when
generating counterspeech and for better machine reasoning about
anti-stereotypical examples.
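
To make the comparison described above concrete, here is a minimal, hypothetical sketch (not the authors' released code) of how strategy usage could be tallied and compared between human- and machine-generated counterspeech once each response has been labeled with a countering strategy. All identifiers, strategy labels, and example data below are illustrative assumptions.

```python
from collections import Counter

# Illustrative strategy labels only; the paper defines six psychologically
# inspired strategies, of which the abstract explicitly names counterexamples
# to the stereotype, external factors about the stereotype's origins, and
# generally denouncing hatefulness. These identifiers are hypothetical
# placeholders, not the authors' taxonomy.
STRATEGIES = ["counterexample", "external_factors", "general_denouncing", "other"]

def strategy_frequencies(labeled_responses):
    """Return the fraction of counterspeech responses using each strategy.

    `labeled_responses` is a list of (response_text, strategy) pairs; in
    practice the strategy labels would come from human annotators or a
    trained classifier.
    """
    counts = Counter(strategy for _, strategy in labeled_responses)
    total = sum(counts.values()) or 1
    return {s: counts[s] / total for s in STRATEGIES}

# Toy comparison of human-written vs. machine-generated counterspeech.
human_labeled = [
    ("Plenty of people in that group do the opposite of what the claim says.", "counterexample"),
    ("That stereotype traces back to biased historical reporting.", "external_factors"),
]
machine_labeled = [
    ("Hate speech like this is unacceptable.", "general_denouncing"),
    ("This comment is hateful and wrong.", "general_denouncing"),
]

for source, data in [("human", human_labeled), ("machine", machine_labeled)]:
    print(source, strategy_frequencies(data))
```

Under this sketch, the human-written set would show higher frequencies for stereotype-specific strategies, while the machine-generated set would concentrate on general denouncing, mirroring the pattern the abstract reports.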
Related papers
- Assessing the Human Likeness of AI-Generated Counterspeech [10.434435022492723]
Counterspeech is a targeted response to counteract and challenge abusive or hateful content.
Previous studies have proposed different strategies for automatically generated counterspeech.
We investigate the human likeness of AI-generated counterspeech, a critical factor influencing effectiveness.
arXiv Detail & Related papers (2024-10-14T18:48:47Z)
- Hatred Stems from Ignorance! Distillation of the Persuasion Modes in Countering Conversational Hate Speech [0.0]
This study distills persuasion modes into reason, emotion, and credibility.
It evaluates their use in two types of conversation interactions, closed (multi-turn) and open (single-turn), concerning racism, sexism, and religious bigotry.
arXiv Detail & Related papers (2024-03-18T07:20:35Z)
- An Investigation of Large Language Models for Real-World Hate Speech Detection [46.15140831710683]
A major limitation of existing methods is that hate speech detection is a highly contextual problem.
Recently, large language models (LLMs) have demonstrated state-of-the-art performance in several natural language tasks.
Our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech.
arXiv Detail & Related papers (2024-01-07T00:39:33Z)
- DisCGen: A Framework for Discourse-Informed Counterspeech Generation [34.75404551612012]
We propose a framework based on theories of discourse to study the inferential links that connect counterspeech to hateful comments.
We present a process for collecting an in-the-wild dataset of counterspeech from Reddit.
We show that by using our dataset and framework, large language models can generate contextually-grounded counterspeech informed by theories of discourse.
arXiv Detail & Related papers (2023-11-29T23:20:17Z)
- ZET-Speech: Zero-shot adaptive Emotion-controllable Text-to-Speech Synthesis with Diffusion and Style-based Models [83.07390037152963]
ZET-Speech is a zero-shot adaptive emotion-controllable TTS model.
It allows users to synthesize any speaker's emotional speech using only a short, neutral speech segment and the target emotion label.
Experimental results demonstrate that ZET-Speech successfully synthesizes natural and emotional speech with the desired emotion for both seen and unseen speakers.
arXiv Detail & Related papers (2023-05-23T08:52:00Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale [61.555788332182395]
We investigate the potential for machine learning models to amplify dangerous and complex stereotypes.
We find a broad range of ordinary prompts produce stereotypes, including prompts simply mentioning traits, descriptors, occupations, or objects.
arXiv Detail & Related papers (2022-11-07T18:31:07Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply it to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- Hate Speech Classifiers Learn Human-Like Social Stereotypes [4.132204773132937]
Social stereotypes negatively impact individuals' judgements about different groups.
Social stereotypes may have a critical role in how people understand language directed toward minority social groups.
arXiv Detail & Related papers (2021-10-28T01:35:41Z)
- Impact and dynamics of hate and counter speech online [0.0]
Citizen-generated counter speech is a promising way to fight hate speech and promote peaceful, non-polarized discourse.
We analyze 180,000 political conversations that took place on German Twitter over four years.
arXiv Detail & Related papers (2020-09-16T01:43:28Z)
- A Framework for the Computational Linguistic Analysis of Dehumanization [52.735780962665814]
We analyze discussions of LGBTQ people in the New York Times from 1986 to 2015.
We find increasingly humanizing descriptions of LGBTQ people over time.
The ability to analyze dehumanizing language at a large scale has implications for automatically detecting and understanding media bias as well as abusive language online.
arXiv Detail & Related papers (2020-03-06T03:02:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.