Parsimonious Argument Annotations for Hate Speech Counter-narratives
- URL: http://arxiv.org/abs/2208.01099v1
- Date: Mon, 1 Aug 2022 18:58:32 GMT
- Title: Parsimonious Argument Annotations for Hate Speech Counter-narratives
- Authors: Damian A. Furman, Pablo Torres, Jose A. Rodriguez, Lautaro Martinez,
Laura Alonso Alemany, Diego Letzen, Maria Vanina Martinez
- Abstract summary: We present an enrichment of the Hateval corpus of hate speech tweets (Basile et al., 2019) aimed at facilitating automated counter-narrative generation.
We have also annotated tweets with argumentative information based on Wagemans (2016), which we believe can help in building convincing and effective counter-narratives for hate speech against particular groups.
Preliminary results show that automatic annotators perform close to human annotators in detecting some aspects of argumentation, while other aspects reach only low or moderate levels of inter-annotator agreement.
- Score: 4.825848785596437
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We present an enrichment of the Hateval corpus of hate speech tweets (Basile
et al., 2019) aimed at facilitating automated counter-narrative generation.
Similarly to previous work (Chung et al., 2019), manually written
counter-narratives are associated with the tweets. However, this information alone
seems insufficient to obtain satisfactory language models for counter-narrative
generation. That is why we have also annotated the tweets with argumentative
information based on Wagemans (2016), which we believe can help in building
convincing and effective counter-narratives for hate speech against particular
groups.
We discuss the adequacy and difficulties of this annotation process and present
several baselines for automatic detection of the annotated elements.
Preliminary results show that automatic annotators perform close to human
annotators in detecting some aspects of argumentation, while other aspects reach
only low or moderate levels of inter-annotator agreement.
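As a rough illustration of the evaluation described above, the sketch below compares an automatic annotator's labels against a human annotator's with Cohen's kappa. The label set and data are hypothetical placeholders, not the paper's actual annotation scheme.

```python
# Hypothetical sketch: comparing an automatic annotator against a human
# annotator with Cohen's kappa, one common way to quantify agreement.
# The labels below are placeholders, not the paper's actual scheme.
from sklearn.metrics import cohen_kappa_score

# One label per tweet for a single argumentative component (e.g. the
# targeted collective), from a human annotator and from a baseline model.
human_labels = ["women", "migrants", "women", "none", "migrants", "none"]
model_labels = ["women", "migrants", "none", "none", "migrants", "women"]

kappa = cohen_kappa_score(human_labels, model_labels)
print(f"Cohen's kappa (human vs. automatic): {kappa:.2f}")
```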
Related papers
- Consolidating Strategies for Countering Hate Speech Using Persuasive Dialogues [3.8979646385036175]
We explore controllable strategies for generating counter-arguments to hateful comments in online conversations.
Using automatic and human evaluations, we determine the best combination of features that generate fluent, argumentative, and logically sound arguments.
We share developed computational models for automatically annotating text with such features, and a silver-standard annotated version of an existing hate speech dialog corpus.
arXiv Detail & Related papers (2024-01-15T16:31:18Z)
- DisCGen: A Framework for Discourse-Informed Counterspeech Generation [34.75404551612012]
We propose a framework based on theories of discourse to study the inferential links that connect counter speeches to hateful comments.
We present a process for collecting an in-the-wild dataset of counterspeech from Reddit.
We show that by using our dataset and framework, large language models can generate contextually-grounded counterspeech informed by theories of discourse.
arXiv Detail & Related papers (2023-11-29T23:20:17Z)
- HARE: Explainable Hate Speech Detection with Step-by-Step Reasoning [29.519687405350304]
We introduce a hate speech detection framework, HARE, which harnesses the reasoning capabilities of large language models (LLMs) to fill gaps in explanations of hate speech.
Experiments on SBIC and Implicit Hate benchmarks show that our method, using model-generated data, consistently outperforms baselines.
Our method enhances the explanation quality of trained models and improves generalization to unseen datasets.
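As a minimal sketch of this step-by-step idea, the snippet below builds a reasoning-then-label prompt for a generic LLM backend. The prompt wording and the `generate` callable are assumptions for illustration, not HARE's actual templates or pipeline.

```python
from typing import Callable

def classify_with_reasoning(post: str, generate: Callable[[str], str]) -> str:
    """Ask an LLM to reason step by step before labeling a post.

    `generate` stands in for any text-completion backend; the prompt
    below is an illustrative guess, not HARE's actual template.
    """
    prompt = (
        "Analyze the following social media post step by step.\n"
        "1. Identify the group or person targeted, if any.\n"
        "2. Explain whether the language demeans or threatens that target.\n"
        "3. Conclude with a final label: HATE or NOT_HATE.\n\n"
        f"Post: {post}\n\nAnalysis:"
    )
    return generate(prompt)
```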
arXiv Detail & Related papers (2023-11-01T06:09:54Z)
- When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks [45.14664901245331]
A crucial problem in hate speech detection is determining whether a statement is offensive to a demographic group.
We construct a model that predicts individual annotator ratings on potentially offensive text.
We find that annotator ratings can be predicted using their demographic information and opinions on online content.
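A simple way to picture this is a rating predictor conditioned on both the text and the annotator's attributes. The sketch below uses TF-IDF text features, one-hot demographics, and a linear model purely as placeholders; it is not the paper's architecture.

```python
# Illustrative sketch only: predict an individual annotator's offensiveness
# rating from the text plus that annotator's demographics. The features and
# the linear model are assumptions, not the paper's actual model.
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

data = pd.DataFrame({
    "text": ["example post 1", "example post 2", "example post 3"],
    "annotator_age_band": ["18-29", "30-44", "18-29"],
    "annotator_gender": ["woman", "man", "nonbinary"],
    "rating": [4.0, 1.0, 3.0],   # per-annotator offensiveness rating
})

features = ColumnTransformer([
    ("text", TfidfVectorizer(), "text"),
    ("demo", OneHotEncoder(handle_unknown="ignore"),
     ["annotator_age_band", "annotator_gender"]),
])

model = Pipeline([("features", features), ("regressor", Ridge())])
model.fit(data, data["rating"])
```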
arXiv Detail & Related papers (2023-05-11T07:55:20Z)
- Controllable Mixed-Initiative Dialogue Generation through Prompting [50.03458333265885]
Mixed-initiative dialogue tasks involve repeated exchanges of information and conversational control.
Agents gain control by generating responses that follow particular dialogue intents or strategies, prescribed by a policy planner.
The standard approach has been to fine-tune pre-trained language models to perform generation conditioned on these intents.
We instead prompt large language models as a drop-in replacement for fine-tuning on conditional generation.
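A minimal sketch of such intent-conditioned prompting is shown below; the intent names and prompt wording are placeholders rather than the paper's templates.

```python
def intent_conditioned_prompt(history: list[str], intent: str) -> str:
    """Build a prompt that asks an LLM to respond with a given dialogue
    intent, instead of fine-tuning a model on that intent. Illustrative only.
    """
    dialogue = "\n".join(history)
    return (
        f"Conversation so far:\n{dialogue}\n\n"
        f"Respond to the last message using the strategy: {intent}.\n"
        "Response:"
    )

prompt = intent_conditioned_prompt(
    ["User: I can't sleep lately."],
    intent="ask a clarifying question",
)
```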
arXiv Detail & Related papers (2023-05-06T23:11:25Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- SpeechLMScore: Evaluating speech generation using speech language model [43.20067175503602]
We propose SpeechLMScore, an unsupervised metric to evaluate generated speech using a speech-language model.
It does not require human annotation and is a highly scalable framework.
Evaluation results demonstrate that the proposed metric shows a promising correlation with human evaluation scores on different speech generation tasks.
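Conceptually, metrics of this kind reduce to an average token log-likelihood over discretized speech units under a speech LM. The helper below sketches only that averaging step, assuming per-token probabilities are already available; it is not the authors' implementation.

```python
import math

def average_log_likelihood(token_probs: list[float]) -> float:
    """Average log-probability of a token sequence under a language model.

    SpeechLMScore-style metrics score generated speech by how likely its
    discretized units are under a speech LM; this helper only shows the
    averaging step, assuming per-token probabilities are already given.
    """
    return sum(math.log(p) for p in token_probs) / len(token_probs)

# Higher (closer to 0) means the sequence looks more natural to the LM.
score = average_log_likelihood([0.20, 0.35, 0.10, 0.25])
```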
arXiv Detail & Related papers (2022-12-08T21:00:15Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
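One standard way to counter such label imbalance is to weight the rare hate class more heavily during training. The snippet below shows balanced class weights with scikit-learn as a generic illustration, not the paper's cross-lingual neural setup.

```python
# Generic illustration of handling label imbalance with class weights;
# not the paper's cross-lingual neural setup.
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

labels = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])  # few hate (1) examples
weights = compute_class_weight(class_weight="balanced",
                               classes=np.array([0, 1]), y=labels)
# weights[1] > weights[0]: errors on the rare hate class cost more in the loss.
print(dict(zip([0, 1], weights)))
```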
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
- An Information Retrieval Approach to Building Datasets for Hate Speech Detection [3.587367153279349]
A common practice is to only annotate tweets containing known "hate words".
A second challenge is that definitions of hate speech tend to be highly variable and subjective.
Our key insight is that the rarity and subjectivity of hate speech are akin to those of relevance in information retrieval (IR).
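To make the IR analogy concrete, the sketch below mimics IR-style pooling: several scoring systems each nominate their top-ranked posts, and the union of those pools is sent for annotation. The scorers and pool depth are placeholders, not the paper's actual procedure.

```python
from typing import Callable

def build_annotation_pool(
    posts: list[str],
    scorers: list[Callable[[str], float]],
    depth: int = 100,
) -> set[str]:
    """IR-style pooling: each scorer nominates its top `depth` posts and
    the union is annotated, instead of sampling only posts with known
    hate words. Scorers and depth are illustrative placeholders.
    """
    pool: set[str] = set()
    for score in scorers:
        ranked = sorted(posts, key=score, reverse=True)
        pool.update(ranked[:depth])
    return pool
```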
arXiv Detail & Related papers (2021-06-17T19:25:39Z)
- Countering hate on social media: Large scale classification of hate and counter speech [0.0]
Hateful rhetoric is plaguing online discourse, fostering extreme societal movements and possibly giving rise to real-world violence.
A potential solution is citizen-generated counter speech where citizens actively engage in hate-filled conversations to attempt to restore civil non-polarized discourse.
Here we made use of a unique situation in Germany where self-labeling groups engaged in organized online hate and counter speech.
We used an ensemble learning algorithm which pairs a variety of paragraph embeddings with regularized logistic regression functions to classify both hate and counter speech in a corpus of millions of relevant tweets from these two groups.
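A stripped-down, single-model sketch of the "paragraph embedding plus regularized logistic regression" pairing is shown below, using gensim's Doc2Vec; the real system is an ensemble over many such pairings, and the data and hyperparameters here are placeholders.

```python
# Single-model sketch of "paragraph embedding + regularized logistic
# regression"; the actual system ensembles many such pairings, and these
# data points and hyperparameters are placeholders.
from gensim.models.doc2vec import Doc2Vec, TaggedDocument
from sklearn.linear_model import LogisticRegression

tweets = [("this group should be banned from our country", 1),   # hate
          ("hateful posts like this only spread fear", 0)]       # counter speech

corpus = [TaggedDocument(words=text.split(), tags=[i])
          for i, (text, _) in enumerate(tweets)]
embedder = Doc2Vec(corpus, vector_size=50, min_count=1, epochs=20)

X = [embedder.infer_vector(text.split()) for text, _ in tweets]
y = [label for _, label in tweets]

clf = LogisticRegression(C=1.0, penalty="l2").fit(X, y)  # L2-regularized
```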
arXiv Detail & Related papers (2020-06-02T23:12:52Z)
- Learning an Unreferenced Metric for Online Dialogue Evaluation [53.38078951628143]
We propose an unreferenced automated evaluation metric that uses large pre-trained language models to extract latent representations of utterances.
We show that our model achieves higher correlation with human annotations in an online setting, while not requiring true responses for comparison during inference.
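As a rough stand-in for this idea, the snippet below extracts utterance representations with a pre-trained LM and compares context and response by cosine similarity; the actual metric learns a scoring function on top of such representations rather than using raw similarity.

```python
# Rough stand-in: extract utterance representations with a pre-trained LM
# and compare context and response by cosine similarity. The actual metric
# learns a scoring function on top of such representations.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(utterance: str) -> torch.Tensor:
    inputs = tokenizer(utterance, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)              # mean-pooled utterance vector

context = embed("How was your weekend?")
response = embed("Pretty good, I went hiking with friends.")
score = torch.cosine_similarity(context, response, dim=0).item()
```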
arXiv Detail & Related papers (2020-05-01T20:01:39Z)