CounterGeDi: A controllable approach to generate polite, detoxified and
emotional counterspeech
- URL: http://arxiv.org/abs/2205.04304v1
- Date: Mon, 9 May 2022 14:10:57 GMT
- Authors: Punyajoy Saha, Kanishk Singh, Adarsh Kumar, Binny Mathew and Animesh
Mukherjee
- Abstract summary: We propose CounterGeDi to guide the generation of a DialoGPT model toward more polite, detoxified, and emotionally laden counterspeech.
We generate counterspeech using three datasets and observe significant improvement across different attribute scores.
- Score: 7.300229659237878
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, many studies have tried to create generation models to assist
counter speakers by providing counterspeech suggestions for combating the
explosive proliferation of online hate. However, since these suggestions are
from a vanilla generation model, they might not include the appropriate
properties required to counter a particular hate speech instance. In this
paper, we propose CounterGeDi - an ensemble of generative discriminators (GeDi)
to guide the generation of a DialoGPT model toward more polite, detoxified, and
emotionally laden counterspeech. We generate counterspeech using three datasets
and observe significant improvement across different attribute scores. The
politeness and detoxification scores increased by around 15% and 6%,
respectively, while the emotion in the counterspeech increased by at least 10%
across all the datasets. We also experiment with triple-attribute control and
observe significant improvement over single-attribute results when combining
complementary attributes, e.g., politeness, joyfulness and detoxification. In
all these experiments, the relevance of the generated text does not deteriorate
due to the application of these controls.
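GeDi-style guidance works by Bayes-rule re-weighting at decoding time: each discriminator is a class-conditional language model queried under a positive and a negative control code, and the resulting attribute posterior rescales the base model's next-token distribution. The sketch below illustrates one such decoding step with toy log-probabilities standing in for DialoGPT and the discriminators; the dictionary interface and the `omega` control strengths are illustrative assumptions, not the authors' implementation.

```python
import math

def gedi_ensemble_step(base_logprobs, discriminators):
    """One GeDi-style guided decoding step (a sketch).

    base_logprobs: {token: log p_LM(token | history)} from the base model.
    discriminators: list of (pos_logprobs, neg_logprobs, omega) tuples, one
    per controlled attribute (e.g. politeness, detoxification, joy).
    Returns re-normalised log-probs biased toward on-attribute tokens.
    """
    guided = {}
    for tok, lp in base_logprobs.items():
        score = lp
        for pos, neg, omega in discriminators:
            # Bayes rule over the two control codes gives the posterior
            # p(attribute | history, tok); omega sharpens the control.
            attr_lp = pos[tok] - math.log(math.exp(pos[tok]) + math.exp(neg[tok]))
            score += omega * attr_lp
        guided[tok] = score
    log_z = math.log(sum(math.exp(s) for s in guided.values()))
    return {tok: s - log_z for tok, s in guided.items()}

# Toy example: the politeness discriminator prefers "thanks" over "idiot".
base = {"thanks": math.log(0.4), "idiot": math.log(0.6)}
polite = ({"thanks": math.log(0.9), "idiot": math.log(0.1)},
          {"thanks": math.log(0.1), "idiot": math.log(0.9)}, 2.0)
print(gedi_ensemble_step(base, [polite]))
```

Because the base model's log-probability still dominates each token's score, fluency and topical relevance are largely preserved under control, consistent with the abstract's claim that relevance does not deteriorate.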
Related papers
- A Target-Aware Analysis of Data Augmentation for Hate Speech Detection [3.858155067958448]
Hate speech is one of the main threats posed by the widespread use of social networks.
We investigate the possibility of augmenting existing data with generative language models, reducing target imbalance.
For some hate categories such as origin, religion, and disability, hate speech classification using augmented data for training improves by more than 10% F1 over the no augmentation baseline.
arXiv Detail & Related papers (2024-10-10T15:46:27Z)
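A minimal sketch of the target-aware augmentation idea above: generate synthetic examples only for under-represented target categories (such as origin, religion, or disability) until each reaches a floor. The `generate_example` stub and the balancing threshold are assumptions standing in for the paper's generative language model.

```python
import random
from collections import Counter

def generate_example(target):
    """Stub for a generative LM prompted to produce a synthetic
    training example about the given target category."""
    return f"<synthetic example targeting {target}>"

def augment_to_balance(dataset, min_per_target=1000):
    """dataset: list of (text, target) pairs. Adds generated examples for
    targets below the threshold, reducing target imbalance."""
    counts = Counter(target for _, target in dataset)
    augmented = list(dataset)
    for target, n in counts.items():
        for _ in range(max(0, min_per_target - n)):
            augmented.append((generate_example(target), target))
    random.shuffle(augmented)
    return augmented
```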
- CrowdCounter: A benchmark type-specific multi-target counterspeech dataset [10.133642589954192]
We introduce a new dataset, CrowdCounter, containing 3,425 hate speech-counterspeech pairs.
The design of our annotation platform itself encourages annotators to write type-specific, non-redundant and high-quality counterspeech.
We evaluate two frameworks for generating counterspeech responses - vanilla and type-controlled prompts.
arXiv Detail & Related papers (2024-10-02T10:24:51Z)
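The vanilla and type-controlled prompting frameworks evaluated above can be contrasted in a few lines. The prompt wording and the example counterspeech types below are illustrative guesses, not the dataset's actual specification.

```python
def vanilla_prompt(hate_speech):
    """Unconstrained request for counterspeech."""
    return f'Write a counterspeech response to: "{hate_speech}"'

def type_controlled_prompt(hate_speech, cs_type):
    """cs_type names a counterspeech strategy, e.g. "empathy",
    "warning of consequences", or "pointing out contradictions"."""
    return (f"Write a counterspeech response of type '{cs_type}' "
            f'to: "{hate_speech}"')

print(type_controlled_prompt("<hateful comment>", "empathy"))
```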
- Towards Unsupervised Speech Recognition Without Pronunciation Models [57.222729245842054]
Most languages lack sufficient paired speech and text data to effectively train automatic speech recognition systems.
We propose removing the reliance on a phoneme lexicon when developing unsupervised ASR systems.
We experimentally demonstrate that an unsupervised speech recognizer can emerge from joint speech-to-speech and text-to-text masked token-infilling.
arXiv Detail & Related papers (2024-06-12T16:30:58Z)
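A generic sketch of the masked token-infilling objective named above, applied to a sequence of discrete units; the same corruption can be applied to quantised speech units and to text tokens. The mask rate and span length are assumed hyper-parameters, not the paper's values.

```python
import random

MASK = "<mask>"

def mask_spans(tokens, mask_prob=0.3, span_len=3):
    """Replace random spans with MASK; return (corrupted, targets).
    The model is trained to infill the masked positions, with the same
    objective for discretised speech units and for text tokens."""
    corrupted, targets = list(tokens), {}
    i = 0
    while i < len(tokens):
        if random.random() < mask_prob:
            for j in range(i, min(i + span_len, len(tokens))):
                targets[j] = tokens[j]
                corrupted[j] = MASK
            i += span_len
        else:
            i += 1
    return corrupted, targets

units = ["u12", "u7", "u7", "u31", "u2", "u18"]  # e.g. quantised speech units
print(mask_spans(units))
```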
- Outcome-Constrained Large Language Models for Countering Hate Speech [10.434435022492723]
This study aims to develop methods for generating counterspeech constrained by conversation outcomes.
We experiment with large language models (LLMs) to incorporate two desired conversation outcomes into the text generation process.
Evaluation results show that our methods effectively steer the generation of counterspeech toward the desired outcomes.
arXiv Detail & Related papers (2024-03-25T19:44:06Z)
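One simple way to fold a desired conversation outcome into generation is to state it as an explicit constraint in the prompt. The outcome inventory and the template below are assumptions for illustration; the summary does not specify the authors' exact conditioning method.

```python
# Hypothetical outcome inventory; the paper targets conversation-level
# outcomes of the exchange following the counterspeech.
OUTCOMES = {
    "reentry": "the original speaker re-engages in the conversation",
    "no_hate": "the original speaker does not post further hate",
}

def outcome_constrained_prompt(hate_speech, outcome_key):
    """Build an LLM prompt that asks for counterspeech steered toward
    the named conversation outcome."""
    goal = OUTCOMES[outcome_key]
    return (f'Hate speech: "{hate_speech}"\n'
            f"Write a counterspeech reply such that {goal}.")

print(outcome_constrained_prompt("<hateful comment>", "no_hate"))
```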
- DisCGen: A Framework for Discourse-Informed Counterspeech Generation [34.75404551612012]
We propose a framework based on theories of discourse to study the inferential links that connect counterspeech to hateful comments.
We present a process for collecting an in-the-wild dataset of counterspeech from Reddit.
We show that by using our dataset and framework, large language models can generate contextually-grounded counterspeech informed by theories of discourse.
arXiv Detail & Related papers (2023-11-29T23:20:17Z)
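A compact way to make the discourse link explicit is to map each inferential relation to a generation instruction. The relation inventory and wording below are hypothetical, not the framework's actual taxonomy.

```python
# Hypothetical mapping from discourse relations to instructions.
RELATION_INSTRUCTIONS = {
    "contradiction": "Point out the internal contradiction in the comment.",
    "counter-evidence": "Provide facts that contradict the claim.",
    "consequence": "Describe the harmful consequences of this statement.",
}

def discourse_prompt(hate_comment, relation):
    """Build a prompt whose instruction encodes the discourse relation
    that should link the counterspeech to the hateful comment."""
    return (f'Comment: "{hate_comment}"\n'
            f"{RELATION_INSTRUCTIONS[relation]} Reply respectfully.")

print(discourse_prompt("<hateful comment>", "counter-evidence"))
```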
- Leveraging Implicit Feedback from Deployment Data in Dialogue [83.02878726357523]
We study improving social conversational agents by learning from natural dialogue between users and a deployed model.
We leverage signals such as user response length, sentiment, and the reaction in future human utterances in the collected dialogue episodes.
arXiv Detail & Related papers (2023-07-26T11:34:53Z)
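The implicit signals named above can be collapsed into a scalar reward per bot utterance by inspecting the user's next turn. The weighting and the sentiment stub below are assumptions for illustration.

```python
def sentiment(text):
    """Stub for a sentiment classifier returning a score in [-1, 1]."""
    return 0.5 if "thanks" in text.lower() else 0.0

def implicit_reward(bot_turn, next_user_turn):
    """Score a deployed model's response from the user's reaction:
    longer, more positive follow-ups count as implicit approval."""
    length_signal = min(len(next_user_turn.split()) / 20.0, 1.0)
    return 0.5 * length_signal + 0.5 * sentiment(next_user_turn)

# Episodes scored this way can be filtered into "good" examples for
# further fine-tuning of the conversational agent.
print(implicit_reward("Here is a tip...", "Thanks, that really helped!"))
```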
- Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense [56.077252790310176]
We present a paraphrase generation model (DIPPER) that can paraphrase paragraphs, condition on surrounding context, and control lexical diversity and content reordering.
Using DIPPER to paraphrase text generated by three large language models (including GPT3.5-davinci-003) successfully evades several detectors, including watermarking.
We introduce a simple defense that relies on retrieving semantically-similar generations and must be maintained by a language model API provider.
arXiv Detail & Related papers (2023-03-23T16:29:27Z)
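The retrieval defense has the API provider index every generation it serves and flag a candidate text whenever it retrieves a close semantic match, which survives paraphrasing far better than per-text detectors. The sketch below substitutes a dependency-free bag-of-words cosine for the semantic encoder a real deployment would use.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; a real deployment would use a
    semantic encoder so paraphrases still land near the original."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class RetrievalDetector:
    """Maintained by the API provider: index every generation, then flag
    candidate texts that retrieve a close match."""
    def __init__(self, threshold=0.75):
        self.corpus, self.threshold = [], threshold

    def record_generation(self, text):
        self.corpus.append(embed(text))

    def is_ai_generated(self, candidate):
        c = embed(candidate)
        return any(cosine(c, doc) >= self.threshold for doc in self.corpus)
```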
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper, we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance of hate speech datasets, since the high ratio of non-hate examples compared to hate examples often leads to low model performance.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
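A standard remedy for the label imbalance discussed above is to weight the training loss inversely to class frequency. The scheme below is a common choice rather than necessarily the paper's exact one.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """labels: e.g. ['non-hate', 'non-hate', 'hate', ...].
    Returns per-class weights that up-weight the rare 'hate' class so the
    high ratio of non-hate examples does not dominate training."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}

print(inverse_frequency_weights(["non-hate"] * 90 + ["hate"] * 10))
# {'non-hate': 0.55..., 'hate': 5.0} -> pass as class weights to the loss.
```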
- Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech [9.49544185939481]
Off-the-shelf Natural Language Generation (NLG) methods are limited in that they generate commonplace, repetitive and safe responses.
In this paper, we design a three-module pipeline approach to effectively improve the diversity and relevance.
Our proposed pipeline first generates various counterspeech candidates by a generative model to promote diversity, then filters the ungrammatical ones using a BERT model, and finally selects the most relevant counterspeech response.
arXiv Detail & Related papers (2021-06-03T06:54:03Z)
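The three-module pipeline reads directly as code; the stubs below stand in for the generative model, the BERT grammaticality filter, and the relevance scorer described above.

```python
def generate_candidates(hate_speech, n=10):
    """Stub: sample n diverse counterspeech candidates from a generator."""
    return [f"<candidate {i} for: {hate_speech}>" for i in range(n)]

def is_grammatical(text):
    """Stub for the BERT-based grammaticality filter."""
    return True

def relevance(text, hate_speech):
    """Stub relevance score between a candidate and the hate speech;
    here, naive word overlap."""
    return len(set(text.split()) & set(hate_speech.split()))

def generate_prune_select(hate_speech):
    candidates = generate_candidates(hate_speech)            # 1. generate
    pruned = [c for c in candidates if is_grammatical(c)]    # 2. prune
    return max(pruned, key=lambda c: relevance(c, hate_speech))  # 3. select
```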
- You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation [59.31769998728787]
We build our TTS system on an ASR training database and then extend the data with synthesized speech to train a recognition model.
Our system establishes a competitive result for end-to-end ASR trained on the LibriSpeech train-clean-100 set, with a WER of 4.3% on test-clean and 13.5% on test-other.
arXiv Detail & Related papers (2020-05-14T17:24:57Z)
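The augmentation loop itself is straightforward: synthesize audio for text-only transcripts with the TTS system and append the pairs to the real training set. The `synthesize` stub marks the assumed interface.

```python
def synthesize(text):
    """Stub for the TTS system trained on the ASR database; returns a
    waveform for the given transcript."""
    return [0.0] * 16000  # placeholder one-second waveform at 16 kHz

def extend_asr_dataset(real_pairs, extra_texts):
    """real_pairs: list of (waveform, transcript) from the ASR corpus.
    extra_texts: transcripts without audio. Returns the enlarged set
    used to train the recognition model."""
    synthetic = [(synthesize(t), t) for t in extra_texts]
    return real_pairs + synthetic
```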
- Multi-task self-supervised learning for Robust Speech Recognition [75.11748484288229]
This paper proposes PASE+, an improved version of PASE for robust speech recognition in noisy and reverberant environments.
We employ an online speech distortion module that contaminates the input signals with a variety of random disturbances.
We then propose a revised encoder that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks.
arXiv Detail & Related papers (2020-01-25T00:24:45Z)
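An online distortion module of this kind can be as simple as sampling one random corruption per training example. The distortion set and parameters below are illustrative; PASE+ itself draws from a wider menu including reverberation, overlap and frequency masking.

```python
import random

def add_noise(wave, snr=0.1):
    """Additive random noise."""
    return [s + random.uniform(-snr, snr) for s in wave]

def clip(wave, limit=0.5):
    """Amplitude clipping."""
    return [max(-limit, min(limit, s)) for s in wave]

def random_chunk_drop(wave, max_len=100):
    """Zero out a random temporal chunk."""
    i = random.randrange(max(1, len(wave) - max_len))
    return wave[:i] + [0.0] * max_len + wave[i + max_len:]

DISTORTIONS = [add_noise, clip, random_chunk_drop]

def online_distort(wave):
    """Contaminate the input signal with a randomly chosen disturbance,
    applied on the fly during self-supervised training."""
    return random.choice(DISTORTIONS)(wave)
```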