The State of Profanity Obfuscation in Natural Language Processing
- URL: http://arxiv.org/abs/2210.07595v1
- Date: Fri, 14 Oct 2022 07:45:36 GMT
- Title: The State of Profanity Obfuscation in Natural Language Processing
- Authors: Debora Nozza, Dirk Hovy
- Abstract summary: Obfuscating profanities makes it challenging to evaluate the content, especially for non-native speakers.
We suggest a multilingual community resource called PrOf that has a Python module to standardize profanity obfuscation processes.
- Score: 29.95449849179384
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Work on hate speech has made the consideration of rude and harmful examples
in scientific publications inevitable. This raises various problems, such as
whether or not to obscure profanities. While science must accurately disclose
what it does, the unwarranted spread of hate speech is harmful to readers, and
increases its internet frequency. While maintaining publications' professional
appearance, obfuscating profanities makes it challenging to evaluate the
content, especially for non-native speakers. Surveying 150 ACL papers, we
discovered that obfuscation is usually employed for English but not for other
languages, and even then, quite unevenly. We discuss the problems with obfuscation
and suggest a multilingual community resource called PrOf that has a Python
module to standardize profanity obfuscation processes. We believe PrOf can help
scientific publication policies to make hate speech work accessible and
comparable, irrespective of language.
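The abstract describes PrOf's Python module only at a high level. As a concrete illustration, a common obfuscation convention in NLP papers keeps the first letter of a term and masks the rest; the minimal sketch below implements that convention. The function name, defaults, and masking style are illustrative assumptions, not PrOf's actual API.

# Hypothetical sketch of a standardized obfuscation step, in the spirit
# of what a module like PrOf could expose. The function name, parameters,
# and masking convention are assumptions, not PrOf's real API.

def obfuscate(word: str, keep: int = 1, mask: str = "*") -> str:
    """Keep the first `keep` characters and mask the rest, e.g. "damn" -> "d***"."""
    if len(word) <= keep:
        return mask * len(word)
    return word[:keep] + mask * (len(word) - keep)

if __name__ == "__main__":
    # Mild examples in several languages; a shared convention keeps
    # obfuscated terms comparable across papers irrespective of language.
    for term in ["damn", "merde", "cavolo"]:
        print(term, "->", obfuscate(term))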
Related papers
- NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps [43.40965978436158]
Counterspeech that refutes problematic content often mentions harmful language but is not harmful itself.
We show that even recent language models fail at distinguishing use from mention.
This failure propagates to two key downstream tasks: misinformation and hate speech detection.
arXiv Detail & Related papers (2024-04-02T05:36:41Z)
- Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales [15.458557611029518]
Social media platforms are a prominent arena for users to engage in interpersonal discussions and express opinions.
This creates a need to automatically identify and flag instances of hate speech.
We propose to use state-of-the-art Large Language Models (LLMs) to extract features in the form of rationales from the input text.
arXiv Detail & Related papers (2024-03-19T03:22:35Z)
- An Investigation of Large Language Models for Real-World Hate Speech Detection [46.15140831710683]
A major limitation of existing methods is that hate speech detection is a highly contextual problem.
Recently, large language models (LLMs) have demonstrated state-of-the-art performance in several natural language tasks.
Our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech; a hedged sketch of such a prompt follows this entry.
arXiv Detail & Related papers (2024-01-07T00:39:33Z)
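The summary above reports that a carefully crafted reasoning prompt captures the context of hate speech, but it does not reproduce the prompt itself. The following is a minimal sketch of what such a prompt might look like; the wording, fields, and label set are assumptions for illustration, not the paper's actual prompt.

# Hypothetical reasoning-prompt template for context-aware hate speech
# detection. The template text and fields are illustrative assumptions,
# not the prompt used in the paper.

PROMPT_TEMPLATE = """You are a content moderation assistant.

Conversation context:
{context}

Target message:
{message}

First, reason step by step about who is addressed, whether slurs are
used or merely mentioned, and whether the message attacks a protected
group. Then answer with exactly one label: HATE or NOT_HATE."""

def build_prompt(context: str, message: str) -> str:
    return PROMPT_TEMPLATE.format(context=context, message=message)

if __name__ == "__main__":
    print(build_prompt("User A: I can't believe they said that.",
                       "User B: People like them don't belong here."))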
- Automatic Translation of Hate Speech to Non-hate Speech in Social Media Texts [0.0]
We present a novel task of translating hate speech into non-hate speech text while preserving its meaning.
We provide a dataset and several baselines as a starting point for further research.
The aim of this study is to contribute to the development of more effective methods for reducing the spread of hate speech in online communities.
arXiv Detail & Related papers (2023-06-02T04:03:14Z)
- CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z)
- Assessing the impact of contextual information in hate speech detection [0.48369513656026514]
We provide a novel corpus for contextualized hate speech detection based on user responses to news posts from media outlets on Twitter.
This corpus was collected in the Rioplatense dialectal variety of Spanish and focuses on hate speech associated with the COVID-19 pandemic.
arXiv Detail & Related papers (2022-10-02T09:04:47Z)
- Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topics.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
- Addressing the Challenges of Cross-Lingual Hate Speech Detection [115.1352779982269]
In this paper we focus on cross-lingual transfer learning to support hate speech detection in low-resource languages.
We leverage cross-lingual word embeddings to train our neural network systems on the source language and apply them to the target language.
We investigate the issue of label imbalance in hate speech datasets, since the high ratio of non-hate to hate examples often leads to low model performance; a sketch of one common mitigation follows this entry.
arXiv Detail & Related papers (2022-01-15T20:48:14Z)
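The entry above flags label imbalance, but the summary does not detail the paper's remedy. One common mitigation, not necessarily the authors' method, is to weight the training loss by inverse class frequency. A minimal sketch, assuming binary hate/non-hate labels:

# One common mitigation for label imbalance (not necessarily the paper's
# method): weight the loss by inverse class frequency so rare hate
# examples contribute more to the gradient.
from collections import Counter

def inverse_frequency_weights(labels: list[int]) -> dict[int, float]:
    """Return per-class weights proportional to 1 / class frequency."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: total / (len(counts) * n) for cls, n in counts.items()}

if __name__ == "__main__":
    # 90% non-hate (0) vs. 10% hate (1), a typical skew in such datasets.
    labels = [0] * 90 + [1] * 10
    print(inverse_frequency_weights(labels))  # {0: ~0.56, 1: 5.0}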
- Leveraging Multilingual Transformers for Hate Speech Detection [11.306581296760864]
We leverage state-of-the-art Transformer language models to identify hate speech in a multilingual setting.
With a pre-trained multilingual Transformer-based text encoder at the base, we are able to successfully identify and classify hate speech from multiple languages; a minimal loading sketch follows this entry.
arXiv Detail & Related papers (2021-01-08T20:23:50Z)
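The summary above does not name the pre-trained encoder. As an illustrative assumption, the sketch below loads XLM-RoBERTa through the Hugging Face transformers library with a binary classification head; the paper's actual encoder and setup may differ.

# Illustrative setup of a multilingual Transformer encoder for binary
# hate speech classification. XLM-RoBERTa is an assumption here; the
# paper's actual encoder and head may differ.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "xlm-roberta-base", num_labels=2  # hate vs. non-hate
)

# The same tokenizer and encoder handle inputs in many languages.
batch = tokenizer(["I hate rainy days.", "Odio los lunes."],
                  padding=True, return_tensors="pt")
logits = model(**batch).logits  # fine-tuning on labelled data is required
print(logits.shape)             # torch.Size([2, 2])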