Self-Supervised Euphemism Detection and Identification for Content
Moderation
- URL: http://arxiv.org/abs/2103.16808v1
- Date: Wed, 31 Mar 2021 04:52:38 GMT
- Authors: Wanzheng Zhu, Hongyu Gong, Rohan Bansal, Zachary Weinberg, Nicolas
Christin, Giulia Fanti, Suma Bhat
- Abstract summary: One common use of euphemisms is to evade content moderation policies enforced by social media platforms.
It is usually apparent to a human moderator that a word is being used euphemistically, but they may not know what the secret meaning is.
This paper will demonstrate unsupervised algorithms that can both detect words being used euphemistically, and identify the secret meaning of each word.
- Score: 16.322965299627974
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fringe groups and organizations have a long history of using
euphemisms--ordinary-sounding words with a secret meaning--to conceal what they
are discussing. Nowadays, one common use of euphemisms is to evade content
moderation policies enforced by social media platforms. Existing tools for
enforcing policy automatically rely on keyword searches for words on a "ban
list", but these are notoriously imprecise: even when limited to swearwords,
they can still cause embarrassing false positives. When a commonly used
ordinary word acquires a euphemistic meaning, adding it to a keyword-based ban
list is hopeless: consider "pot" (storage container or marijuana?) or "heater"
(household appliance or firearm?). The current generation of social media
companies instead hire staff to check posts manually, but this is expensive,
inhumane, and not much more effective. It is usually apparent to a human
moderator that a word is being used euphemistically, but they may not know what
the secret meaning is, and therefore whether the message violates policy. Also,
when a euphemism is banned, the group that used it need only invent another
one, leaving moderators one step behind.
This paper will demonstrate unsupervised algorithms that, by analyzing words
in their sentence-level context, can both detect words being used
euphemistically, and identify the secret meaning of each word. Compared to the
existing state of the art, which uses context-free word embeddings, our
algorithm for detecting euphemisms achieves 30-400% higher detection accuracies
of unlabeled euphemisms in a text corpus. Our algorithm for revealing
euphemistic meanings of words is the first of its kind, as far as we are aware.
In the arms race between content moderators and policy evaders, our algorithms
may help shift the balance in the direction of the moderators.
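The core intuition, that sentence-level context separates euphemistic from literal uses, can be illustrated with a toy sketch. This is not the paper's algorithm (which uses masked language models); it is a minimal bag-of-context-words baseline with invented function names and an invented three-sentence corpus, ranking candidate words by how similar their contexts are to those of known seed terms for a banned topic:

```python
from collections import Counter
import math

def context_vector(corpus, word, window=3):
    """Bag-of-words counts of terms appearing near `word` across the corpus."""
    vec = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo, hi = max(0, i - window), i + window + 1
                vec.update(t for j, t in enumerate(tokens[lo:hi], lo) if j != i)
    return vec

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank_candidates(corpus, seed_words, candidates):
    """Rank candidates by contextual similarity to known seed terms."""
    seed_vec = Counter()
    for s in seed_words:
        seed_vec.update(context_vector(corpus, s))
    scores = {c: cosine(context_vector(corpus, c), seed_vec) for c in candidates}
    return sorted(scores.items(), key=lambda kv: -kv[1])

corpus = [
    "he wants to buy some weed tonight",
    "she wants to buy some pot tonight",
    "the kettle is boiling on the stove",
]
# "pot" shares the context of the seed word "weed"; "kettle" does not.
ranking = rank_candidates(corpus, ["weed"], ["pot", "kettle"])
```

On this toy corpus, "pot" outranks "kettle" because it appears in the same purchase-related context as the seed word, which is precisely the signal a context-free ban list cannot capture.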
Related papers
- Impromptu Cybercrime Euphemism Detection [20.969469059941545]
We introduce the Impromptu Cybercrime Euphemisms Detection dataset.
We propose a detection framework tailored to this problem.
Our approach achieves a remarkable 76-fold improvement compared to the previous state-of-the-art euphemism detector.
arXiv Detail & Related papers (2024-12-02T11:56:06Z)
- Bridging Dictionary: AI-Generated Dictionary of Partisan Language Use [21.15400893251543]
Bridging Dictionary is an interactive tool designed to illuminate how words are perceived by people with different political views.
The Bridging Dictionary includes a static, printable document featuring 796 terms with summaries generated by a large language model.
Users can explore selected words, visualizing their frequency, sentiment, summaries, and examples across political divides.
arXiv Detail & Related papers (2024-07-12T19:44:40Z)
- Biomedical Named Entity Recognition via Dictionary-based Synonym Generalization [51.89486520806639]
We propose a novel Synonym Generalization (SynGen) framework that recognizes the biomedical concepts contained in the input text using span-based predictions.
We extensively evaluate our approach on a wide range of benchmarks and the results verify that SynGen outperforms previous dictionary-based models by notable margins.
arXiv Detail & Related papers (2023-05-22T14:36:32Z)
- Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topics.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z)
- Euphemistic Phrase Detection by Masked Language Model [9.49544185939481]
We perform phrase mining on a social media corpus to extract quality phrases.
Then, we utilize word embedding similarities to select a set of euphemistic phrase candidates.
We report 20-50% higher detection accuracies using our algorithm for detecting euphemistic phrases.
arXiv Detail & Related papers (2021-09-10T04:57:30Z)
- Semantic-Preserving Adversarial Text Attacks [85.32186121859321]
We propose a Bigram and Unigram based adaptive Semantic Preservation Optimization (BU-SPO) method to examine the vulnerability of deep models.
Our method achieves the highest attack success rates and semantics rates by changing the smallest number of words compared with existing methods.
arXiv Detail & Related papers (2021-08-23T09:05:18Z)
- Fluent: An AI Augmented Writing Tool for People who Stutter [47.10916891482696]
People who stutter (PWS) may adopt different strategies to conceal their stuttering.
One common strategy is word substitution, where an individual avoids saying a word they might stutter on and uses an alternative instead.
In this work, we present Fluent, an AI augmented writing tool which assists PWS in writing scripts which they can speak more fluently.
arXiv Detail & Related papers (2021-08-23T04:08:27Z)
- Towards Dark Jargon Interpretation in Underground Forums [37.15748678894555]
We present a novel method towards automatically identifying and interpreting dark jargons.
We formalize the problem as a mapping from dark words to "clean" words with no hidden meaning.
Our method makes use of interpretable representations of dark and clean words in the form of probability distributions over a shared vocabulary.
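The mapping idea in this summary can be sketched concretely. The sketch below is an illustrative toy, not the paper's method: each word is represented by a probability distribution over a shared context vocabulary (the words and probabilities are invented), and a dark word is mapped to the clean word whose distribution is closest under Jensen-Shannon divergence:

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence between two distributions over a shared vocabulary."""
    def kl(a, b):
        return sum(a[w] * math.log(a[w] / b[w]) for w in a if a[w] > 0)
    keys = set(p) | set(q)
    pp = {w: p.get(w, 1e-12) for w in keys}   # smooth missing entries
    qq = {w: q.get(w, 1e-12) for w in keys}
    m = {w: 0.5 * (pp[w] + qq[w]) for w in keys}
    return 0.5 * kl(pp, m) + 0.5 * kl(qq, m)

def interpret(dark_dist, clean_dists):
    """Map a dark word to the clean word with the most similar context distribution."""
    return min(clean_dists, key=lambda w: js_divergence(dark_dist, clean_dists[w]))

# Invented context distributions for illustration only.
clean = {
    "marijuana": {"smoke": 0.6, "high": 0.4},
    "laptop": {"keyboard": 0.5, "screen": 0.5},
}
dark_word_dist = {"smoke": 0.5, "high": 0.5}
best_match = interpret(dark_word_dist, clean)
```

Because both distributions are over the same vocabulary, the mapping is interpretable: one can inspect exactly which shared context words drive the match.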
arXiv Detail & Related papers (2020-11-05T18:08:32Z)
- Speakers Fill Lexical Semantic Gaps with Context [65.08205006886591]
We operationalise the lexical ambiguity of a word as the entropy of meanings it can take.
We find significant correlations between our estimate of ambiguity and the number of synonyms a word has in WordNet.
This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
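Operationalising ambiguity as the entropy of a word's sense distribution is straightforward; a minimal sketch (the sense counts are invented, e.g. frequencies one might take from WordNet):

```python
import math

def meaning_entropy(sense_counts):
    """Shannon entropy (in bits) of a word's distribution over senses."""
    total = sum(sense_counts)
    probs = [c / total for c in sense_counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

# A word with one dominant sense is less ambiguous than one with
# several equally likely senses:
# meaning_entropy([10])    -> 0.0 bits (unambiguous)
# meaning_entropy([5, 5])  -> 1.0 bit
```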
arXiv Detail & Related papers (2020-10-05T17:19:10Z)
- The Grievance Dictionary: Understanding Threatening Language Use [0.8373151777137792]
The Grievance Dictionary can be used to automatically understand language use in the context of grievance-fuelled violence threat assessment.
The dictionary was validated by applying it to texts written by violent and non-violent individuals.
arXiv Detail & Related papers (2020-09-10T12:06:48Z)
- Techniques for Vocabulary Expansion in Hybrid Speech Recognition Systems [54.49880724137688]
The problem of out of vocabulary words (OOV) is typical for any speech recognition system.
One popular approach to covering OOVs is to use subword units rather than words.
In this paper we explore different existing methods of this solution on both graph construction and search method levels.
arXiv Detail & Related papers (2020-03-19T21:24:45Z)
- Humpty Dumpty: Controlling Word Meanings via Corpus Poisoning [29.181547214915238]
We show that an attacker can control the "meaning" of new and existing words by changing their locations in the embedding space.
An attack on the embedding can affect diverse downstream tasks, demonstrating for the first time the power of data poisoning in transfer learning scenarios.
arXiv Detail & Related papers (2020-01-14T17:48:52Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.