Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable
Topics for the Russian Language
- URL: http://arxiv.org/abs/2203.02392v1
- Date: Fri, 4 Mar 2022 15:59:06 GMT
- Title: Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable
Topics for the Russian Language
- Authors: Nikolay Babakov, Varvara Logacheva, Alexander Panchenko
- Abstract summary: We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topic.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
- Score: 76.58220021791955
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Toxicity on the Internet, such as hate speech, offences towards particular
users or groups of people, or the use of obscene words, is an acknowledged
problem. However, there also exist other types of inappropriate messages which
are usually not viewed as toxic, e.g. because they do not contain explicit offences.
Such messages can contain covert toxicity or generalizations, incite harmful
actions (crime, suicide, drug use), or provoke "heated" discussions. Such messages
are often related to particular sensitive topics, e.g. politics, sexual
minorities, or social injustice, which yield toxic emotional reactions more often
than other topics, e.g. cars or computing. At the same time, clearly not all
messages within such flammable topics are inappropriate.
Towards this end, in this work, we present two text collections labelled
according to a binary notion of inappropriateness and a multinomial notion of
sensitive topic. Assuming that the notion of inappropriateness is common among
people of the same culture, we base our approach on the human intuitive
understanding of what is not acceptable and harmful. To objectivise the notion
of inappropriateness, we define it in a data-driven way through crowdsourcing.
Namely, we run a large-scale annotation study asking workers whether a given
chatbot statement could harm the reputation of the company that created it.
Acceptably high values of inter-annotator agreement suggest that the notion of
inappropriateness exists and can be uniformly understood by different people.
To define the notion of sensitive topics in an objective way, we rely on
guidelines from the legal and PR departments of a large public company, which
list the topics regarded as potentially harmful.
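
The paper's methodological claim rests on "acceptably high values of inter-annotator agreement" among crowd workers. As a rough illustration of how such agreement can be quantified, the sketch below computes Fleiss' kappa over binary inappropriateness labels. Note this is only an assumed, minimal reconstruction of the idea: the ratings matrix is invented toy data, and the paper does not state which agreement coefficient it used.

```python
# Minimal sketch: chance-corrected inter-annotator agreement (Fleiss' kappa)
# over crowdsourced binary "inappropriate / appropriate" labels.
# The ratings below are illustrative toy data, not the paper's annotations.
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """Fleiss' kappa for an (n_items, n_categories) matrix, where
    counts[i, j] is the number of annotators who assigned category j
    to item i. Assumes every item received the same number of ratings."""
    n_raters = counts[0].sum()
    # Per-item agreement: fraction of annotator pairs that agree on the item.
    p_i = (counts * (counts - 1)).sum(axis=1) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()                        # mean observed agreement
    p_j = counts.sum(axis=0) / counts.sum()   # category marginals
    p_e = (p_j ** 2).sum()                    # agreement expected by chance
    return (p_bar - p_e) / (1 - p_e)

# Toy example: 5 statements, 3 workers each;
# column 0 = "appropriate", column 1 = "inappropriate".
ratings = np.array([
    [3, 0],  # all workers: appropriate
    [0, 3],  # all workers: inappropriate
    [2, 1],
    [3, 0],
    [1, 2],
])
print(f"Fleiss' kappa = {fleiss_kappa(ratings):.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance: values near 0 indicate chance-level consensus, while values approaching 1 indicate that annotators share a uniform understanding of inappropriateness, which is what the abstract argues.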
Related papers
- Analyzing Toxicity in Deep Conversations: A Reddit Case Study [0.0]
This work employs a tree-based approach to understand how users behave concerning toxicity in public conversation settings.
We collect both the posts and the comment sections of the top 100 posts from 8 Reddit communities that allow profanity, totaling over 1 million responses.
We find that toxic comments increase the likelihood of subsequent toxic comments being produced in online conversations.
arXiv Detail & Related papers (2024-04-11T16:10:44Z) - Subjective $\textit{Isms}$? On the Danger of Conflating Hate and Offence
in Abusive Language Detection [5.351398116822836]
We argue that the conflation of hate and offence can invalidate findings on hate speech.
We call for future work to be situated in theory, disentangling hate from the related but distinct concept of offence.
arXiv Detail & Related papers (2024-03-04T17:56:28Z) - Analyzing Norm Violations in Live-Stream Chat [49.120561596550395]
We present the first NLP study dedicated to detecting norm violations in conversations on live-streaming platforms.
We define norm violation categories in live-stream chats and annotate 4,583 moderated comments from Twitch.
Our results show that appropriate contextual information can boost moderation performance by 35%.
arXiv Detail & Related papers (2023-05-18T05:58:27Z) - Classification of social media Toxic comments using Machine learning
models [0.0]
The abstract outlines the problem of toxic comments on social media platforms, where individuals use disrespectful, abusive, and unreasonable language.
This behavior, referred to as anti-social behavior, occurs in online debates, comments, and fights.
The comments containing explicit language can be classified into various categories, such as toxic, severe toxic, obscene, threat, insult, and identity hate.
To protect users from offensive language, companies have started flagging comments and blocking users.
arXiv Detail & Related papers (2023-04-14T05:40:11Z) - CoSyn: Detecting Implicit Hate Speech in Online Conversations Using a
Context Synergized Hyperbolic Network [52.85130555886915]
CoSyn is a context-synergized neural network that explicitly incorporates user- and conversational context for detecting implicit hate speech in online conversations.
We show that CoSyn outperforms all our baselines in detecting implicit hate speech with absolute improvements in the range of 1.24% - 57.8%.
arXiv Detail & Related papers (2023-03-02T17:30:43Z) - Is the Elephant Flying? Resolving Ambiguities in Text-to-Image
Generative Models [64.58271886337826]
We study ambiguities that arise in text-to-image generative models.
We propose a framework to mitigate ambiguities in the prompts given to the systems by soliciting clarifications from the user.
arXiv Detail & Related papers (2022-11-17T17:12:43Z) - Enriching Abusive Language Detection with Community Context [0.3708656266586145]
Use of pejorative expressions can be benign or actively empowering.
Models for abuse detection misclassify these expressions as derogatory and inadvertently censor productive conversations held by marginalized groups.
Our paper highlights how community context can improve classification outcomes in abusive language detection.
arXiv Detail & Related papers (2022-06-16T20:54:02Z) - Annotators with Attitudes: How Annotator Beliefs And Identities Bias
Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z) - Detecting Inappropriate Messages on Sensitive Topics that Could Harm a
Company's Reputation [64.22895450493729]
A calm discussion of turtles or fishing fuels inappropriate toxic dialogues less often than a discussion of politics or sexual minorities.
We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness.
arXiv Detail & Related papers (2021-03-09T10:50:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.