Toxicity Detection: Does Context Really Matter?
- URL: http://arxiv.org/abs/2006.00998v1
- Date: Mon, 1 Jun 2020 15:03:48 GMT
- Title: Toxicity Detection: Does Context Really Matter?
- Authors: John Pavlopoulos and Jeffrey Sorensen and Lucas Dixon and Nithum Thain
and Ion Androutsopoulos
- Abstract summary: We find that context can amplify or mitigate the perceived toxicity of posts.
Surprisingly, we also find no evidence that context actually improves the performance of toxicity classifiers.
This points to the need for larger datasets of comments annotated in context.
- Score: 22.083682201142242
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Moderation is crucial to promoting healthy on-line discussions. Although
several 'toxicity' detection datasets and models have been published, most of
them ignore the context of the posts, implicitly assuming that comments may be
judged independently. We investigate this assumption by focusing on two
questions: (a) does context affect the human judgement, and (b) does
conditioning on context improve performance of toxicity detection systems? We
experiment with Wikipedia conversations, limiting the notion of context to the
previous post in the thread and the discussion title. We find that context can
both amplify and mitigate the perceived toxicity of posts. Moreover, a small but
significant subset of manually labeled posts (5% in one of our experiments) ends
up having the opposite toxicity labels if the annotators are not provided with
context. Surprisingly, we also find no evidence that context actually improves
the performance of toxicity classifiers, having tried a range of classifiers
and mechanisms to make them context-aware. This points to the need for larger
datasets of comments annotated in context. We make our code and data publicly
available.
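The context-aware setup described in the abstract can be approximated by giving the classifier the parent post (or discussion title) alongside the target comment. Below is a minimal sketch of one such mechanism, using a sentence-pair encoding with Hugging Face Transformers; the checkpoint name, maximum length, and label mapping are illustrative assumptions, not the authors' exact pipeline.

```python
# Minimal sketch: context-aware toxicity classification by encoding the
# parent post and the target comment as a sentence pair.
# Assumptions (not from the paper): model checkpoint, max length, label order.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # placeholder; any encoder-style checkpoint works

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

def toxicity_scores(parent_post: str, comment: str) -> torch.Tensor:
    """Score a comment conditioned on its context (the previous post or title)."""
    # The context goes in the first segment, the comment in the second,
    # so the encoder can attend across both when scoring the comment.
    inputs = tokenizer(
        parent_post,
        comment,
        truncation=True,
        max_length=256,
        return_tensors="pt",
    )
    with torch.no_grad():
        logits = model(**inputs).logits
    return torch.softmax(logits, dim=-1).squeeze(0)  # [p(non-toxic), p(toxic)]

# Example usage: an untrained classification head gives arbitrary scores,
# so fine-tune on in-context annotations before interpreting the output.
scores = toxicity_scores(
    "Discussion title: Edit war on the infobox",
    "That revert was completely unjustified.",
)
print(scores)
```

A context-unaware baseline is obtained by passing only the comment to the tokenizer; the paper's finding is that, with current datasets, adding the context segment does not measurably improve classifier performance.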
Related papers
- BiasX: "Thinking Slow" in Toxic Content Moderation with Explanations of
Implied Social Biases [28.519851740902258]
BiasX is a framework that enhances content moderation setups with free-text explanations of statements' implied social biases.
We show that participants substantially benefit from explanations for correctly identifying subtly (non-)toxic content.
Our results showcase the promise of using free-text explanations to encourage more thoughtful toxicity moderation.
arXiv Detail & Related papers (2023-05-23T01:45:18Z) - Constructing Highly Inductive Contexts for Dialogue Safety through
Controllable Reverse Generation [65.48908724440047]
We propose a method called reverse generation to construct adversarial contexts conditioned on a given response.
We test three popular pretrained dialogue models (Blender, DialoGPT, and Plato2) and find that BAD+ can largely expose their safety problems.
arXiv Detail & Related papers (2022-12-04T12:23:41Z) - Hate Speech and Counter Speech Detection: Conversational Context Does
Matter [7.333666276087548]
This paper investigates the role of conversational context in the annotation and detection of online hate and counter speech.
We created a context-aware dataset for a 3-way classification task on Reddit comments: hate speech, counter speech, or neutral.
arXiv Detail & Related papers (2022-06-13T19:05:44Z) - Beyond Plain Toxic: Detection of Inappropriate Statements on Flammable
Topics for the Russian Language [76.58220021791955]
We present two text collections labelled according to a binary notion of inappropriateness and a multinomial notion of sensitive topics.
To objectivise the notion of inappropriateness, we define it in a data-driven way through crowdsourcing.
arXiv Detail & Related papers (2022-03-04T15:59:06Z) - Toxicity Detection can be Sensitive to the Conversational Context [64.28043776806213]
We construct and publicly release a dataset of 10,000 posts with two kinds of toxicity labels.
We introduce a new task, context sensitivity estimation, which aims to identify posts whose perceived toxicity changes if the context is also considered.
arXiv Detail & Related papers (2021-11-19T13:57:26Z) - Detecting Inappropriate Messages on Sensitive Topics that Could Harm a
Company's Reputation [64.22895450493729]
A calm discussion of turtles or fishing is less likely to fuel inappropriate or toxic dialogue than a discussion of politics or sexual minorities.
We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness.
arXiv Detail & Related papers (2021-03-09T10:50:30Z) - Challenges in Automated Debiasing for Toxic Language Detection [81.04406231100323]
Biased associations have been a challenge in the development of classifiers for detecting toxic language.
We investigate recently introduced debiasing methods for text classification datasets and models, as applied to toxic language detection.
Our focus is on lexical markers (e.g., swear words, slurs, identity mentions) and dialectal markers (specifically African American English).
arXiv Detail & Related papers (2021-01-29T22:03:17Z) - Reading Between the Demographic Lines: Resolving Sources of Bias in
Toxicity Classifiers [0.0]
Perspective API is perhaps the most widely used toxicity classifier in industry.
Google's model tends to unfairly assign higher toxicity scores to comments containing words referring to the identities of commonly targeted groups.
We have constructed several toxicity classifiers with the intention of reducing unintended bias while maintaining strong classification performance.
arXiv Detail & Related papers (2020-06-29T21:40:55Z) - Don't Judge an Object by Its Context: Learning to Overcome Contextual
Bias [113.44471186752018]
Existing models often leverage co-occurrences between objects and their context to improve recognition accuracy.
This work focuses on addressing such contextual biases to improve the robustness of the learnt feature representations.
arXiv Detail & Related papers (2020-01-09T18:31:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.