Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation
- URL: http://arxiv.org/abs/2205.00501v1
- Date: Sun, 1 May 2022 16:08:48 GMT
- Title: Is Your Toxicity My Toxicity? Exploring the Impact of Rater Identity on Toxicity Annotation
- Authors: Nitesh Goyal, Ian Kivlichan, Rachel Rosen, Lucy Vasserman
- Abstract summary: We study how raters' self-described identities impact how they annotate toxicity in online comments.
We found that rater identity is a statistically significant factor in how raters will annotate toxicity for identity-related annotations.
We trained models on the annotations from each of the different rater pools, and compared the scores of these models on comments from several test sets.
- Score: 1.1699472346137738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning models are commonly used to detect toxicity in online
conversations. These models are trained on datasets annotated by human raters.
We explore how raters' self-described identities impact how they annotate
toxicity in online comments. We first define the concept of specialized rater
pools: rater pools formed based on raters' self-described identities, rather
than at random. We formed three such rater pools for this study--specialized
rater pools of raters from the U.S. who identify as African American, LGBTQ,
and those who identify as neither. Each of these rater pools annotated the same
set of comments, which contains many references to these identity groups. We
found that rater identity is a statistically significant factor in how raters
will annotate toxicity for identity-related annotations. Using preliminary
content analysis, we examined the comments with the most disagreement between
rater pools and found nuanced differences in the toxicity annotations. Next, we
trained models on the annotations from each of the different rater pools, and
compared the scores of these models on comments from several test sets.
Finally, we discuss how using raters that self-identify with the subjects of
comments can create more inclusive machine learning models, and provide more
nuanced ratings than those by random raters.
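To make the methodology concrete, here is a minimal, hypothetical sketch of the two analyses outlined in the abstract: a test of whether the distribution of toxic/non-toxic labels differs across rater pools, followed by training one simple model per pool and comparing their scores on the same comments. The comments, labels, chi-squared test, and TF-IDF/logistic-regression models below are illustrative stand-ins, not the authors' actual data or pipeline.

```python
# Hypothetical sketch of the rater-pool comparison described above.
# Toy data and simple scikit-learn/scipy components stand in for the
# paper's actual dataset, annotation platform, and models.
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Each rater pool annotates the same set of comments (as in the study design);
# the comments and labels here are invented placeholders.
comments = [
    "you people are ruining this neighborhood",
    "thanks for sharing your perspective",
    "that community should not be allowed here",
    "i disagree with this policy proposal",
]
labels_by_pool = {
    "African American": [1, 0, 1, 0],
    "LGBTQ":            [1, 0, 1, 1],
    "Control":          [0, 0, 1, 0],
}
annotations = pd.DataFrame(
    [
        {"comment": c, "rater_pool": pool, "is_toxic": y}
        for pool, labels in labels_by_pool.items()
        for c, y in zip(comments, labels)
    ]
)

# 1) Is rater pool associated with the toxic / not-toxic annotation?
contingency = pd.crosstab(annotations["rater_pool"], annotations["is_toxic"])
chi2, p_value, _, _ = chi2_contingency(contingency)
print(f"chi2={chi2:.2f}, p={p_value:.3f}")

# 2) Train one model per rater pool and compare their scores on new comments.
test_comments = ["another invented comment", "yet another invented comment"]
for pool, group in annotations.groupby("rater_pool"):
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(group["comment"], group["is_toxic"])
    scores = model.predict_proba(test_comments)[:, 1]
    print(pool, np.round(scores, 3))
```

The study itself works with far larger annotation sets and production-scale toxicity models; the sketch only shows where a pool-level disagreement signal and per-pool model scores would come from.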
Related papers
- Unveiling Social Media Comments with a Novel Named Entity Recognition System for Identity Groups [2.5849042763002426]
We develop a Named Entity Recognition (NER) System for Identity Groups.
Our tool not only detects whether a sentence contains an attack but also tags the sentence tokens corresponding to the mentioned group.
We tested the utility of our tool in a case study on social media, annotating and comparing comments from Facebook related to news mentioning identity groups.
arXiv Detail & Related papers (2024-05-13T19:33:18Z)
- Modeling subjectivity (by Mimicking Annotator Annotation) in toxic comment identification across diverse communities [3.0284081180864675]
This study aims to identify intuitive variances from annotator disagreement using quantitative analysis.
We also evaluate the model's ability to mimic diverse viewpoints on toxicity by varying the size of the training data.
We conclude that subjectivity is evident across all annotator groups, demonstrating the shortcomings of majority-rule voting.
arXiv Detail & Related papers (2023-11-01T00:17:11Z)
- Privacy Assessment on Reconstructed Images: Are Existing Evaluation Metrics Faithful to Human Perception? [86.58989831070426]
We study the faithfulness of hand-crafted metrics to human perception of privacy information from reconstructed images.
We propose a learning-based measure called SemSim to evaluate the Semantic Similarity between the original and reconstructed images.
arXiv Detail & Related papers (2023-09-22T17:58:04Z)
- Gender Biases in Automatic Evaluation Metrics for Image Captioning [87.15170977240643]
We conduct a systematic study of gender biases in model-based evaluation metrics for image captioning tasks.
We demonstrate the negative consequences of using these biased metrics, including the inability to differentiate between biased and unbiased generations.
We present a simple and effective way to mitigate the metric bias without hurting the correlations with human judgments.
arXiv Detail & Related papers (2023-05-24T04:27:40Z)
- Toxic Comments Hunter : Score Severity of Toxic Comments [0.0]
In this experiment, we collect various data sets related to toxic comments.
Because of the characteristics of comment data, we perform data cleaning and feature extraction operations on it.
In terms of model construction, we used the training set to train TF-IDF-based models and to fine-tune a BERT model (a hypothetical sketch of this kind of pipeline appears after this list).
arXiv Detail & Related papers (2022-02-15T07:35:52Z)
- Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection [75.54119209776894]
We investigate the effect of annotator identities (who) and beliefs (why) on toxic language annotations.
We consider posts with three characteristics: anti-Black language, African American English dialect, and vulgarity.
Our results show strong associations between annotator identity and beliefs and their ratings of toxicity.
arXiv Detail & Related papers (2021-11-15T18:58:20Z)
- SS-BERT: Mitigating Identity Terms Bias in Toxic Comment Classification by Utilising the Notion of "Subjectivity" and "Identity Terms" [6.2384249607204]
We propose a novel approach to tackle such bias in toxic comment classification.
We hypothesize that when a comment is made about a group of people that is characterized by an identity term, the likelihood of that comment being toxic is associated with the subjectivity level of the comment.
arXiv Detail & Related papers (2021-09-06T18:40:06Z)
- Mitigating Biases in Toxic Language Detection through Invariant Rationalization [70.36701068616367]
Biases toward some attributes, including gender, race, and dialect, exist in most training datasets for toxicity detection.
We propose to use invariant rationalization (InvRat), a game-theoretic framework consisting of a rationale generator and a predictor, to rule out the spurious correlation of certain syntactic patterns.
Our method yields lower false positive rate in both lexical and dialectal attributes than previous debiasing methods.
arXiv Detail & Related papers (2021-06-14T08:49:52Z)
- Fully Unsupervised Person Re-identification via Selective Contrastive Learning [58.5284246878277]
Person re-identification (ReID) aims to find images of the same person across cameras.
We propose a novel selective contrastive learning framework for unsupervised feature learning.
Experimental results demonstrate the superiority of our method in unsupervised person ReID compared with state-of-the-art approaches.
arXiv Detail & Related papers (2020-10-15T09:09:23Z)
- Learning Person Re-identification Models from Videos with Weak Supervision [53.53606308822736]
We introduce the problem of learning person re-identification models from videos with weak supervision.
We propose a multiple instance attention learning framework for person re-identification using such video-level labels.
The attention weights are obtained based on all person images instead of person tracklets in a video, making our learned model less affected by noisy annotations.
arXiv Detail & Related papers (2020-07-21T07:23:32Z)
- Reading Between the Demographic Lines: Resolving Sources of Bias in Toxicity Classifiers [0.0]
Perspective API is perhaps the most widely used toxicity classifier in industry.
Google's model tends to unfairly assign higher toxicity scores to comments containing words referring to the identities of commonly targeted groups.
We have constructed several toxicity classifiers with the intention of reducing unintended bias while maintaining strong classification performance.
arXiv Detail & Related papers (2020-06-29T21:40:55Z)
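The "Toxic Comments Hunter" entry above mentions training TF-IDF-based models and fine-tuning BERT to score the severity of toxic comments. Below is a minimal, hypothetical sketch of the BERT half of such a pipeline, framed as single-output regression; the checkpoint name, the two-example dataset, and the training settings are placeholder assumptions rather than that paper's actual configuration.

```python
# Hypothetical sketch: fine-tune a BERT model to regress a toxicity severity
# score in [0, 1]. The two-example dataset is an invented placeholder.
import torch
from torch.utils.data import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1, problem_type="regression")

class SeverityDataset(Dataset):
    """Pairs of (comment text, severity score in [0, 1])."""
    def __init__(self, texts, scores):
        self.encodings = tokenizer(texts, truncation=True, padding=True,
                                   max_length=128)
        self.scores = scores

    def __len__(self):
        return len(self.scores)

    def __getitem__(self, idx):
        item = {k: torch.tensor(v[idx]) for k, v in self.encodings.items()}
        item["labels"] = torch.tensor(self.scores[idx], dtype=torch.float)
        return item

train_dataset = SeverityDataset(
    ["an invented highly offensive comment", "an invented polite comment"],
    [0.9, 0.0],
)

training_args = TrainingArguments(output_dir="toxicity-severity",
                                  num_train_epochs=1,
                                  per_device_train_batch_size=2)
Trainer(model=model, args=training_args, train_dataset=train_dataset).train()
```

With num_labels=1 and problem_type="regression", the Hugging Face sequence-classification head is trained with a mean-squared-error loss against the continuous severity labels.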