Quantifying and Attributing Polarization to Annotator Groups
- URL: http://arxiv.org/abs/2602.06055v1
- Date: Fri, 16 Jan 2026 12:32:12 GMT
- Title: Quantifying and Attributing Polarization to Annotator Groups
- Authors: Dimitris Tsirmpas, John Pavlopoulos
- Abstract summary: Polarization is strongly and persistently attributed to annotator race, especially on the hate speech task. Less educated annotators are more subjective, while educated ones tend to broadly agree more among themselves.
- Score: 6.194291632696817
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Current annotation agreement metrics are not well suited for inter-group analysis, are sensitive to group-size imbalances, and are restricted to single-annotation settings. These restrictions render them insufficient for many subjective tasks such as toxicity and hate-speech detection. For this reason, we introduce a quantifiable metric, paired with a statistical significance test, that attributes polarization to various annotator groups. Our metric enables direct comparisons between heavily imbalanced sociodemographic and ideological subgroups across different datasets and tasks, while also enabling analysis in multi-label settings. We apply this metric to three datasets on hate speech and one on toxicity detection, discovering that: (1) Polarization is strongly and persistently attributed to annotator race, especially on the hate speech task. (2) Religious annotators do not fundamentally disagree with each other, but do with other annotators, a trend that is gradually diminished and then reversed with irreligious annotators. (3) Less educated annotators are more subjective, while educated ones tend to broadly agree more among themselves. Overall, our results reflect current findings on annotation patterns for various subgroups. Finally, we estimate the minimum number of annotators needed to obtain robust results, and provide an open-source Python library that implements our metric.
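The abstract leaves the metric's exact form to the paper and its accompanying library. As a rough illustration of the general recipe, comparing between-group to within-group disagreement and permutation-testing the group labels for significance, a minimal sketch might look like this (the function names and the statistic itself are assumptions, not the authors' definition):

```python
import numpy as np

def _mean_abs_diff(x: np.ndarray, y: np.ndarray | None = None) -> float:
    """Mean |difference| over cross pairs (x vs y) or distinct pairs within x."""
    if y is None:  # within-group: distinct unordered pairs only
        if len(x) < 2:
            return 0.0
        iu = np.triu_indices(len(x), k=1)
        return float(np.abs(x[:, None] - x[None, :])[iu].mean())
    return float(np.abs(x[:, None] - y[None, :]).mean())

def polarization(a: np.ndarray, b: np.ndarray) -> float:
    """Between-group minus average within-group disagreement for one item.

    `a` and `b` hold the two groups' ratings of the same item; positive
    values mean the groups disagree more with each other than internally.
    """
    return _mean_abs_diff(a, b) - 0.5 * (_mean_abs_diff(a) + _mean_abs_diff(b))

def permutation_pvalue(a: np.ndarray, b: np.ndarray,
                       n_perm: int = 10_000, seed: int = 0) -> float:
    """P-value of the observed polarization under random group assignment."""
    rng = np.random.default_rng(seed)
    observed = polarization(a, b)
    pooled = np.concatenate([a, b])
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        hits += polarization(perm[:len(a)], perm[len(a):]) >= observed
    return (hits + 1) / (n_perm + 1)
```

Because both the within- and between-group terms are pairwise means, the statistic does not grow with group size, which is one simple way a metric can stay comparable under the heavy group imbalances the abstract mentions.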
Related papers
- Towards Generalizable Generic Harmful Speech Datasets for Implicit Hate Speech Detection [7.762212551172391]
Implicit hate speech has emerged as a critical challenge for social media platforms. We propose an approach to address the detection of implicit hate speech and enhance generalizability across diverse datasets.
arXiv Detail & Related papers (2025-06-19T17:23:08Z)
- Mitigating Subgroup Disparities in Multi-Label Speech Emotion Recognition: A Pseudo-Labeling and Unsupervised Learning Approach [53.824673312331626]
The Implicit Demography Inference (IDI) module uses k-means clustering to mitigate bias in Speech Emotion Recognition (SER). Experiments show that pseudo-labeling IDI reduces subgroup disparities, improving fairness metrics by over 28%. Unsupervised IDI yields more than a 4.6% improvement in fairness metrics with a drop of less than 3.6% in SER performance.
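A hedged sketch of the pseudo-labeling idea described above, assuming k-means over utterance embeddings stands in for the missing demographic labels (the paper's actual IDI module may differ):

```python
import numpy as np
from sklearn.cluster import KMeans

def infer_pseudo_groups(embeddings: np.ndarray, n_groups: int = 2) -> np.ndarray:
    """Assign each utterance embedding to an inferred demographic cluster."""
    km = KMeans(n_clusters=n_groups, n_init=10, random_state=0)
    return km.fit_predict(embeddings)

def group_balanced_weights(groups: np.ndarray) -> np.ndarray:
    """Inverse-frequency sample weights so each inferred subgroup counts equally."""
    counts = np.bincount(groups)
    return 1.0 / counts[groups]
```

The cluster ids can then drive standard fairness interventions such as group-balanced sampling or per-group evaluation, even when true demographic labels are unavailable.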
arXiv Detail & Related papers (2025-05-20T14:50:44Z)
- How many classifiers do we need? [50.69951049206484]
We provide a detailed analysis of how the disagreement and the polarization among classifiers relate to the performance gain achieved by aggregating individual classifiers.
We prove results on how the disagreement behaves as the number of classifiers grows.
Our theories and claims are supported by empirical results on several image classification tasks with various types of neural networks.
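As a concrete illustration of the quantities discussed, here is a small sketch that measures pairwise disagreement among classifiers and the accuracy gain from majority-vote aggregation (illustrative definitions, not the paper's exact ones):

```python
import numpy as np

def pairwise_disagreement(preds: np.ndarray) -> float:
    """preds: (n_classifiers, n_samples) integer label predictions.
    Average fraction of samples on which a pair of classifiers disagrees."""
    m = preds.shape[0]
    total, pairs = 0.0, 0
    for i in range(m):
        for j in range(i + 1, m):
            total += np.mean(preds[i] != preds[j])
            pairs += 1
    return total / pairs

def majority_vote_gain(preds: np.ndarray, y: np.ndarray) -> float:
    """Accuracy of the majority vote minus the mean individual accuracy."""
    # Majority label per sample (ties broken toward the smaller label id).
    maj = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, preds)
    return float(np.mean(maj == y) - np.mean(preds == y))
```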
arXiv Detail & Related papers (2024-11-01T02:59:56Z) - Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks [9.110872603799839]
Supervised classification heavily depends on datasets annotated by humans.
In subjective tasks such as toxicity classification, these annotations often exhibit low agreement among raters.
In this work, we propose Annotator Aware Representations for Texts (AART) for subjective classification tasks.
arXiv Detail & Related papers (2023-11-16T10:18:32Z) - ACTOR: Active Learning with Annotator-specific Classification Heads to
Embrace Human Label Variation [35.10805667891489]
Active learning, as an annotation cost-saving strategy, has not been fully explored in the context of learning from disagreement.
We show that in the active learning setting, a multi-head model performs significantly better than a single-head model in terms of uncertainty estimation.
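A minimal sketch of the multi-head idea, assuming one linear head per annotator on top of a shared encoder and head variance as the uncertainty signal for active learning (the paper's ACTOR architecture may differ):

```python
import torch
import torch.nn as nn

class MultiHeadClassifier(nn.Module):
    def __init__(self, encoder: nn.Module, hidden: int,
                 n_annotators: int, n_classes: int):
        super().__init__()
        self.encoder = encoder  # any text encoder producing (batch, hidden)
        self.heads = nn.ModuleList(
            nn.Linear(hidden, n_classes) for _ in range(n_annotators)
        )

    def forward(self, x):
        h = self.encoder(x)
        # One logit tensor per annotator head: (n_annotators, batch, n_classes)
        return torch.stack([head(h) for head in self.heads])

def head_disagreement(logits: torch.Tensor) -> torch.Tensor:
    """Uncertainty = variance of head probabilities, averaged over classes."""
    probs = logits.softmax(dim=-1)        # (heads, batch, classes)
    return probs.var(dim=0).mean(dim=-1)  # (batch,)
```

Items with high head disagreement are natural candidates to query next in the active-learning loop.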
arXiv Detail & Related papers (2023-10-23T14:26:43Z) - Using Natural Language Explanations to Rescale Human Judgments [81.66697572357477]
We propose a method to rescale ordinal annotations and explanations using large language models (LLMs). We feed annotators' Likert ratings and corresponding explanations into an LLM and prompt it to produce a numeric score anchored in a scoring rubric. Our method rescales the raw judgments without impacting agreement and brings the scores closer to human judgments grounded in the same scoring rubric.
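A sketch of the rescaling step, with a hypothetical `call_llm` standing in for whatever LLM client is used; the rubric text is invented for illustration:

```python
# Illustrative rubric; the real rubric is task-specific.
RUBRIC = """Score 0-100 for coherence:
90-100: fully coherent; 70-89: minor lapses; 40-69: noticeable gaps;
0-39: largely incoherent."""

def rescale(likert: int, explanation: str, call_llm) -> float:
    """Rescale one Likert rating into a rubric-anchored numeric score.

    `call_llm` is a hypothetical callable mapping a prompt string to the
    model's text response.
    """
    prompt = (
        f"{RUBRIC}\n\n"
        f"An annotator gave a Likert rating of {likert}/5 and explained:\n"
        f"\"{explanation}\"\n\n"
        "Based on the rubric and the explanation, output only a numeric "
        "score from 0 to 100."
    )
    return float(call_llm(prompt).strip())
```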
arXiv Detail & Related papers (2023-05-24T06:19:14Z) - When the Majority is Wrong: Modeling Annotator Disagreement for Subjective Tasks [45.14664901245331]
A crucial problem in hate speech detection is determining whether a statement is offensive to a demographic group.
We construct a model that predicts individual annotator ratings on potentially offensive text.
We find that annotator ratings can be predicted using their demographic information and opinions on online content.
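A minimal sketch of the modeling setup this suggests: predict each (item, annotator) rating from text features concatenated with that annotator's demographic and opinion features. Feature and model choices here are assumptions for illustration:

```python
import numpy as np
from sklearn.linear_model import Ridge

def fit_annotator_model(text_feats: np.ndarray, annot_feats: np.ndarray,
                        ratings: np.ndarray) -> Ridge:
    """text_feats: (n, d_text); annot_feats: (n, d_annot); ratings: (n,).
    Each row pairs one item with one annotator and that annotator's rating."""
    X = np.hstack([text_feats, annot_feats])
    return Ridge(alpha=1.0).fit(X, ratings)

def predict_rating(model: Ridge, text_feat: np.ndarray,
                   annot_feat: np.ndarray) -> float:
    """Predicted rating for one (item, annotator) pair."""
    return float(model.predict(np.hstack([text_feat, annot_feat])[None, :])[0])
```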
arXiv Detail & Related papers (2023-05-11T07:55:20Z)
- Investigating User Radicalization: A Novel Dataset for Identifying Fine-Grained Temporal Shifts in Opinion [7.028604573959653]
We introduce an innovative annotated dataset for modeling subtle opinion fluctuations and detecting fine-grained stances.
The dataset includes a sufficient number of stance polarity and intensity labels per user over time and within entire conversational threads.
All posts are annotated by non-experts and a significant portion of the data is also annotated by experts.
arXiv Detail & Related papers (2022-04-16T09:31:25Z)
- Towards Group Robustness in the presence of Partial Group Labels [61.33713547766866]
Spurious correlations between input samples and target labels can wrongly direct neural network predictions.
We propose an algorithm that optimizes for the worst-off group assignments from a constraint set.
We show improvements in the minority group's performance while preserving overall aggregate accuracy across groups.
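The worst-off-group objective can be illustrated with a standard group-DRO-style multiplicative-weights step, upweighting whichever group currently has the highest loss; the paper's handling of partial group labels via a constraint set is more involved than this sketch:

```python
import numpy as np

def gdro_weights(group_losses: np.ndarray, eta: float = 0.1,
                 weights: np.ndarray | None = None) -> np.ndarray:
    """One multiplicative-weights step toward the worst-off group.

    group_losses: current average loss per group; higher loss -> more weight.
    """
    if weights is None:
        weights = np.full(len(group_losses), 1.0 / len(group_losses))
    w = weights * np.exp(eta * group_losses)  # exponentiated-gradient ascent
    return w / w.sum()
```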
arXiv Detail & Related papers (2022-01-10T22:04:48Z)
- Reducing Target Group Bias in Hate Speech Detectors [56.94616390740415]
We show that text classification models trained on large publicly available datasets may significantly underperform on several protected groups.
We propose to perform token-level hate sense disambiguation, and utilize tokens' hate sense representations for detection.
arXiv Detail & Related papers (2021-12-07T17:49:34Z)
- Constructing interval variables via faceted Rasch measurement and multitask deep learning: a hate speech application [63.10266319378212]
We propose a method for measuring complex variables on a continuous, interval spectrum by combining supervised deep learning with the Constructing Measures approach to faceted Rasch item response theory (IRT).
We demonstrate this new method on a dataset of 50,000 social media comments sourced from YouTube, Twitter, and Reddit and labeled by 11,000 U.S.-based Amazon Mechanical Turk workers.
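For reference, the core of a many-facet Rasch measurement model is a logistic item response function; a minimal sketch with illustrative parameter names (comment severity, item difficulty, rater harshness):

```python
import math

def rasch_prob(theta: float, beta: float, gamma: float) -> float:
    """P(rating = 1) under a faceted Rasch model: sigmoid(theta - beta - gamma).

    theta: latent severity of the comment being rated
    beta:  difficulty of the survey item
    gamma: harshness of the individual rater
    """
    return 1.0 / (1.0 + math.exp(-(theta - beta - gamma)))
```

The paper pairs a measurement model of this kind with multitask deep learning to place comments on a continuous interval scale.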
arXiv Detail & Related papers (2020-09-22T02:15:05Z)
- Contrastive Examples for Addressing the Tyranny of the Majority [83.93825214500131]
We propose to create a balanced training dataset, consisting of the original dataset plus new data points in which the group memberships are intervened.
We show that current generative adversarial networks are a powerful tool for learning these data points, called contrastive examples.
arXiv Detail & Related papers (2020-04-14T14:06:44Z)