A Collaborative Content Moderation Framework for Toxicity Detection based on Conformalized Estimates of Annotation Disagreement
- URL: http://arxiv.org/abs/2411.04090v2
- Date: Thu, 07 Nov 2024 07:12:45 GMT
- Title: A Collaborative Content Moderation Framework for Toxicity Detection based on Conformalized Estimates of Annotation Disagreement
- Authors: Guillermo Villate-Castillo, Javier Del Ser, Borja Sanz
- Abstract summary: We introduce a novel content moderation framework that emphasizes the importance of capturing annotation disagreement. We leverage uncertainty estimation techniques, specifically Conformal Prediction, to account for both the ambiguity in comment annotations and the model's inherent uncertainty in predicting toxicity and disagreement.
- Score: 7.345136916791223
- Abstract: Content moderation typically combines the efforts of human moderators and machine learning models. However, these systems often rely on data where significant disagreement occurs during moderation, reflecting the subjective nature of toxicity perception. Rather than dismissing this disagreement as noise, we interpret it as a valuable signal that highlights the inherent ambiguity of the content, an insight missed when only the majority label is considered. In this work, we introduce a novel content moderation framework that emphasizes the importance of capturing annotation disagreement. Our approach uses multitask learning, where toxicity classification serves as the primary task and annotation disagreement is addressed as an auxiliary task. Additionally, we leverage uncertainty estimation techniques, specifically Conformal Prediction, to account for both the ambiguity in comment annotations and the model's inherent uncertainty in predicting toxicity and disagreement. The framework also allows moderators to adjust thresholds for annotation disagreement, offering flexibility in determining when ambiguity should trigger a review. We demonstrate that our joint approach enhances model performance, calibration, and uncertainty estimation, while offering greater parameter efficiency and improving the review process in comparison to single-task methods.
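The abstract names two concrete mechanisms: a shared model with a primary toxicity head and an auxiliary disagreement head, and Conformal Prediction to calibrate when ambiguity should trigger human review. Below is a minimal sketch of both, assuming a fixed-size encoder output; the class and function names are illustrative, not the authors' code.

```python
# Minimal sketch (illustrative names, not the authors' code): a shared
# trunk feeds a primary toxicity head and an auxiliary disagreement head;
# split conformal prediction converts held-out residuals into a
# distribution-free margin that moderators can tune via alpha.
import numpy as np
import torch
import torch.nn as nn

class MultitaskToxicityModel(nn.Module):
    def __init__(self, encoder_dim=768, hidden=256):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(encoder_dim, hidden), nn.ReLU())
        self.toxicity_head = nn.Linear(hidden, 2)      # primary task: toxic vs. non-toxic
        self.disagreement_head = nn.Linear(hidden, 1)  # auxiliary task: annotator disagreement rate

    def forward(self, features):
        h = self.shared(features)
        toxicity_logits = self.toxicity_head(h)
        disagreement = torch.sigmoid(self.disagreement_head(h)).squeeze(-1)
        return toxicity_logits, disagreement

def conformal_margin(calibration_residuals, alpha=0.1):
    """Split conformal quantile of |predicted - observed| disagreement on a
    held-out calibration set; valid with probability >= 1 - alpha."""
    n = len(calibration_residuals)
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return float(np.quantile(calibration_residuals, level, method="higher"))

def needs_review(predicted_disagreement, margin, tau=0.3):
    # Flag for human review when the conformal upper bound on disagreement
    # crosses a moderator-chosen ambiguity threshold tau.
    return predicted_disagreement + margin >= tau
```

The two moderator-facing knobs, alpha (coverage level) and tau (ambiguity tolerance), correspond to the adjustable review thresholds described in the abstract.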
Related papers
- On Subjective Uncertainty Quantification and Calibration in Natural Language Generation [2.622066970118316]
Applications of large language models often involve the generation of free-form responses, in which case uncertainty quantification becomes challenging.
This work addresses these challenges from a perspective of Bayesian decision theory.
We discuss how this perspective enables principled quantification of the model's subjective uncertainty and its calibration.
The proposed methods can be applied to black-box language models.
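The summary gives no implementation detail, but since the methods are said to apply to black-box models, here is one generic, sampling-based illustration (an assumption on our part, not necessarily the paper's estimator) of scoring a model's subjective uncertainty:

```python
# Hedged illustration, not the paper's method: treat repeated samples from
# a black-box LM as draws from its predictive distribution and report the
# entropy of the empirical answer distribution as an uncertainty score.
# `generate` is a hypothetical callable wrapping any LM API.
from collections import Counter
from math import log

def subjective_uncertainty(generate, prompt, n_samples=20):
    answers = [generate(prompt) for _ in range(n_samples)]
    counts = Counter(answers)
    probs = [c / n_samples for c in counts.values()]
    return -sum(p * log(p) for p in probs)  # 0 = the model always answers the same way
```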
arXiv Detail & Related papers (2024-06-07T18:54:40Z)
- Interpretable Automatic Fine-grained Inconsistency Detection in Text Summarization [56.94741578760294]
We propose the task of fine-grained inconsistency detection, the goal of which is to predict the fine-grained types of factual errors in a summary.
Motivated by how humans inspect factual inconsistency in summaries, we propose an interpretable fine-grained inconsistency detection model, FineGrainFact.
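The task framing, predicting fine-grained error types rather than a single consistency label, amounts to multi-label classification. A minimal sketch follows, with an error taxonomy assumed purely for illustration; the FineGrainFact architecture itself is not described here.

```python
# Sketch of the task framing only (not the FineGrainFact model): score a
# (document, summary) pair against an assumed taxonomy of fine-grained
# factual error types via a multi-label head.
import torch
import torch.nn as nn

ERROR_TYPES = ["entity", "predicate", "circumstance", "out-of-article"]  # assumed taxonomy

class FineGrainedInconsistencyHead(nn.Module):
    def __init__(self, pair_dim=768):
        super().__init__()
        self.classifier = nn.Linear(pair_dim, len(ERROR_TYPES))

    def forward(self, pair_embedding):
        # Independent sigmoids: a summary can exhibit several error types at once.
        return torch.sigmoid(self.classifier(pair_embedding))
```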
arXiv Detail & Related papers (2023-05-23T22:11:47Z)
- Uncertain Facial Expression Recognition via Multi-task Assisted Correction [43.02119884581332]
We propose MTAC, a novel multi-task assisted correction method for uncertain facial expression recognition.
Specifically, a confidence estimation block and a weighted regularization module are applied to highlight solid samples and suppress uncertain samples in every batch.
Experiments on RAF-DB, AffectNet, and AffWild2 datasets demonstrate that the MTAC obtains substantial improvements over baselines when facing synthetic and real uncertainties.
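A minimal sketch of the confidence-weighting idea named in the summary (illustrative, not the MTAC code): a per-sample confidence score scales the loss so uncertain samples contribute less, with a regularizer that keeps confidences from collapsing to zero.

```python
# Sketch of confidence-weighted training (illustrative, not the MTAC code):
# `confidence` in (0, 1) comes from a learned confidence-estimation block;
# solid samples are highlighted, uncertain samples suppressed per batch.
import torch
import torch.nn.functional as F

def confidence_weighted_loss(logits, targets, confidence, lam=0.1):
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    weighted = (confidence * per_sample).mean()
    # Penalty grows as confidence -> 0, so the model cannot simply
    # zero out every sample's weight to minimize the loss.
    return weighted - lam * torch.log(confidence + 1e-8).mean()
```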
arXiv Detail & Related papers (2022-12-14T10:28:08Z)
- The Implicit Delta Method [61.36121543728134]
In this paper, we propose an alternative, the implicit delta method, which works by infinitesimally regularizing the training loss to quantify uncertainty.
We show that the change in the evaluation due to regularization is consistent for the variance of the evaluation estimator, even when the infinitesimal change is approximated by a finite difference.
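A toy numerical illustration of the finite-difference claim (our own construction, far simpler than the paper's estimator): regularize a quadratic training loss by epsilon times the evaluation functional and check that a finite difference recovers the analytic sensitivity.

```python
# Toy illustration (not the paper's estimator): training loss
# L(theta) = mean((theta - x_i)^2), evaluation psi(theta) = theta.
# Adding eps * psi shifts the minimizer to mean(x) - eps/2, so the
# sensitivity d psi / d eps is -1/2; a finite difference recovers it.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)

def argmin_regularized(eps):
    # Closed-form minimizer of mean((t - x_i)^2) + eps * t.
    return x.mean() - eps / 2.0

eps = 1e-4
finite_diff = (argmin_regularized(eps) - argmin_regularized(0.0)) / eps
print(finite_diff)  # approx -0.5, matching the analytic derivative
```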
arXiv Detail & Related papers (2022-11-11T19:34:17Z)
- Fairness and robustness in anti-causal prediction [73.693135253335]
Robustness to distribution shift and fairness have independently emerged as two important desiderata required of machine learning models.
While these two desiderata seem related, the connection between them is often unclear in practice.
By taking an anti-causal perspective on prediction, we draw explicit connections between a common fairness criterion, separation, and a common notion of robustness.
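Separation requires the prediction to be independent of the sensitive attribute conditional on the true label; for binary classifiers this reduces to equal group-wise true- and false-positive rates. A minimal sketch of measuring its violation (an illustrative helper, not the paper's code):

```python
# Sketch: measure violation of separation (equalized odds) as the largest
# gap in group-conditional positive-prediction rates. Conditioning on
# y_true = 1 compares TPRs across groups; y_true = 0 compares FPRs.
# Zero gaps mean separation holds exactly.
import numpy as np

def separation_gap(y_true, y_pred, group):
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    gaps = []
    for y in (0, 1):  # condition on the true label
        rates = [y_pred[(y_true == y) & (group == g)].mean()
                 for g in np.unique(group)]
        gaps.append(max(rates) - min(rates))
    return max(gaps)
```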
arXiv Detail & Related papers (2022-09-20T02:41:17Z)
- Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations [6.546195629698355]
We investigate the efficacy of multi-annotator models for subjective tasks.
We show that this approach yields the same or better performance than aggregating labels in the data prior to training.
Our approach also provides a way to estimate uncertainty in predictions, which we demonstrate correlates with annotation disagreement better than traditional methods do.
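A minimal sketch of the multi-annotator idea (illustrative names, not the paper's code): one output head per annotator instead of a single majority-vote head, with the spread across heads serving as the uncertainty estimate.

```python
# Sketch (illustrative, not the paper's code): predict each annotator's
# label separately; disagreement among the heads doubles as an
# uncertainty estimate for the aggregate prediction.
import torch
import torch.nn as nn

class MultiAnnotatorModel(nn.Module):
    def __init__(self, encoder_dim=768, n_annotators=5):
        super().__init__()
        self.heads = nn.ModuleList(
            nn.Linear(encoder_dim, 2) for _ in range(n_annotators))

    def forward(self, features):
        probs = torch.stack([head(features).softmax(-1) for head in self.heads])
        mean = probs.mean(0)                # aggregate prediction over heads
        uncertainty = probs[..., 1].var(0)  # spread across annotator heads
        return mean, uncertainty
```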
arXiv Detail & Related papers (2021-10-12T03:12:34Z)
- Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition [59.52434325897716]
We propose a solution, named DMUE, to address the problem of annotation ambiguity from two perspectives.
For the former, an auxiliary multi-branch learning framework is introduced to better mine and describe the latent distribution in the label space.
For the latter, the pairwise relationships of semantic features between instances are fully exploited to estimate the ambiguity extent in the instance space.
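One simple reading of the pairwise component (an illustration only, not the DMUE algorithm) scores an instance as ambiguous when its nearest neighbors in semantic-feature space carry conflicting labels:

```python
# Illustration only (not the DMUE algorithm): estimate per-instance
# ambiguity as the fraction of nearest feature-space neighbors whose
# labels disagree with the instance's own label.
import torch

def pairwise_ambiguity(features, labels, k=10):
    feats = torch.nn.functional.normalize(features, dim=1)
    sim = feats @ feats.T
    sim.fill_diagonal_(float("-inf"))       # exclude self-matches
    neighbors = sim.topk(k, dim=1).indices  # k most similar instances
    disagree = labels[neighbors] != labels.unsqueeze(1)
    return disagree.float().mean(dim=1)     # 0 = unambiguous, 1 = fully contested
```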
arXiv Detail & Related papers (2021-04-01T03:21:57Z)
- Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
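A minimal sketch of credibility-weighted pseudo-label training (illustrative; the paper's own credibility estimate may differ): down-weight samples where two views of the model, here a student and a mean teacher, disagree.

```python
# Sketch (illustrative, not the paper's code): weight each pseudo-labeled
# sample by a credibility score derived from the KL divergence between
# student and mean-teacher predictions; noisy pseudo-labels get low weight.
import torch
import torch.nn.functional as F

def credibility_weighted_loss(student_logits, teacher_logits, pseudo_labels):
    log_p_student = F.log_softmax(student_logits, dim=1)
    p_teacher = F.softmax(teacher_logits, dim=1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="none").sum(dim=1)
    credibility = torch.exp(-kl)  # 1 = the two views fully agree
    per_sample = F.cross_entropy(student_logits, pseudo_labels, reduction="none")
    # Detach so the model cannot lower the loss by gaming its own weights.
    return (credibility.detach() * per_sample).mean()
```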
arXiv Detail & Related papers (2020-12-16T04:09:04Z)
- Uncertainty as a Form of Transparency: Measuring, Communicating, and Using Uncertainty [66.17147341354577]
We argue for considering a complementary form of transparency by estimating and communicating the uncertainty associated with model predictions.
We describe how uncertainty can be used to mitigate model unfairness, augment decision-making, and build trustworthy systems.
This work constitutes an interdisciplinary review drawn from literature spanning machine learning, visualization/HCI, design, decision-making, and fairness.
arXiv Detail & Related papers (2020-11-15T17:26:14Z)