Noise Audits Improve Moral Foundation Classification
- URL: http://arxiv.org/abs/2210.07415v1
- Date: Thu, 13 Oct 2022 23:37:47 GMT
- Title: Noise Audits Improve Moral Foundation Classification
- Authors: Negar Mokhberian, Frederic R. Hopp, Bahareh Harandizadeh, Fred Morstatter, Kristina Lerman
- Abstract summary: Morality plays an important role in culture, identity, and emotion.
Recent advances in natural language processing have shown that it is possible to classify moral values expressed in text at scale.
Morality classification relies on human annotators to label the moral expressions in text.
- Score: 5.7685650619372595
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Morality plays an important role in culture, identity, and emotion. Recent
advances in natural language processing have shown that it is possible to
classify moral values expressed in text at scale. Morality classification
relies on human annotators to label the moral expressions in text, which
provides training data to achieve state-of-the-art performance. However, these
annotations are inherently subjective and some of the instances are hard to
classify, resulting in noisy annotations due to error or lack of agreement. The
presence of noise in training data harms the classifier's ability to accurately
recognize moral foundations from text. We propose two metrics to audit the
noise of annotations. The first metric is entropy of instance labels, which is
a proxy measure of annotator disagreement about how the instance should be
labeled. The second metric is the silhouette coefficient of a label assigned by
an annotator to an instance. This metric leverages the idea that instances with
the same label should have similar latent representations, and deviations from
collective judgments are indicative of errors. Our experiments on three widely
used moral foundations datasets show that removing noisy annotations based on
the proposed metrics improves classification performance.
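Both audit metrics can be computed with standard tooling. The sketch below is a minimal reading of the abstract, assuming a flat list of (instance, label) annotations and precomputed instance embeddings from any text encoder; the cosine distance and the filtering cutoff are illustrative choices, not the paper's settings.

```python
# Sketch of the two annotation-noise audits described in the abstract.
# Assumes `annotations` is a list of (instance_id, label) pairs (one per
# annotator judgment) and `embeddings` maps instance_id -> latent vector
# from any sentence encoder; both are illustrative interfaces.
from collections import Counter, defaultdict

import numpy as np
from scipy.stats import entropy
from sklearn.metrics import silhouette_samples


def label_entropy(annotations):
    """Shannon entropy of each instance's label distribution (disagreement proxy)."""
    labels_per_instance = defaultdict(list)
    for instance_id, label in annotations:
        labels_per_instance[instance_id].append(label)
    scores = {}
    for instance_id, labels in labels_per_instance.items():
        counts = np.array(list(Counter(labels).values()), dtype=float)
        scores[instance_id] = entropy(counts / counts.sum())
    return scores


def annotation_silhouette(annotations, embeddings):
    """Silhouette coefficient of each (instance, label) annotation.

    Each annotation is a point (the instance embedding) assigned to the
    cluster of its label; low or negative values suggest the label deviates
    from the collective judgment for similar texts.
    """
    X = np.stack([embeddings[i] for i, _ in annotations])
    y = np.array([label for _, label in annotations])
    return silhouette_samples(X, y, metric="cosine")


def filter_noisy(annotations, embeddings, cutoff=0.0):
    """Drop annotations whose silhouette falls below an (illustrative) cutoff."""
    sil = annotation_silhouette(annotations, embeddings)
    return [a for a, s in zip(annotations, sil) if s >= cutoff]
```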
Related papers
- Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks [9.110872603799839]
Supervised classification heavily depends on datasets annotated by humans.
In subjective tasks such as toxicity classification, these annotations often exhibit low agreement among raters.
In this work, we propose Annotator Aware Representations for Texts (AART) for subjective classification tasks.
arXiv Detail & Related papers (2023-11-16T10:18:32Z)
- Concept-Based Explanations to Test for False Causal Relationships Learned by Abusive Language Classifiers [7.022948483613113]
We consider three well-known abusive language classifiers trained on large English datasets.
We first examine the unwanted dependencies learned by the classifiers by assessing their accuracy on a challenge set across all decision thresholds.
We then introduce concept-based explanation metrics to assess the influence of a concept on the labels.
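The summary does not spell out which metrics are used. One widely used concept-based measure is a TCAV-style score; the sketch below assumes access to classifier activations for concept examples and random examples, plus a gradient function in that same activation space, and is offered only as an illustration of the general idea.

```python
# Hedged, TCAV-style sketch of a concept-influence score (an illustration of
# the general idea; not necessarily the metrics used in the paper above).
import numpy as np
from sklearn.linear_model import LogisticRegression


def concept_activation_vector(concept_acts, random_acts):
    """Linear direction separating concept examples from random examples."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    cav = LogisticRegression(max_iter=1000).fit(X, y).coef_[0]
    return cav / np.linalg.norm(cav)


def concept_influence(grad_fn, inputs, cav):
    """Fraction of inputs whose class score increases along the concept direction.

    grad_fn: callable returning the gradient of the target-class logit with
    respect to the same activation space the CAV was fit in, for one input.
    """
    derivs = np.array([float(np.dot(grad_fn(x), cav)) for x in inputs])
    return float((derivs > 0).mean())
```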
arXiv Detail & Related papers (2023-07-04T19:57:54Z)
- RankCSE: Unsupervised Sentence Representations Learning via Learning to Rank [54.854714257687334]
We propose a novel approach, RankCSE, for unsupervised sentence representation learning.
It incorporates ranking consistency and ranking distillation with contrastive learning into a unified framework.
Extensive experiments are conducted on both semantic textual similarity (STS) and transfer (TR) tasks.
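The summary only names the ingredients. As a hedged illustration, the sketch below pairs a standard in-batch contrastive loss with a ListNet-style ranking-distillation term that pulls the student's in-batch similarity ranking toward a teacher's; the temperatures and loss forms are assumptions, not RankCSE's exact formulation.

```python
# Hedged sketch: a ranking-distillation term alongside a contrastive loss,
# in the spirit of the summary above (not RankCSE's exact losses or teachers).
import torch
import torch.nn.functional as F


def ranking_distillation_loss(student_emb, teacher_emb, tau_s=0.05, tau_t=0.05):
    """ListNet-style KL between teacher and student in-batch similarity rankings."""
    s_sim = F.cosine_similarity(student_emb.unsqueeze(1), student_emb.unsqueeze(0), dim=-1)
    t_sim = F.cosine_similarity(teacher_emb.unsqueeze(1), teacher_emb.unsqueeze(0), dim=-1)
    # Mask self-similarities so each row ranks only the other sentences.
    mask = torch.eye(s_sim.size(0), dtype=torch.bool, device=s_sim.device)
    s_logp = F.log_softmax(s_sim.masked_fill(mask, float("-inf")) / tau_s, dim=-1)
    t_prob = F.softmax(t_sim.masked_fill(mask, float("-inf")) / tau_t, dim=-1)
    return F.kl_div(s_logp, t_prob, reduction="batchmean")


def contrastive_loss(emb_a, emb_b, tau=0.05):
    """Standard in-batch InfoNCE between two views of the same sentences."""
    sim = F.cosine_similarity(emb_a.unsqueeze(1), emb_b.unsqueeze(0), dim=-1) / tau
    targets = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, targets)
```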
arXiv Detail & Related papers (2023-05-26T08:27:07Z)
- Using Natural Language Explanations to Rescale Human Judgments [81.66697572357477]
We propose a method to rescale ordinal annotations and explanations using large language models (LLMs).
We feed annotators' Likert ratings and corresponding explanations into an LLM and prompt it to produce a numeric score anchored in a scoring rubric.
Our method rescales the raw judgments without impacting agreement and brings the scores closer to human judgments grounded in the same scoring rubric.
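The procedure above is easy to sketch: build a prompt from the rubric, the Likert rating, and the explanation, then parse a numeric score from the LLM's reply. The prompt wording and 0-100 target scale below are illustrative assumptions, and the LLM call is left as a user-supplied callable rather than a specific API.

```python
# Hedged sketch of the rescaling step: rubric + Likert rating + explanation go
# into a prompt, and a numeric score is parsed from the LLM's reply. The exact
# prompt and output scale are assumptions, not the paper's.
import re
from typing import Callable


def build_rescaling_prompt(rubric: str, likert_rating: int, explanation: str) -> str:
    return (
        "You are rescaling human judgments against a rubric.\n\n"
        f"Scoring rubric:\n{rubric}\n\n"
        f"Annotator's Likert rating (1-5): {likert_rating}\n"
        f"Annotator's explanation: {explanation}\n\n"
        "Output a single numeric score from 0 to 100 anchored in the rubric."
    )


def rescale_judgment(
    rubric: str,
    likert_rating: int,
    explanation: str,
    call_llm: Callable[[str], str],  # wrapper around any chat-completion client
) -> float:
    reply = call_llm(build_rescaling_prompt(rubric, likert_rating, explanation))
    match = re.search(r"\d+(?:\.\d+)?", reply)
    if match is None:
        raise ValueError(f"No numeric score found in LLM reply: {reply!r}")
    return float(match.group())
```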
arXiv Detail & Related papers (2023-05-24T06:19:14Z)
- Measuring Fairness of Text Classifiers via Prediction Sensitivity [63.56554964580627]
Accumulated Prediction Sensitivity measures fairness in machine learning models based on the model's prediction sensitivity to perturbations in input features.
We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and individual fairness.
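As a rough illustration of the idea behind the metric, the sketch below perturbs each input feature slightly, measures how much the predicted class probabilities move, and accumulates the effect over a dataset; it is a simple finite-difference reading, not the paper's exact definition or weighting.

```python
# Finite-difference illustration of prediction sensitivity to input features
# (a rough reading of the idea, not the paper's exact metric).
import numpy as np


def accumulated_prediction_sensitivity(predict_proba, X, eps=1e-3):
    """Average L1 change in class probabilities per unit perturbation of each feature.

    predict_proba: callable mapping an (n, d) array to (n, k) class probabilities.
    X: (n, d) feature matrix.
    Returns a (d,) vector; higher values mean predictions react more strongly
    to that feature (e.g., a protected attribute).
    """
    base = predict_proba(X)
    sensitivity = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        X_pert = X.copy()
        X_pert[:, j] += eps
        sensitivity[j] = np.abs(predict_proba(X_pert) - base).sum(axis=1).mean() / eps
    return sensitivity
```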
arXiv Detail & Related papers (2022-03-16T15:00:33Z)
- Exploiting Context for Robustness to Label Noise in Active Learning [47.341705184013804]
We address the problems of how a system can identify which of the queried labels are wrong and how a multi-class active learning system can be adapted to minimize the negative impact of label noise.
We construct a graphical representation of the unlabeled data to encode these relationships and obtain new beliefs on the graph when noisy labels are available.
This is demonstrated in three different applications: scene classification, activity classification, and document classification.
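A minimal sketch of the graph idea in this entry, using scikit-learn's LabelSpreading as the belief-update step over a k-nearest-neighbor graph; the paper's own graph construction and inference may differ.

```python
# Sketch: propagate (possibly noisy) queried labels over a k-NN graph and flag
# labels that disagree with the propagated beliefs. LabelSpreading stands in
# for the belief update; the paper's graph and inference may differ.
import numpy as np
from sklearn.semi_supervised import LabelSpreading


def flag_suspect_labels(X, y_queried, n_neighbors=10):
    """X: (n, d) features; y_queried: (n,) labels with -1 marking unlabeled points.

    Returns indices of labeled points whose queried label disagrees with the
    label implied by propagation over the graph.
    """
    model = LabelSpreading(kernel="knn", n_neighbors=n_neighbors)
    model.fit(X, y_queried)
    beliefs = model.transduction_  # propagated label for every point
    labeled = np.flatnonzero(y_queried != -1)
    return labeled[beliefs[labeled] != y_queried[labeled]]
```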
arXiv Detail & Related papers (2020-10-18T18:59:44Z)
- Debiased Contrastive Learning [64.98602526764599]
We develop a debiased contrastive objective that corrects for the sampling of same-label datapoints.
Empirically, the proposed objective consistently outperforms the state-of-the-art for representation learning in vision, language, and reinforcement learning benchmarks.
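The correction has a simple form: estimate the true-negative term by subtracting the expected share of same-label samples from the unlabeled negatives. The sketch below is simplified to one positive per anchor, with unit-normalized embeddings and illustrative hyperparameters.

```python
# Sketch of a debiased contrastive objective: the negatives' contribution is
# corrected for the chance `tau_plus` that an unlabeled sample shares the
# anchor's label. Simplified to one positive per anchor; hyperparameters are
# illustrative, not the paper's tuned values.
import math

import torch


def debiased_contrastive_loss(anchor, positive, negatives, tau_plus=0.1, t=0.5):
    """anchor, positive: (B, d); negatives: (B, N, d); all rows L2-normalized."""
    pos = torch.exp((anchor * positive).sum(-1) / t)                     # (B,)
    neg = torch.exp(torch.einsum("bd,bnd->bn", anchor, negatives) / t)   # (B, N)
    n = negatives.size(1)
    # Debiased estimate of a single negative's contribution: subtract the
    # expected false-negative share, then clamp at the theoretical minimum.
    ng = (neg.mean(dim=1) - tau_plus * pos) / (1.0 - tau_plus)
    ng = torch.clamp(ng, min=math.exp(-1.0 / t))
    return -torch.log(pos / (pos + n * ng)).mean()
```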
arXiv Detail & Related papers (2020-07-01T04:25:24Z)
- Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels [98.13491369929798]
We propose a framework called Class2Simi, which transforms data points with noisy class labels to data pairs with noisy similarity labels.
Class2Simi is computationally efficient because the transformation is performed on the fly within mini-batches, and it only changes the loss on top of the model's predictions into a pairwise form.
arXiv Detail & Related papers (2020-06-14T07:55:32Z)
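A minimal sketch of the transformation the Class2Simi entry above describes: within each mini-batch the (noisy) class labels become pairwise "same class?" labels, and the loss is applied to the model's pairwise agreement probabilities; the specific pairwise loss below is an illustrative choice rather than the paper's exact one.

```python
# Sketch of the Class2Simi-style transformation: noisy class labels become
# pairwise similarity labels on the fly, and the loss is applied to pairwise
# agreement probabilities. The exact pairwise loss is an illustrative choice.
import torch


def class_to_similarity(labels):
    """(B,) class labels -> (B, B) binary 'same class?' labels, built on the fly."""
    return (labels.unsqueeze(0) == labels.unsqueeze(1)).float()


def pairwise_similarity_loss(class_probs, labels, eps=1e-7):
    """class_probs: (B, C) softmax outputs; labels: (B,) possibly noisy classes.

    The model's probability that two points share a class is the dot product
    of their class-probability vectors, scored against the similarity labels.
    """
    sim_labels = class_to_similarity(labels)
    sim_probs = (class_probs @ class_probs.t()).clamp(eps, 1.0 - eps)  # (B, B)
    mask = ~torch.eye(labels.size(0), dtype=torch.bool, device=labels.device)
    bce = -(sim_labels * sim_probs.log() + (1 - sim_labels) * (1 - sim_probs).log())
    return bce[mask].mean()  # exclude trivial self-pairs
```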
This list is automatically generated from the titles and abstracts of the papers on this site.