Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks
- URL: http://arxiv.org/abs/2112.07475v1
- Date: Tue, 14 Dec 2021 15:38:22 GMT
- Title: Two Contrasting Data Annotation Paradigms for Subjective NLP Tasks
- Authors: Paul Röttger, Bertie Vidgen, Dirk Hovy, Janet B. Pierrehumbert
- Abstract summary: We propose two contrasting paradigms for data annotation.
The descriptive paradigm encourages annotator subjectivity, whereas the prescriptive paradigm discourages it.
We argue that dataset creators should explicitly aim for one or the other to facilitate the intended use of their dataset.
- Score: 17.033055327465238
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Labelled data is the foundation of most natural language processing tasks.
However, labelling data is difficult and there often are diverse valid beliefs
about what the correct data labels should be. So far, dataset creators have
acknowledged annotator subjectivity, but not actively managed it in the
annotation process. This has led to partly-subjective datasets that fail to
serve a clear downstream use. To address this issue, we propose two contrasting
paradigms for data annotation. The descriptive paradigm encourages annotator
subjectivity, whereas the prescriptive paradigm discourages it. Descriptive
annotation allows for the surveying and modelling of different beliefs, whereas
prescriptive annotation enables the training of models that consistently apply
one belief. We discuss benefits and challenges in implementing both paradigms,
and argue that dataset creators should explicitly aim for one or the other to
facilitate the intended use of their dataset. Lastly, we design an annotation
experiment to illustrate the contrast between the two paradigms.
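To make the contrast concrete, below is a minimal sketch (not from the paper; the example labels, the majority-vote rule, and the review threshold are illustrative assumptions) of how the same multi-annotator labels might be handled under each paradigm: the descriptive variant keeps the full label distribution so disagreement can be surveyed and modelled, while the prescriptive variant collapses each item to a single label and flags low-agreement items for adjudication.

```python
from collections import Counter

# Toy multi-annotator labels for a subjective task (e.g. offensiveness).
# These labels are illustrative, not taken from the paper.
raw_annotations = {
    "ex1": ["offensive", "offensive", "not_offensive"],
    "ex2": ["not_offensive", "not_offensive", "not_offensive"],
}

def descriptive_aggregate(annotations):
    """Descriptive paradigm: keep the full label distribution per example,
    so different beliefs can be surveyed and modelled (e.g. as soft labels)."""
    out = {}
    for ex_id, labels in annotations.items():
        counts = Counter(labels)
        total = len(labels)
        out[ex_id] = {label: round(n / total, 3) for label, n in counts.items()}
    return out

def prescriptive_aggregate(annotations, min_agreement=0.75):
    """Prescriptive paradigm: collapse each example to one label meant to
    reflect a single belief. Majority vote stands in for adjudication
    against guidelines; low-agreement items are flagged for review."""
    out = {}
    for ex_id, labels in annotations.items():
        label, n = Counter(labels).most_common(1)[0]
        agreement = n / len(labels)
        out[ex_id] = label if agreement >= min_agreement else f"REVIEW:{label}"
    return out

print(descriptive_aggregate(raw_annotations))
# {'ex1': {'offensive': 0.667, 'not_offensive': 0.333}, 'ex2': {'not_offensive': 1.0}}
print(prescriptive_aggregate(raw_annotations))
# {'ex1': 'REVIEW:offensive', 'ex2': 'not_offensive'}
```

In practice, prescriptive adjudication would follow detailed annotation guidelines rather than a simple vote; majority voting is used here only to keep the sketch self-contained.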
Related papers
- Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks [9.110872603799839]
Supervised classification heavily depends on datasets annotated by humans.
In subjective tasks such as toxicity classification, these annotations often exhibit low agreement among raters.
In this work, we propose Annotator Aware Representations for Texts (AART) for subjective classification tasks.
arXiv Detail & Related papers (2023-11-16T10:18:32Z) - IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models [66.32043210237768]
This paper introduces an influence-driven selective annotation method.
It aims to minimize annotation costs while improving the quality of in-context examples.
Experiments confirm the superiority of the proposed method on various benchmarks.
arXiv Detail & Related papers (2023-10-16T22:53:54Z) - CrowdWorkSheets: Accounting for Individual and Collective Identities Underlying Crowdsourced Dataset Annotation [8.447159556925182]
We survey an array of literature that provides insights into ethical considerations around crowdsourced dataset annotation.
We lay out the challenges in this space along two layers, the first being who the annotator is and how the annotators' lived experiences can impact their annotations.
We introduce a novel framework, CrowdWorkSheets, for dataset developers to facilitate transparent documentation of key decision points at various stages of the data annotation pipeline.
arXiv Detail & Related papers (2022-06-09T23:31:17Z) - Disjoint Contrastive Regression Learning for Multi-Sourced Annotations [10.159313152511919]
Large-scale datasets are important for the development of deep learning models.
Multiple annotators may be employed to label different subsets of the data.
The inconsistency and bias among different annotators are harmful to the model training.
arXiv Detail & Related papers (2021-12-31T12:39:04Z) - Annotation Curricula to Implicitly Train Non-Expert Annotators [56.67768938052715]
Annotation studies often require annotators to familiarize themselves with the task, its annotation scheme, and the data domain.
This can be overwhelming in the beginning, mentally taxing, and induce errors into the resulting annotations.
We propose annotation curricula, a novel approach to implicitly train annotators.
arXiv Detail & Related papers (2021-06-04T09:48:28Z) - Competency Problems: On Finding and Removing Artifacts in Language Data [50.09608320112584]
We argue that for complex language understanding tasks, all simple feature correlations are spurious.
We theoretically analyze the difficulty of creating data for competency problems when human bias is taken into account.
arXiv Detail & Related papers (2021-04-17T21:34:10Z) - Contrastive Explanations for Model Interpretability [77.92370750072831]
We propose a methodology to produce contrastive explanations for classification models.
Our method is based on projecting model representation to a latent space.
Our findings shed light on the ability of label-contrastive explanations to provide a more accurate and finer-grained interpretability of a model's decision.
arXiv Detail & Related papers (2021-03-02T00:36:45Z) - Object Detection with a Unified Label Space from Multiple Datasets [94.33205773893151]
Given multiple datasets with different label spaces, the goal of this work is to train a single object detector predicting over the union of all the label spaces.
Consider an object category like faces that is annotated in one dataset, but is not annotated in another dataset.
Some categories, like face here, would thus be considered foreground in one dataset, but background in another.
We propose loss functions that carefully integrate partial but correct annotations with complementary but noisy pseudo labels.
arXiv Detail & Related papers (2020-08-15T00:51:27Z) - Null-sampling for Interpretable and Fair Representations [8.654168514863649]
We learn invariant representations, in the data domain, to achieve interpretability in algorithmic fairness.
By placing the representations into the data domain, the changes made by the model are easily examinable by human auditors.
arXiv Detail & Related papers (2020-08-12T11:49:01Z) - Evaluating Models' Local Decision Boundaries via Contrast Sets [119.38387782979474]
We propose a new annotation paradigm for NLP that helps to close systematic gaps in the test data.
We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets.
Although our contrast sets are not explicitly adversarial, model performance is significantly lower on them than on the original test sets.
arXiv Detail & Related papers (2020-04-06T14:47:18Z)
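As a rough illustration of the contrast-set idea above (the sentiment examples and the stand-in model below are invented for this sketch; only the group-level consistency check mirrors the paper's framing): each original test instance is paired with small manual perturbations that change the gold label, and a model is credited only if it labels the whole group correctly.

```python
from typing import Callable, List, Tuple

# Each group holds an original test instance plus manually written
# perturbations that flip the gold label (illustrative data only).
contrast_groups: List[List[Tuple[str, str]]] = [
    [
        ("The acting was superb and the plot gripping.", "positive"),    # original
        ("The acting was superb but the plot was a bore.", "negative"),  # contrast
    ],
]

def contrast_consistency(model: Callable[[str], str],
                         groups: List[List[Tuple[str, str]]]) -> float:
    """Fraction of groups where the model labels the original *and* every
    contrast correctly, probing its local decision boundary."""
    correct_groups = sum(
        all(model(text) == gold for text, gold in group) for group in groups
    )
    return correct_groups / len(groups)

# A trivial stand-in "model" that always predicts the same label fails
# the contrast even though it gets the original instance right.
always_positive = lambda text: "positive"
print(contrast_consistency(always_positive, contrast_groups))  # 0.0
```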