Related papers: Aligning Intraobserver Agreement by Transitivity

URL: http://arxiv.org/abs/2009.13905v1
Date: Tue, 29 Sep 2020 09:55:04 GMT
Title: Aligning Intraobserver Agreement by Transitivity
Authors: Jacopo Amidei
Abstract summary: We propose a novel method for measuring within annotator consistency or annotator Intraobserver Agreement (IA). The proposed approach is based on transitivity, a measure that has been thoroughly studied in the context of rational decision-making.
Score: 1.0152838128195467
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Annotation reproducibility and accuracy rely on good consistency within annotators. We propose a novel method for measuring within annotator consistency or annotator Intraobserver Agreement (IA). The proposed approach is based on transitivity, a measure that has been thoroughly studied in the context of rational decision-making. The transitivity measure, in contrast with the commonly used test-retest strategy for annotator IA, is less sensitive to the several types of bias introduced by the test-retest strategy. We present a representation theorem to the effect that relative judgement data that meet transitivity can be mapped to a scale (in terms of measurement theory). We also discuss a further application of transitivity as part of data collection design for addressing the problem of the quadratic complexity of data collection of relative judgements.
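To make the idea of transitivity as a consistency check concrete, here is a minimal Python sketch. It is not the measure defined in the paper, whose exact formulation is not given in this abstract. It assumes that one annotator's relative judgements are available as (preferred, other) pairs, and the function name `transitivity_rate` is hypothetical. The sketch reports the fraction of fully judged item triples that contain no preference cycle.

```python
from itertools import combinations


def transitivity_rate(judgements):
    """Fraction of fully judged triples that are transitive.

    `judgements` is an iterable of (preferred, other) pairs from a single
    annotator. This input format is an assumption for illustration only.
    Returns None when no triple has all three of its pairs judged.
    """
    wins = set(judgements)
    items = sorted({item for pair in wins for item in pair})

    complete = 0
    transitive = 0
    for a, b, c in combinations(items, 3):
        pairs = [(a, b), (b, c), (a, c)]
        # Skip triples where a pair was never judged, or was judged both ways.
        if any(((x, y) in wins) == ((y, x) in wins) for x, y in pairs):
            continue
        complete += 1
        # With strict preferences on three items, the only intransitive
        # patterns are the two possible 3-cycles.
        cycle_forward = (a, b) in wins and (b, c) in wins and (c, a) in wins
        cycle_backward = (b, a) in wins and (c, b) in wins and (a, c) in wins
        if not (cycle_forward or cycle_backward):
            transitive += 1

    return transitive / complete if complete else None


consistent = [("A", "B"), ("B", "C"), ("A", "C")]
cyclic = [("A", "B"), ("B", "C"), ("C", "A")]
print(transitivity_rate(consistent))  # 1.0
print(transitivity_rate(cyclic))      # 0.0
```

Unlike a test-retest design, a check of this kind needs only a single pass over the comparisons, although judging every pair of n items still requires n(n-1)/2 judgements.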

Related papers

Unsupervised Transfer Learning via Adversarial Contrastive Training [3.227277661633986]
We propose a novel unsupervised transfer learning approach using adversarial contrastive training (ACT). Our experimental results demonstrate outstanding classification accuracy with both fine-tuned linear probe and K-NN protocol across various datasets.
arXiv Detail & Related papers (2024-08-16T05:11:52Z)
Predictive Performance Test based on the Exhaustive Nested Cross-Validation for High-dimensional data [7.62566998854384]
Cross-validation is used for several tasks such as estimating the prediction error, tuning the regularization parameter, and selecting the most suitable predictive model. The K-fold cross-validation is a popular CV method but its limitation is that the risk estimates are highly dependent on the partitioning of the data. This study presents an alternative novel predictive performance test and valid confidence intervals based on exhaustive nested cross-validation.
arXiv Detail & Related papers (2024-08-06T12:28:16Z)
Rethinking Affect Analysis: A Protocol for Ensuring Fairness and Consistency [24.737468736951374]
We propose a unified protocol for database partitioning that ensures fairness and comparability. We provide detailed demographic annotations (in terms of race, gender and age), evaluation metrics, and a common framework for expression recognition. We also rerun the methods with the new protocol and introduce new leaderboards to encourage future research in affect recognition with a fairer comparison.
arXiv Detail & Related papers (2024-08-04T23:21:46Z)
Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification. Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples. This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z)
On the Universal Adversarial Perturbations for Efficient Data-free Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses between normal and adversarial samples to UAPs. Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z)
Information-Theoretic Bias Reduction via Causal View of Spurious Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation. We present a novel debiasing framework against the algorithmic bias, which incorporates a bias regularization loss. The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z)
Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task. The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator. We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD). An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images. The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, achieving a high performance comparable to those obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z)
Evaluation of Unsupervised Entity and Event Salience Estimation [17.74208462902158]
Salience Estimation aims to predict term importance in documents. Previous studies typically generate pseudo-ground truth for evaluation. In this work, we propose a light yet practical entity and event salience estimation evaluation protocol.
arXiv Detail & Related papers (2021-04-14T15:23:08Z)
A Statistical Analysis of Summarization Evaluation Metrics using Resampling Methods [60.04142561088524]
We find that the confidence intervals are rather wide, demonstrating high uncertainty in how reliable automatic metrics truly are. Although many metrics fail to show statistical improvements over ROUGE, two recent works, QAEval and BERTScore, do in some evaluation settings.
arXiv Detail & Related papers (2021-03-31T18:28:14Z)
