Aligning Intraobserver Agreement by Transitivity
- URL: http://arxiv.org/abs/2009.13905v1
- Date: Tue, 29 Sep 2020 09:55:04 GMT
- Title: Aligning Intraobserver Agreement by Transitivity
- Authors: Jacopo Amidei
- Abstract summary: We propose a novel method for measuring within-annotator consistency, or annotator Intraobserver Agreement (IA).
The proposed approach is based on transitivity, a property that has been thoroughly studied in the context of rational decision-making.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Annotation reproducibility and accuracy rely on good consistency within
annotators. We propose a novel method for measuring within-annotator
consistency, or annotator Intraobserver Agreement (IA). The proposed approach is
based on transitivity, a property that has been thoroughly studied in the
context of rational decision-making. In contrast with the commonly used
test-retest strategy for annotator IA, the transitivity measure is less
sensitive to the several types of bias that test-retest introduces. We present a
representation theorem to the effect that relative judgement data that satisfy
transitivity can be mapped to a scale (in the sense of measurement theory). We
also discuss a further application of transitivity in data collection design,
addressing the quadratic complexity of collecting relative judgements.
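As an illustration of the two ideas in the abstract, here is a minimal Python sketch (not the paper's implementation): it scores an annotator's transitivity over relative judgements and shows how a transitive closure can infer comparisons that were never asked, mitigating the quadratic cost of collecting all pairs. The pair encoding (a, b) meaning "the annotator judged a above b", the item names, and the closure strategy are illustrative assumptions.

```python
from itertools import permutations

def intransitivity_rate(judgements):
    """Fraction of judged chains a>b, b>c for which the annotator also
    judged c>a, forming an intransitive cycle. 0.0 means the relative
    judgements are fully transitive."""
    prefs = set(judgements)
    items = {x for pair in judgements for x in pair}
    chains = cycles = 0
    for a, b, c in permutations(items, 3):
        if (a, b) in prefs and (b, c) in prefs:
            chains += 1
            if (c, a) in prefs:
                cycles += 1
    return cycles / chains if chains else 0.0

def transitive_closure(judgements):
    """Infer unasked comparisons by transitivity: from a>b and b>c,
    derive a>c. Collecting a spanning set of pairs and closing it can
    avoid asking for all n*(n-1)/2 comparisons."""
    prefs = set(judgements)
    items = {x for pair in judgements for x in pair}
    changed = True
    while changed:
        changed = False
        for a, b in list(prefs):
            for c in items:
                if (b, c) in prefs and (a, c) not in prefs:
                    prefs.add((a, c))
                    changed = True
    return prefs

# Toy usage: only 3 collected judgements over 4 items.
judged = [("A", "B"), ("B", "C"), ("C", "D")]
print(intransitivity_rate(judged))         # 0.0 -> consistent annotator
print(sorted(transitive_closure(judged)))  # 6 pairs total: 3 inferred for free
```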
Related papers
- Unsupervised Transfer Learning via Adversarial Contrastive Training [3.227277661633986]
We propose a novel unsupervised transfer learning approach using adversarial contrastive training (ACT).
Our experimental results demonstrate outstanding classification accuracy with both the fine-tuned linear probe and the K-NN protocol across various datasets.
arXiv Detail & Related papers (2024-08-16T05:11:52Z) - Predictive Performance Test based on the Exhaustive Nested Cross-Validation for High-dimensional data [7.62566998854384]
Cross-validation is used for several tasks such as estimating the prediction error, tuning the regularization parameter, and selecting the most suitable predictive model.
K-fold cross-validation is a popular CV method, but its risk estimates depend heavily on how the data are partitioned.
This study presents a novel predictive performance test and valid confidence intervals based on exhaustive nested cross-validation.
arXiv Detail & Related papers (2024-08-06T12:28:16Z) - Rethinking Affect Analysis: A Protocol for Ensuring Fairness and Consistency [24.737468736951374]
We propose a unified protocol for database partitioning that ensures fairness and comparability.
We provide detailed demographic annotations (in terms of race, gender and age), evaluation metrics, and a common framework for expression recognition.
We also rerun the methods with the new protocol and introduce new leaderboards to encourage future research in affect recognition with fairer comparisons.
arXiv Detail & Related papers (2024-08-04T23:21:46Z) - Efficient Conformal Prediction under Data Heterogeneity [79.35418041861327]
Conformal Prediction (CP) stands out as a robust framework for uncertainty quantification.
Existing approaches for tackling non-exchangeability lead to methods that are not computable beyond the simplest examples.
This work introduces a new efficient approach to CP that produces provably valid confidence sets for fairly general non-exchangeable data distributions.
arXiv Detail & Related papers (2023-12-25T20:02:51Z) - On the Universal Adversarial Perturbations for Efficient Data-free
Adversarial Detection [55.73320979733527]
We propose a data-agnostic adversarial detection framework, which induces different responses to UAPs from normal and adversarial samples.
Experimental results show that our method achieves competitive detection performance on various text classification tasks.
arXiv Detail & Related papers (2023-06-27T02:54:07Z) - Information-Theoretic Bias Reduction via Causal View of Spurious
Correlation [71.9123886505321]
We propose an information-theoretic bias measurement technique through a causal interpretation of spurious correlation.
We present a novel debiasing framework against algorithmic bias, which incorporates a bias regularization loss.
The proposed bias measurement and debiasing approaches are validated in diverse realistic scenarios.
arXiv Detail & Related papers (2022-01-10T01:19:31Z) - Learning Bias-Invariant Representation by Cross-Sample Mutual
Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z) - WSSOD: A New Pipeline for Weakly- and Semi-Supervised Object Detection [75.80075054706079]
We propose a weakly- and semi-supervised object detection framework (WSSOD).
An agent detector is first trained on a joint dataset and then used to predict pseudo bounding boxes on weakly-annotated images.
The proposed framework demonstrates remarkable performance on the PASCAL-VOC and MSCOCO benchmarks, comparable to that obtained in fully-supervised settings.
arXiv Detail & Related papers (2021-05-21T11:58:50Z) - Evaluation of Unsupervised Entity and Event Salience Estimation [17.74208462902158]
Salience Estimation aims to predict term importance in documents.
Previous studies typically generate pseudo-ground truth for evaluation.
In this work, we propose a lightweight yet practical evaluation protocol for entity and event salience estimation.
arXiv Detail & Related papers (2021-04-14T15:23:08Z) - A Statistical Analysis of Summarization Evaluation Metrics using
Resampling Methods [60.04142561088524]
We find that the confidence intervals are rather wide, demonstrating high uncertainty in how reliable automatic metrics truly are.
Although many metrics fail to show statistical improvements over ROUGE, two recent works, QAEval and BERTScore, do in some evaluation settings (a minimal resampling sketch follows this list).
arXiv Detail & Related papers (2021-03-31T18:28:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.