Related papers: Decreasing Annotation Burden of Pairwise Comparisons with Human-in-the-Loop Sorting: Application in Medical Image Artifact Rating

URL: http://arxiv.org/abs/2202.04823v1
Date: Thu, 10 Feb 2022 04:02:45 GMT
Title: Decreasing Annotation Burden of Pairwise Comparisons with Human-in-the-Loop Sorting: Application in Medical Image Artifact Rating
Authors: Ikbeom Jang, Garrison Danley, Ken Chang, Jayashree Kalpathy-Cramer
Abstract summary: Ranking by pairwise comparisons has shown improved reliability over ordinal classification. We propose a method for reducing the number of pairwise comparisons required to rank by a quantitative metric.
Score: 3.5314411880556063
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Ranking by pairwise comparisons has shown improved reliability over ordinal classification. However, as the annotations of pairwise comparisons scale quadratically, this becomes less practical when the dataset is large. We propose a method for reducing the number of pairwise comparisons required to rank by a quantitative metric, demonstrating the effectiveness of the approach in ranking medical images by image quality in this proof of concept study. Using the medical image annotation software that we developed, we actively subsample pairwise comparisons using a sorting algorithm with a human rater in the loop. We find that this method substantially reduces the number of comparisons required for a full ordinal ranking without compromising inter-rater reliability when compared to pairwise comparisons without sorting.
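The idea in the abstract, letting a sorting algorithm decide which pairs a human rater is asked to compare, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not name the sorting algorithm or describe the annotation software, so merge sort is used here as a stand-in, and `sort_with_rater`, `SimulatedRater`, and the hidden quality scores are hypothetical names. The simulated rater replaces the human and counts how many comparisons it is asked to make.

```python
import random
from typing import Callable, Dict, List, Sequence


def sort_with_rater(items: Sequence[str],
                    prefers: Callable[[str, str], bool]) -> List[str]:
    """Merge sort in which every comparison is delegated to a rater.

    prefers(a, b) must return True when the rater judges that `a`
    should be ranked strictly before `b`. (Hypothetical interface.)
    """
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = sort_with_rater(items[:mid], prefers)
    right = sort_with_rater(items[mid:], prefers)

    merged: List[str] = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Exactly one question is put to the rater per merge step.
        if prefers(right[j], left[i]):
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


class SimulatedRater:
    """Stand-in for a human rater: answers from hidden scores (assumed data)."""

    def __init__(self, scores: Dict[str, int]) -> None:
        self.scores = scores
        self.comparisons = 0

    def __call__(self, a: str, b: str) -> bool:
        self.comparisons += 1
        return self.scores[a] < self.scores[b]


# Demo with 64 hypothetical images and made-up quality scores.
n = 64
images = [f"img{i}" for i in range(n)]
hidden = list(range(n))
random.Random(0).shuffle(hidden)
rater = SimulatedRater(dict(zip(images, hidden)))

ranking = sort_with_rater(images, rater)
print("comparisons asked:", rater.comparisons)
print("exhaustive pairs: ", n * (n - 1) // 2)
```

With a noise-free rater, merge sort needs at most n*log2(n) - n + 1 comparisons when n is a power of two, compared with n(n-1)/2 for all pairs. A real human rater is noisy and may be inconsistent, which this sketch does not model.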

Related papers

EZ-Sort: Efficient Pairwise Comparison via Zero-Shot CLIP-Based Pre-Ordering and Human-in-the-Loop Sorting [0.5861893391102716]
Pairwise comparison is often favored over absolute rating or ordinal classification in subjective or difficult annotation tasks. Recent work has greatly reduced the annotation burden (O(n log n)) by actively sampling pairwise comparisons using a sorting algorithm. We further improve annotation efficiency by (1) roughly pre-ordering items using the Contrastive Language-Image Pre-training (CLIP) model hierarchically without training, and (2) replacing easy, obvious human comparisons with automated comparisons.
arXiv Detail & Related papers (2025-08-29T12:06:49Z)
When Less Is More: Binary Feedback Can Outperform Ordinal Comparisons in Ranking Recovery [4.451479907610764]
We propose a general parametric framework for modeling ordinal paired comparisons without ties. We show that binarizing ordinal data can significantly improve the accuracy of ranking recovery.
arXiv Detail & Related papers (2025-07-02T11:30:38Z)
Crowdsourcing subjective annotations using pairwise comparisons reduces bias and error compared to the majority-vote method [0.0]
We introduce a theoretical framework for understanding how random error and measurement bias enter into crowdsourced annotations of subjective constructs. We then propose a pipeline that combines pairwise comparison labelling with Elo scoring, and demonstrate that it outperforms the ubiquitous majority-voting method in reducing both types of measurement error.
arXiv Detail & Related papers (2023-05-31T17:14:12Z)
Learning by Sorting: Self-supervised Learning with Group Ordering Constraints [75.89238437237445]
This paper proposes a new variation of the contrastive learning objective, Group Ordering Constraints (GroCo). It exploits the idea of sorting the distances of positive and negative pairs and computing the respective loss based on how many positive pairs have a larger distance than the negative pairs, and thus are not ordered correctly. We evaluate the proposed formulation on various self-supervised learning benchmarks and show that it not only leads to improved results compared to vanilla contrastive learning but also shows competitive performance to comparable methods in linear probing and outperforms current methods in k-NN performance.
arXiv Detail & Related papers (2023-01-05T11:17:55Z)
A Revenue Function for Comparison-Based Hierarchical Clustering [5.683072566711975]
We propose a new revenue function that allows one to measure the goodness of dendrograms using only comparisons. We show that this function is closely related to Dasgupta's cost for hierarchical clustering that uses pairwise similarities. On the theoretical side, we use the proposed revenue function to resolve the open problem of whether one can approximately recover a latent hierarchy using few triplet comparisons.
arXiv Detail & Related papers (2022-11-29T18:40:02Z)
Efficient computation of rankings from pairwise comparisons [0.0]
We describe an alternative and similarly simple iteration that provably returns identical results but does so much faster. We demonstrate this algorithm with applications to a range of example data sets and derive a number of results regarding its convergence.
arXiv Detail & Related papers (2022-06-30T19:39:09Z)
Contextual Similarity Aggregation with Self-attention for Visual Re-ranking [96.55393026011811]
We propose a visual re-ranking method by contextual similarity aggregation with self-attention. We conduct comprehensive experiments on four benchmark datasets to demonstrate the generality and effectiveness of our proposed visual re-ranking method.
arXiv Detail & Related papers (2021-10-26T06:20:31Z)
Adaptive Sampling for Heterogeneous Rank Aggregation from Noisy Pairwise Comparisons [85.5955376526419]
In rank aggregation problems, users exhibit various accuracy levels when comparing pairs of items. We propose an elimination-based active sampling strategy, which estimates the ranking of items via noisy pairwise comparisons. We prove that our algorithm can return the true ranking of items with high probability.
arXiv Detail & Related papers (2021-10-08T13:51:55Z)
Recall@k Surrogate Loss with Large Batches and Similarity Mixup [62.67458021725227]
Direct optimization, by gradient descent, of an evaluation metric is not possible when it is non-differentiable. In this work, a differentiable surrogate loss for the recall is proposed. The proposed method achieves state-of-the-art results in several image retrieval benchmarks.
arXiv Detail & Related papers (2021-08-25T11:09:11Z)
Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank. Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
arXiv Detail & Related papers (2021-05-23T19:21:55Z)
Pointwise Binary Classification with Pairwise Confidence Comparisons [97.79518780631457]
We propose pairwise comparison (Pcomp) classification, where we have only pairs of unlabeled data for which we know that one is more likely to be positive than the other. We link Pcomp classification to noisy-label learning to develop a progressive URE and improve it by imposing consistency regularization.
arXiv Detail & Related papers (2020-10-05T09:23:58Z)
Active Sampling for Pairwise Comparisons via Approximate Message Passing and Information Gain Maximization [5.771869590520189]
We propose ASAP, an active sampling algorithm based on approximate message passing and expected information gain. We show that ASAP offers the highest accuracy of inferred scores compared to the existing methods.
arXiv Detail & Related papers (2020-04-12T20:48:10Z)

This list is automatically generated from the titles and abstracts of the papers on this site.