Crowd-Certain: Label Aggregation in Crowdsourced and Ensemble Learning
Classification
- URL: http://arxiv.org/abs/2310.16293v1
- Date: Wed, 25 Oct 2023 01:58:37 GMT
- Title: Crowd-Certain: Label Aggregation in Crowdsourced and Ensemble Learning
Classification
- Authors: Mohammad S. Majdi and Jeffrey J. Rodriguez
- Abstract summary: We introduce Crowd-Certain, a novel approach for label aggregation in crowdsourced and ensemble learning classification tasks.
The proposed method uses the consistency of the annotators versus a trained classifier to determine a reliability score for each annotator.
We extensively evaluated our approach against ten existing techniques across ten different datasets, each labeled by varying numbers of annotators.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Crowdsourcing systems have been used to accumulate massive amounts of labeled
data for applications such as computer vision and natural language processing.
However, because crowdsourced labeling is inherently dynamic and uncertain,
developing a technique that can work in most situations is extremely
challenging. In this paper, we introduce Crowd-Certain, a novel approach for
label aggregation in crowdsourced and ensemble learning classification tasks
that offers improved performance and computational efficiency for different
numbers of annotators and a variety of datasets. The proposed method uses the
consistency of the annotators versus a trained classifier to determine a
reliability score for each annotator. Furthermore, Crowd-Certain leverages
predicted probabilities, enabling the reuse of trained classifiers on future
sample data, thereby eliminating the need for recurrent simulation processes
inherent in existing methods. We extensively evaluated our approach against ten
existing techniques across ten different datasets, each labeled by varying
numbers of annotators. The findings demonstrate that Crowd-Certain outperforms
the existing methods (Tao, Sheng, KOS, MACE, MajorityVote, MMSR, Wawa,
Zero-Based Skill, GLAD, and Dawid Skene), in nearly all scenarios, delivering
higher average accuracy, F1 scores, and AUC rates. Additionally, we introduce a
variation of two existing confidence score measurement techniques. Finally, we
evaluate these two confidence score techniques using two evaluation metrics:
Expected Calibration Error (ECE) and Brier Score Loss. Our results show that
Crowd-Certain achieves a lower Brier Score Loss and a lower ECE across the
majority of the examined datasets, suggesting better-calibrated results.
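The abstract describes the method only at a high level. As a rough, hypothetical sketch of the idea (scoring each annotator by agreement with a trained classifier's predicted probabilities, then using those scores to weight the aggregated label), the following Python is our own illustration; the function name, the normalization, and the 0.5 threshold are assumptions, not the authors' implementation:

```python
import numpy as np

def aggregate_labels(annotator_labels, classifier_probs):
    """Reliability-weighted label aggregation (illustrative sketch only).

    annotator_labels: (n_samples, n_annotators) array of 0/1 labels.
    classifier_probs: (n_samples,) predicted probability of class 1 from a
        classifier trained on, e.g., majority-vote labels.
    """
    # Soft agreement between each annotator and the classifier's predicted
    # probabilities; higher average agreement -> more reliable annotator.
    agreement = 1.0 - np.abs(annotator_labels - classifier_probs[:, None])
    reliability = agreement.mean(axis=0)

    # Normalize the reliability scores into aggregation weights.
    weights = reliability / reliability.sum()

    # Weighted soft label per sample, thresholded to a hard label.
    soft_labels = annotator_labels @ weights
    return soft_labels, (soft_labels >= 0.5).astype(int)
```

Because the weights come from a trained classifier rather than from repeated simulation, they can be reused on future samples, which is consistent with the abstract's claim about eliminating recurrent simulation.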
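The two calibration metrics named above have standard definitions that are easy to state in code. A minimal sketch, assuming binary labels and a predicted probability of the positive class (the paper's own variations of the confidence score techniques are not reproduced here):

```python
import numpy as np

def brier_score_loss(probs, labels):
    # Mean squared error between the predicted probability and the 0/1
    # label; lower is better calibrated.
    return np.mean((probs - labels) ** 2)

def expected_calibration_error(probs, labels, n_bins=10):
    # Bin predictions by confidence, then average the gap between the mean
    # predicted probability and the empirical positive rate in each bin,
    # weighted by bin size; lower is better calibrated.
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.digitize(probs, edges[1:-1])  # indices 0 .. n_bins-1
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(labels[mask].mean() - probs[mask].mean())
    return ece
```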
Related papers
- Distribution-Aware Robust Learning from Long-Tailed Data with Noisy Labels [8.14255560923536]
Real-world data often exhibit long-tailed distributions and label noise, significantly degrading generalization performance.
Recent studies have focused on noisy sample selection methods that estimate the centroid of each class based on high-confidence samples within each target class.
We present Distribution-aware Sample Selection and Contrastive Learning (DaSC) to generate enhanced class centroids.
arXiv Detail & Related papers (2024-07-23T19:06:15Z) - Data Quality in Crowdsourcing and Spamming Behavior Detection [2.6481162211614118]
We introduce a systematic method for evaluating data quality and detecting spamming threats via variance decomposition.
A spammer index is proposed to assess overall data consistency, and two metrics are developed to measure crowd workers' credibility.
arXiv Detail & Related papers (2024-04-04T02:21:38Z) - Overcoming Overconfidence for Active Learning [1.2776312584227847]
We present two novel methods to address the problem of overconfidence that arises in the active learning scenario.
The first is an augmentation strategy named Cross-Mix-and-Mix (CMaM), which aims to calibrate the model by expanding the limited training distribution.
The second is a selection strategy named Ranked Margin Sampling (RankedMS), which prevents choosing data that leads to overly confident predictions.
arXiv Detail & Related papers (2023-08-21T09:04:54Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning
with Non-IID Data [78.69828864672978]
A central challenge in training classification models in real-world federated systems is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - Semi-supervised Long-tailed Recognition using Alternate Sampling [95.93760490301395]
The main challenges in long-tailed recognition come from the imbalanced data distribution and sample scarcity in the tail classes.
We propose a new recognition setting, namely semi-supervised long-tailed recognition.
We demonstrate significant accuracy improvements over other competitive methods on two datasets.
arXiv Detail & Related papers (2021-05-01T00:43:38Z) - Improving Calibration for Long-Tailed Recognition [68.32848696795519]
We propose two methods to improve calibration and performance in such scenarios.
For dataset bias due to different samplers, we propose shifted batch normalization.
Our proposed methods set new records on multiple popular long-tailed recognition benchmark datasets.
arXiv Detail & Related papers (2021-04-01T13:55:21Z) - ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for
Semi-supervised Continual Learning [52.831894583501395]
Continual learning usually assumes the incoming data are fully labeled, an assumption that may not hold in real applications.
We propose deep Online Replay with Discriminator Consistency (ORDisCo) to interdependently learn a classifier with a conditional generative adversarial network (GAN).
We show ORDisCo achieves significant performance improvement on various semi-supervised learning benchmark datasets for SSCL.
arXiv Detail & Related papers (2021-01-02T09:04:14Z) - End-to-End Learning from Noisy Crowd to Supervised Machine Learning
Models [6.278267504352446]
We advocate using hybrid intelligence, i.e., combining deep models and human experts, to design an end-to-end learning framework from noisy crowd-sourced data.
We show how label aggregation can benefit from estimating the annotators' confusion matrix to improve the learning process.
We demonstrate the effectiveness of our strategies on several image datasets, using SVM and deep neural networks.
arXiv Detail & Related papers (2020-11-13T09:48:30Z) - Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking
Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns on whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing ranking fairness and algorithm utility in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z) - Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this content (including all information) and is not responsible for any consequences of its use.