ActiveLab: Active Learning with Re-Labeling by Multiple Annotators
- URL: http://arxiv.org/abs/2301.11856v1
- Date: Fri, 27 Jan 2023 17:00:11 GMT
- Title: ActiveLab: Active Learning with Re-Labeling by Multiple Annotators
- Authors: Hui Wen Goh, Jonas Mueller
- Abstract summary: ActiveLab is a method to decide what to label next in batch active learning.
It automatically estimates when it is more informative to re-label examples vs. labeling entirely new ones.
It reliably trains more accurate classifiers with far fewer annotations than a wide variety of popular active learning methods.
- Score: 19.84626033109009
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In real-world data labeling applications, annotators often provide imperfect
labels. It is thus common to employ multiple annotators to label data with some
overlap between their examples. We study active learning in such settings,
aiming to train an accurate classifier by collecting a dataset with the fewest
total annotations. Here we propose ActiveLab, a practical method to decide what
to label next that works with any classifier model and can be used in
pool-based batch active learning with one or multiple annotators. ActiveLab
automatically estimates when it is more informative to re-label examples vs.
labeling entirely new ones. This is a key aspect of producing high quality
labels and trained models within a limited annotation budget. In experiments on
image and tabular data, ActiveLab reliably trains more accurate classifiers
with far fewer annotations than a wide variety of popular active learning
methods.
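To make the idea above concrete, here is a minimal sketch of the relabel-vs-label-new decision: labeled examples are scored by combining annotator agreement with model confidence, unlabeled examples by model confidence alone, and the lowest-scoring examples from either pool are sent to annotators. This is an illustrative heuristic under assumed inputs, not the authors' exact estimator; the function names and the simple averaged score are assumptions made for this sketch.

```python
import numpy as np

def active_learning_scores(pred_probs_labeled, labels_multi, pred_probs_unlabeled):
    """Lower score = more informative to annotate next (illustrative heuristic).

    pred_probs_labeled:   (n_labeled, n_classes) model predicted probabilities
    labels_multi:         per-example lists of class labels from the annotators
    pred_probs_unlabeled: (n_unlabeled, n_classes) predicted probabilities
    """
    n_classes = pred_probs_labeled.shape[1]
    scores_labeled = np.empty(len(labels_multi))
    for i, annot_labels in enumerate(labels_multi):
        counts = np.bincount(annot_labels, minlength=n_classes)
        majority = counts.argmax()
        agreement = counts[majority] / counts.sum()   # annotator consensus strength
        model_conf = pred_probs_labeled[i, majority]  # model's trust in the consensus
        scores_labeled[i] = (agreement + model_conf) / 2
    scores_unlabeled = pred_probs_unlabeled.max(axis=1)  # plain model confidence
    return scores_labeled, scores_unlabeled

def select_batch(scores_labeled, scores_unlabeled, batch_size):
    # Rank both pools together, so a dubious already-labeled example can be
    # chosen for re-labeling ahead of an entirely new unlabeled one.
    pooled = np.concatenate([scores_labeled, scores_unlabeled])
    picked = np.argsort(pooled)[:batch_size]
    relabel = picked[picked < len(scores_labeled)]
    label_new = picked[picked >= len(scores_labeled)] - len(scores_labeled)
    return relabel, label_new
```

The paper's actual scoring is more refined, for instance in how it weighs individual annotators against the classifier; this sketch only shows the shape of the decision.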
Related papers
- Federated Learning with Only Positive Labels by Exploring Label Correlations [78.59613150221597]
Federated learning aims to collaboratively learn a model by using the data from multiple users under privacy constraints.
In this paper, we study the multi-label classification problem under the federated learning setting.
We propose a novel and generic method termed Federated Averaging by exploring Label Correlations (FedALC).
arXiv Detail & Related papers (2024-04-24T02:22:50Z)
- Robust Assignment of Labels for Active Learning with Sparse and Noisy Annotations [0.17188280334580192]
Supervised classification algorithms are used to solve a growing number of real-life problems around the globe.
Unfortunately, acquiring good-quality annotations for many tasks is infeasible or too expensive to be done in practice.
We propose two novel annotation unification algorithms that utilize unlabeled parts of the sample space.
arXiv Detail & Related papers (2023-07-25T19:40:41Z)
- Exploiting Diversity of Unlabeled Data for Label-Efficient Semi-Supervised Active Learning [57.436224561482966]
Active learning is a research area that addresses the issues of expensive labeling by selecting the most important samples for labeling.
We introduce a new diversity-based initial dataset selection algorithm to select the most informative set of samples for initial labeling in the active learning setting.
Also, we propose a novel active learning query strategy, which uses diversity-based sampling on consistency-based embeddings.
arXiv Detail & Related papers (2022-07-25T16:11:55Z)
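One common way to realize the diversity-based initial selection described in the entry above is a k-center greedy pass over unlabeled embeddings: repeatedly pick the point farthest from everything selected so far. The sketch below shows that standard heuristic only as a stand-in; the paper's own selection algorithm is not detailed here.

```python
import numpy as np

def k_center_greedy(embeddings, k, seed=0):
    """Select k diverse points from an (n, d) embedding matrix."""
    rng = np.random.default_rng(seed)
    n = len(embeddings)
    selected = [int(rng.integers(n))]  # start from a random point
    # Distance of every point to its nearest selected center so far.
    dists = np.linalg.norm(embeddings - embeddings[selected[0]], axis=1)
    while len(selected) < k:
        nxt = int(dists.argmax())      # farthest point = most novel
        selected.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(embeddings - embeddings[nxt], axis=1))
    return selected
```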
- Learning from Multiple Annotator Noisy Labels via Sample-wise Label Fusion [17.427778867371153]
In some real-world applications, accurate labeling might not be feasible. Instead, several annotators provide multiple noisy labels for each data sample.
arXiv Detail & Related papers (2022-07-22T20:38:20Z)
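A simple baseline for the sample-wise fusion setting above is to alternate between estimating a fused label per sample and a reliability weight per annotator, in the spirit of Dawid-Skene style EM. The sketch below is that hypothetical baseline, not the method the paper proposes; the data layout and the agreement-based weight update are assumptions.

```python
import numpy as np

def fuse_labels(annotations, n_classes, n_iters=10):
    """annotations: dict {sample_id: {annotator_id: label}} -> fused labels, weights."""
    annotators = {a for labs in annotations.values() for a in labs}
    weight = {a: 1.0 for a in annotators}  # start with equal trust
    fused = {}
    for _ in range(n_iters):
        # E-step: fused label = reliability-weighted vote per sample.
        for s, labs in annotations.items():
            votes = np.zeros(n_classes)
            for a, y in labs.items():
                votes[y] += weight[a]
            fused[s] = int(votes.argmax())
        # M-step: reliability = how often each annotator matches the fused label.
        for a in annotators:
            seen = [(fused[s], labs[a]) for s, labs in annotations.items() if a in labs]
            weight[a] = sum(f == y for f, y in seen) / max(len(seen), 1)
    return fused, weight

# Example: the annotators disagree on sample "b"; the learned
# reliability weights resolve the disagreement on later iterations.
fused, weights = fuse_labels(
    {"a": {"ann1": 0, "ann2": 0}, "b": {"ann1": 1, "ann2": 0}}, n_classes=2
)
```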
- Trustable Co-label Learning from Multiple Noisy Annotators [68.59187658490804]
Supervised deep learning depends on massive accurately annotated examples.
A typical alternative is learning from multiple noisy annotators.
This paper proposes a data-efficient approach, called Trustable Co-label Learning (TCL).
arXiv Detail & Related papers (2022-03-08T16:57:00Z)
- A new data augmentation method for intent classification enhancement and its application on spoken conversation datasets [23.495743195811375]
We present the Nearest Neighbors Scores Improvement (NNSI) algorithm for automatic data selection and labeling.
The NNSI reduces the need for manual labeling by automatically selecting highly-ambiguous samples and labeling them with high accuracy.
We demonstrated the use of NNSI on two large-scale, real-life voice conversation systems.
arXiv Detail & Related papers (2022-02-21T11:36:19Z)
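One plausible reading of the "highly-ambiguous samples" criterion in the NNSI entry above is local label disagreement: a sample is ambiguous when many of its nearest neighbors are predicted differently from the sample itself. The sketch below implements that hypothetical reading with scikit-learn; NNSI's actual scoring is more involved than this.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def ambiguity_scores(embeddings, pred_labels, k=10):
    """Fraction of each sample's k nearest neighbors whose predicted label
    disagrees with the sample's own prediction (higher = more ambiguous)."""
    pred_labels = np.asarray(pred_labels)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(embeddings)
    _, idx = nn.kneighbors(embeddings)        # idx[:, 0] is the point itself
    neighbor_labels = pred_labels[idx[:, 1:]]
    return (neighbor_labels != pred_labels[:, None]).mean(axis=1)
```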
- Active Learning in Incomplete Label Multiple Instance Multiple Label Learning [17.5720245903743]
We propose a novel bag-class pair based approach for active learning in the MIML setting.
Our approach is based on a discriminative graphical model with efficient and exact inference.
arXiv Detail & Related papers (2021-07-22T17:01:28Z)
- SLADE: A Self-Training Framework For Distance Metric Learning [75.54078592084217]
We present a self-training framework, SLADE, to improve retrieval performance by leveraging additional unlabeled data.
We first train a teacher model on the labeled data and use it to generate pseudo labels for the unlabeled data.
We then train a student model on both labels and pseudo labels to generate final feature embeddings.
arXiv Detail & Related papers (2020-11-20T08:26:10Z)
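The SLADE entry above follows the standard teacher-student self-training recipe. The sketch below shows that generic recipe for a plain classifier; SLADE itself targets distance metric learning and feature embeddings, so this is a deliberate simplification, and the confidence threshold is an assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_labeled, y_labeled, X_unlabeled, conf_threshold=0.9):
    """Train a teacher on labeled data, pseudo-label confident unlabeled
    points, then train a student on the combined set."""
    teacher = LogisticRegression(max_iter=1000).fit(X_labeled, y_labeled)
    probs = teacher.predict_proba(X_unlabeled)
    keep = probs.max(axis=1) >= conf_threshold   # keep only confident pseudo labels
    X_all = np.vstack([X_labeled, X_unlabeled[keep]])
    y_all = np.concatenate([y_labeled, probs[keep].argmax(axis=1)])
    return LogisticRegression(max_iter=1000).fit(X_all, y_all)
```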
- Few-shot Learning for Multi-label Intent Detection [59.66787898744991]
State-of-the-art work estimates label-instance relevance scores and uses a threshold to select multiple associated intent labels.
Experiments on two datasets show that the proposed model significantly outperforms strong baselines in both one-shot and five-shot settings.
arXiv Detail & Related papers (2020-10-11T14:42:18Z)
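The thresholding step the few-shot intent entry above describes is easy to state precisely; the sketch below applies it, with an assumed fallback to the single top-scoring label when nothing clears the threshold.

```python
import numpy as np

def select_intents(relevance_scores, threshold=0.5):
    """Predict every intent label whose relevance score clears the threshold;
    fall back to the best single label when no score does."""
    scores = np.asarray(relevance_scores)
    picked = np.flatnonzero(scores >= threshold)
    if picked.size == 0:
        picked = np.array([scores.argmax()])
    return picked
```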
- Interaction Matching for Long-Tail Multi-Label Classification [57.262792333593644]
We present an elegant and effective approach for addressing limitations in existing multi-label classification models.
By performing soft n-gram interaction matching, we match labels with natural language descriptions.
arXiv Detail & Related papers (2020-05-18T15:27:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.