Don't Waste a Single Annotation: Improving Single-Label Classifiers
Through Soft Labels
- URL: http://arxiv.org/abs/2311.05265v1
- Date: Thu, 9 Nov 2023 10:47:39 GMT
- Title: Don't Waste a Single Annotation: Improving Single-Label Classifiers
Through Soft Labels
- Authors: Ben Wu, Yue Li, Yida Mu, Carolina Scarton, Kalina Bontcheva and Xingyi
Song
- Abstract summary: We address the limitations of the common data annotation and training methods for objective single-label classification tasks.
Our findings indicate that additional annotator information, such as confidence, secondary label and disagreement, can be used to effectively generate soft labels.
- Score: 7.396461226948109
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we address the limitations of the common data annotation and
training methods for objective single-label classification tasks. Typically,
when annotating such tasks annotators are only asked to provide a single label
for each sample and annotator disagreement is discarded when a final hard label
is decided through majority voting. We challenge this traditional approach,
acknowledging that determining the appropriate label can be difficult due to
the ambiguity and lack of context in the data samples. Rather than discarding
the information from such ambiguous annotations, our soft label method makes
use of them for training. Our findings indicate that additional annotator
information, such as confidence, secondary label and disagreement, can be used
to effectively generate soft labels. Training classifiers with these soft
labels then leads to improved performance and calibration on the hard label
test set.
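The abstract does not spell out how confidence, secondary labels and disagreement are combined, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes each annotator's confidence is a number in [0, 1] placed on their primary label, that the remaining mass goes to the secondary label when one is given (otherwise it is spread uniformly over the other classes), and that disagreement is captured by averaging the annotators' distributions. All function and field names are hypothetical.

```python
import math


def annotation_to_soft_label(primary, confidence, num_classes, secondary=None):
    """Turn one annotator's answer into a probability vector.

    Assumption (not from the paper): `confidence` is the mass on the primary
    label; the remainder goes to the secondary label if provided, otherwise
    it is shared equally among all other classes.
    """
    if num_classes < 2:
        raise ValueError("need at least two classes")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    probs = [0.0] * num_classes
    probs[primary] = confidence
    rest = 1.0 - confidence
    if secondary is not None and secondary != primary:
        probs[secondary] += rest
    else:
        for c in range(num_classes):
            if c != primary:
                probs[c] += rest / (num_classes - 1)
    return probs


def aggregate_soft_labels(annotations, num_classes):
    """Average per-annotator vectors so disagreement survives in the target."""
    vectors = [
        annotation_to_soft_label(num_classes=num_classes, **a) for a in annotations
    ]
    return [sum(v[c] for v in vectors) / len(vectors) for c in range(num_classes)]


def soft_cross_entropy(predicted, target, eps=1e-12):
    """Cross-entropy between a predicted distribution and a soft target."""
    return -sum(t * math.log(p + eps) for p, t in zip(predicted, target))


# Two annotators disagree on a 3-class sample.
annotations = [
    {"primary": 0, "confidence": 0.8, "secondary": 1},
    {"primary": 1, "confidence": 0.6, "secondary": None},
]
soft = aggregate_soft_labels(annotations, num_classes=3)
print([round(p, 3) for p in soft])  # [0.5, 0.4, 0.1]
```

With majority voting this sample would be a tie or an arbitrary hard label; the soft target instead keeps both annotators' views, and `soft_cross_entropy` can be used in place of the usual one-hot loss during training while evaluation still uses hard labels.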
Related papers
- Determined Multi-Label Learning via Similarity-Based Prompt [12.428779617221366]
In multi-label classification, each training instance is associated with multiple class labels simultaneously.
To alleviate this problem, a novel labeling setting termed Determined Multi-Label Learning (DMLL) is proposed.
arXiv Detail & Related papers (2024-03-25T07:08:01Z)
- Robust Assignment of Labels for Active Learning with Sparse and Noisy Annotations [0.17188280334580192]
Supervised classification algorithms are used to solve a growing number of real-life problems around the globe.
Unfortunately, acquiring good-quality annotations for many tasks is infeasible or too expensive to be done in practice.
We propose two novel annotation unification algorithms that utilize unlabeled parts of the sample space.
arXiv Detail & Related papers (2023-07-25T19:40:41Z)
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
- Partial-Label Regression [54.74984751371617]
Partial-label learning is a weakly supervised learning setting that allows each training example to be annotated with a set of candidate labels.
Previous studies on partial-label learning only focused on the classification setting where candidate labels are all discrete.
In this paper, we provide the first attempt to investigate partial-label regression, where each training example is annotated with a set of real-valued candidate labels.
arXiv Detail & Related papers (2023-06-15T09:02:24Z)
- Imprecise Label Learning: A Unified Framework for Learning with Various Imprecise Label Configurations [91.67511167969934]
Imprecise label learning (ILL) is a framework for the unification of learning with various imprecise label configurations.
We demonstrate that ILL can seamlessly adapt to partial label learning, semi-supervised learning, noisy label learning, and, more importantly, a mixture of these settings.
arXiv Detail & Related papers (2023-05-22T04:50:28Z)
- ScarceNet: Animal Pose Estimation with Scarce Annotations [74.48263583706712]
ScarceNet is a pseudo label-based approach to generate artificial labels for the unlabeled images.
We evaluate our approach on the challenging AP-10K dataset, where our approach outperforms existing semi-supervised approaches by a large margin.
arXiv Detail & Related papers (2023-03-27T09:15:53Z)
- Learning from Stochastic Labels [8.178975818137937]
Annotating multi-class instances is a crucial task in the field of machine learning.
In this paper, we propose a novel approach suited to learning from stochastic labels.
arXiv Detail & Related papers (2023-02-01T08:04:27Z)
- Acknowledging the Unknown for Multi-label Learning with Single Positive Labels [65.5889334964149]
Traditionally, all unannotated labels are assumed to be negative labels in single positive multi-label learning (SPML).
We propose entropy-maximization (EM) loss to maximize the entropy of predicted probabilities for all unannotated labels.
Considering the positive-negative label imbalance of unannotated labels, we propose asymmetric pseudo-labeling (APL) with asymmetric-tolerance strategies and a self-paced procedure to provide more precise supervision.
arXiv Detail & Related papers (2022-03-30T11:43:59Z)
- Learning to Purify Noisy Labels via Meta Soft Label Corrector [49.92310583232323]
Recent deep neural networks (DNNs) can easily overfit to biased training data with noisy labels.
A label correction strategy is commonly used to alleviate this issue.
We propose a meta-learning model that can estimate soft labels through a meta-gradient descent step.
arXiv Detail & Related papers (2020-08-03T03:25:17Z)
- Limitations of weak labels for embedding and tagging [0.0]
Many datasets and approaches in ambient sound analysis use weakly labeled data. Weak labels are employed because annotating every data sample with a strong label is too expensive. Yet, their impact on the performance in comparison to strong labels remains unclear.
In this paper, we formulate a supervised learning problem which involves weak labels. We create a dataset that focuses on the difference between strong and weak labels as opposed to other challenges.
arXiv Detail & Related papers (2020-02-05T08:54:08Z)