Robust Assignment of Labels for Active Learning with Sparse and Noisy
Annotations
- URL: http://arxiv.org/abs/2307.14380v1
- Date: Tue, 25 Jul 2023 19:40:41 GMT
- Title: Robust Assignment of Labels for Active Learning with Sparse and Noisy
Annotations
- Authors: Daniel Kałuża, Andrzej Janusz, Dominik Ślęzak
- Abstract summary: Supervised classification algorithms are used to solve a growing number of real-life problems around the globe.
Unfortunately, acquiring good-quality annotations for many tasks is infeasible or too expensive to be done in practice.
We propose two novel annotation unification algorithms that utilize unlabeled parts of the sample space.
- Score: 0.17188280334580192
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Supervised classification algorithms are used to solve a growing number of
real-life problems around the globe. Their performance is strictly connected
with the quality of labels used in training. Unfortunately, acquiring
good-quality annotations for many tasks is infeasible or too expensive to be
done in practice. To tackle this challenge, active learning algorithms are
commonly employed to select only the most relevant data for labeling. However,
this is possible only when the quality and quantity of labels acquired from
experts are sufficient. Unfortunately, in many applications, a trade-off
between annotating individual samples by multiple annotators to increase label
quality vs. annotating new samples to increase the total number of labeled
instances is necessary. In this paper, we address the issue of faulty data
annotations in the context of active learning. In particular, we propose two
novel annotation unification algorithms that utilize unlabeled parts of the
sample space. The proposed methods require little to no intersection between
samples annotated by different experts. Our experiments on four public datasets
indicate the robustness and superiority of the proposed methods in both the
estimation of annotators' reliability and the assignment of actual labels,
compared against state-of-the-art algorithms and simple majority voting.
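The abstract contrasts simple majority voting with unification methods that weight annotators by estimated reliability. The toy sketch below (not the paper's algorithm; the annotation matrix, reliability vector, and function names are invented for illustration) shows how the two aggregation rules can disagree on sparse annotations:

```python
import numpy as np

# Hypothetical annotation matrix: rows = samples, columns = annotators;
# -1 marks a missing annotation (sparse labeling), 0/1 are class labels.
A = np.array([
    [1, 1, 0, -1],
    [0, -1, 0, 1],
    [1, 0, -1, -1],
])

def majority_vote(annotations):
    """Per-sample majority over the available (non-missing) labels."""
    labels = []
    for row in annotations:
        votes = row[row >= 0]
        labels.append(int(np.bincount(votes, minlength=2).argmax()))
    return labels

def weighted_vote(annotations, reliability):
    """Weight each annotator's vote by an estimated reliability score."""
    labels = []
    for row in annotations:
        scores = np.zeros(2)
        for j, v in enumerate(row):
            if v >= 0:
                scores[v] += reliability[j]
        labels.append(int(scores.argmax()))
    return labels

# Assumed reliability estimates, e.g. from an EM-style procedure.
reliability = np.array([0.9, 0.6, 0.7, 0.95])
print(majority_vote(A))              # ties and noisy votes resolved naively
print(weighted_vote(A, reliability)) # sample 3 flips toward the reliable annotator
```

On the third sample the two annotations tie, so majority voting falls back to an arbitrary choice, while the reliability-weighted rule follows the more trustworthy annotator.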
Related papers
- Determined Multi-Label Learning via Similarity-Based Prompt [12.428779617221366]
In multi-label classification, each training instance is associated with multiple class labels simultaneously.
To alleviate this problem, a novel labeling setting termed Determined Multi-Label Learning (DMLL) is proposed.
arXiv Detail & Related papers (2024-03-25T07:08:01Z)
- Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z)
- Drawing the Same Bounding Box Twice? Coping Noisy Annotations in Object Detection with Repeated Labels [6.872072177648135]
We propose a novel localization algorithm that adapts well-established ground truth estimation methods.
Our algorithm also shows superior performance during training on the TexBiG dataset.
arXiv Detail & Related papers (2023-09-18T13:08:44Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
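The class-aware idea behind CAP can be illustrated with a toy sketch (assumed for illustration, not the paper's implementation): instead of one global confidence threshold, select a per-class pseudo-labeling budget matched to an estimated class distribution. All numbers and names here are hypothetical:

```python
import numpy as np

# Hypothetical predicted probabilities for 6 unlabeled samples, 3 classes.
probs = np.array([
    [0.9, 0.05, 0.05],
    [0.4, 0.5,  0.1],
    [0.2, 0.7,  0.1],
    [0.3, 0.2,  0.5],
    [0.1, 0.1,  0.8],
    [0.6, 0.3,  0.1],
])

def class_aware_pseudo_labels(probs, class_proportions):
    """Pseudo-label a top-k subset per class, with k set from an estimated
    class distribution, rather than using one global confidence threshold."""
    n = probs.shape[0]
    pseudo = -np.ones(n, dtype=int)   # -1 = left unlabeled
    preds = probs.argmax(axis=1)
    for c, p in enumerate(class_proportions):
        k = int(round(p * n))          # labeling budget for class c
        idx = np.where(preds == c)[0]
        if len(idx) == 0 or k == 0:
            continue
        # Keep the k most confident predictions for class c.
        top = idx[np.argsort(-probs[idx, c])][:k]
        pseudo[top] = c
    return pseudo

# Assumed class proportions; the rarest class gets the smallest budget.
pseudo = class_aware_pseudo_labels(probs, [0.34, 0.33, 0.17])
```

With a single global threshold, confident head-class predictions would crowd out the rare class; the per-class budget keeps the pseudo-labeled set closer to the estimated class distribution.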
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- An Effective Approach for Multi-label Classification with Missing Labels [8.470008570115146]
We propose a pseudo-label based approach to reduce the cost of annotation without bringing additional complexity to the classification networks.
By designing a novel loss function, we are able to relax the requirement that each instance must contain at least one positive label.
We show that our method can handle the imbalance between positive labels and negative labels, while still outperforming existing missing-label learning approaches.
arXiv Detail & Related papers (2022-10-24T23:13:57Z)
- One Positive Label is Sufficient: Single-Positive Multi-Label Learning with Label Enhancement [71.9401831465908]
We investigate single-positive multi-label learning (SPMLL) where each example is annotated with only one relevant label.
A novel method named SMILE, i.e., Single-positive MultI-label learning with Label Enhancement, is proposed.
Experiments on benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-06-01T14:26:30Z)
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performances on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads.
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
- Learning with Proper Partial Labels [87.65718705642819]
Partial-label learning is a kind of weakly-supervised learning with inexact labels.
We show that this proper partial-label learning framework includes many previous partial-label learning settings.
We then derive a unified unbiased estimator of the classification risk.
arXiv Detail & Related papers (2021-12-23T01:37:03Z)
- Learning with Noisy Labels by Targeted Relabeling [52.0329205268734]
Crowdsourcing platforms are often used to collect datasets for training deep neural networks.
We propose an approach which reserves a fraction of annotations to explicitly relabel highly probable labeling errors.
arXiv Detail & Related papers (2021-10-15T20:37:29Z)
- Learning with Different Amounts of Annotation: From Zero to Many Labels [19.869498599986006]
Training NLP systems typically assumes access to annotated data with a single human label per example.
We explore new annotation distribution schemes, assigning multiple labels per example for a small subset of training examples.
Introducing such multi-label examples at the cost of annotating fewer examples brings clear gains on natural language inference and entity typing tasks.
arXiv Detail & Related papers (2021-09-09T16:48:41Z)
- Importance Reweighting for Biquality Learning [0.0]
This paper proposes an original, encompassing view of Weakly Supervised Learning.
It results in the design of generic approaches capable of dealing with any kind of label noise.
We propose a new reweighting scheme capable of identifying non-corrupted examples in the untrusted dataset.
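A minimal sketch of the biquality idea (assumed for illustration, not the paper's scheme): score each untrusted label with a simple model fit on the small trusted set, so that examples whose reported label the trusted model finds implausible receive low weight. The data, model, and function name are all invented:

```python
import numpy as np

rng = np.random.default_rng(0)

# Trusted set: small but clean. Untrusted set: cheap labels, possibly noisy.
X_trusted = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(6, 1, (20, 2))])
y_trusted = np.array([0] * 20 + [1] * 20)

X_untrusted = np.vstack([rng.normal(0, 1, (5, 2)), rng.normal(6, 1, (5, 2))])
y_untrusted = np.array([0] * 5 + [1] * 5)
y_untrusted[0] = 1  # inject one corrupted label

def trust_weights(X_tr, y_tr, X_un, y_un):
    """Weight each untrusted example by the probability that a simple
    class-conditional Gaussian model (fit on trusted data) assigns to its
    reported label; corrupted labels receive weights near zero."""
    classes = np.unique(y_tr)
    means = np.array([X_tr[y_tr == c].mean(axis=0) for c in classes])
    weights = np.empty(len(X_un))
    for i, (x, y) in enumerate(zip(X_un, y_un)):
        d = np.array([np.sum((x - m) ** 2) for m in means])
        p = np.exp(-0.5 * d)     # unnormalized Gaussian likelihoods
        p /= p.sum()
        weights[i] = p[y]        # plausibility of the reported label
    return weights

w = trust_weights(X_trusted, y_trusted, X_untrusted, y_untrusted)
# The corrupted example receives a near-zero weight; clean ones stay near 1.
```

These weights could then scale the per-example loss when training on the untrusted set, which is the general shape of an importance-reweighting scheme.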
arXiv Detail & Related papers (2020-10-19T15:59:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.