Maximising the Utility of Validation Sets for Imbalanced Noisy-label
Meta-learning
- URL: http://arxiv.org/abs/2208.08132v1
- Date: Wed, 17 Aug 2022 08:02:53 GMT
- Title: Maximising the Utility of Validation Sets for Imbalanced Noisy-label
Meta-learning
- Authors: Dung Anh Hoang, Cuong Nguyen, Vasileios Belagiannis, and Gustavo Carneiro
- Abstract summary: We propose a new imbalanced noisy-label meta-learning (INOLML) algorithm that automatically builds a validation set by maximising its utility.
Our method shows significant improvements over previous meta-learning approaches and sets the new state-of-the-art on several benchmarks.
- Score: 16.318123642642075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta-learning is an effective method to handle imbalanced and
noisy-label learning, but it depends on a validation set containing randomly
selected, manually labelled samples with a balanced class distribution. The
random selection, manual labelling and balancing of this validation set is
not only sub-optimal for meta-learning, but also scales poorly with the
number of classes. Hence,
recent meta-learning papers have proposed ad-hoc heuristics to automatically
build and label this validation set, but these heuristics are still sub-optimal
for meta-learning. In this paper, we analyse the meta-learning algorithm and
propose new criteria to characterise the utility of the validation set, based
on: 1) the informativeness of the validation set; 2) the class distribution
balance of the set; and 3) the correctness of the labels of the set.
Furthermore, we propose a new imbalanced noisy-label meta-learning (INOLML)
algorithm that automatically builds a validation set by maximising its utility
using the criteria above. Our method shows significant improvements over
previous meta-learning approaches and sets the new state-of-the-art on several
benchmarks.
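The abstract names the three utility criteria but not the concrete scoring or optimisation used by INOLML. As a rough illustration only, the following Python sketch greedily builds a balanced pseudo-validation set from noisy training data: informativeness is approximated by predictive entropy, label correctness by the model's confidence in the given label, and class balance by a fixed per-class quota. The function name, the multiplicative combination of scores and the quota scheme are assumptions, not the paper's actual method.

```python
import numpy as np

def build_validation_set(probs, noisy_labels, num_classes, per_class):
    """Select a balanced pseudo-validation set from (possibly noisy) training data.

    probs        : (N, C) array of model softmax predictions
    noisy_labels : (N,) array of the given (possibly noisy) labels
    Returns the indices of the selected samples.
    """
    eps = 1e-12
    # Informativeness: predictive entropy (higher = more informative sample).
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    # Label correctness: model confidence in the provided label (small-loss idea).
    correctness = probs[np.arange(len(probs)), noisy_labels]
    # Assumed multiplicative combination of the two criteria; the paper's utility differs.
    utility = entropy * correctness
    selected = []
    for c in range(num_classes):
        # Class balance: a fixed per-class quota of the highest-utility samples.
        idx = np.where(noisy_labels == c)[0]
        ranked = idx[np.argsort(-utility[idx])]
        selected.extend(ranked[:per_class].tolist())
    return np.asarray(selected)
```

In a full pipeline, the returned indices would stand in for the clean, balanced validation set that drives the subsequent meta-learning update.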
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike in semi-supervised learning, in SSMLL one cannot simply select the most probable label as the pseudo-label, because an instance can contain multiple semantics.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- One-bit Supervision for Image Classification: Problem, Solution, and Beyond [114.95815360508395]
This paper presents one-bit supervision, a novel setting of learning with fewer labels, for image classification.
We propose a multi-stage training paradigm and incorporate negative label suppression into an off-the-shelf semi-supervised learning algorithm.
On multiple benchmarks, the learning efficiency of the proposed approach surpasses that of full-bit, semi-supervised supervision.
arXiv Detail & Related papers (2023-11-26T07:39:00Z)
- Meta Objective Guided Disambiguation for Partial Label Learning [44.05801303440139]
We propose a novel framework for partial label learning with meta objective guided disambiguation (MoGD).
MoGD aims to recover the ground-truth label from the candidate label set by solving a meta objective on a small validation set (a generic sketch of this validation-driven pattern appears after this list).
The proposed method can be easily implemented with various deep networks and ordinary SGD.
arXiv Detail & Related papers (2022-08-26T06:48:01Z)
- Label, Verify, Correct: A Simple Few Shot Object Detection Method [93.84801062680786]
We introduce a simple pseudo-labelling method to source high-quality pseudo-annotations from a training set.
We present two novel methods to improve the precision of the pseudo-labelling process.
Our method achieves state-of-the-art or second-best performance compared to existing approaches.
arXiv Detail & Related papers (2021-12-10T18:59:06Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and Webvision datasets, observing that Prototypical obtains substantial improvements over the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in how it selects unlabeled data.
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- Optimizing Black-box Metrics with Iterative Example Weighting [32.682652530189266]
We consider learning to optimize a classification metric defined by a black-box function of the confusion matrix.
Our approach is to adaptively learn example weights on the training dataset such that the resulting weighted objective best approximates the metric on the validation sample.
arXiv Detail & Related papers (2021-02-18T17:19:09Z)
- Towards Model-Agnostic Post-Hoc Adjustment for Balancing Ranking Fairness and Algorithm Utility [54.179859639868646]
Bipartite ranking aims to learn a scoring function that ranks positive individuals higher than negative ones from labeled data.
There have been rising concerns about whether the learned scoring function can cause systematic disparity across different protected groups.
We propose a model post-processing framework for balancing them in the bipartite ranking scenario.
arXiv Detail & Related papers (2020-06-15T10:08:39Z)
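Several of the entries above (e.g., the MoGD and iterative example-weighting papers), like the INOLML setting itself, use a small trusted validation set to drive a meta objective over noisy training data. Below is a minimal, generic PyTorch sketch of the usual one-step look-ahead reweighting pattern; it assumes a single virtual SGD step and per-example weights initialised at zero, and it is not the exact algorithm of any paper listed here (the function and argument names are illustrative).

```python
import torch
import torch.nn.functional as F

def meta_reweight_step(model, x_train, y_train, x_val, y_val, lr=0.1):
    # Per-example weights on the noisy batch, initialised at zero (look-ahead trick).
    weights = torch.zeros(x_train.size(0), device=x_train.device, requires_grad=True)
    # Weighted loss on the (possibly noisy) training batch.
    train_losses = F.cross_entropy(model(x_train), y_train, reduction="none")
    weighted_loss = (weights * train_losses).sum()
    grads = torch.autograd.grad(weighted_loss, list(model.parameters()), create_graph=True)
    # Parameters the model *would* have after one SGD step with these weights.
    names = [n for n, _ in model.named_parameters()]
    fast_params = {n: p - lr * g for n, p, g in zip(names, model.parameters(), grads)}
    # Loss of the look-ahead model on the small, trusted validation batch.
    val_logits = torch.func.functional_call(model, fast_params, (x_val,))
    val_loss = F.cross_entropy(val_logits, y_val)
    # Examples whose up-weighting would reduce the validation loss get weight > 0.
    w_grad = torch.autograd.grad(val_loss, weights)[0]
    w = torch.clamp(-w_grad, min=0)
    return w / (w.sum() + 1e-12)
```

The returned weights can then scale the per-example losses in the actual model update; how well this works depends directly on the quality of the validation batch, which is the motivation for maximising its utility.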
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information (including all content) and is not responsible for any consequences of its use.