Related papers: Batch Selection for Multi-Label Classification Guided by Uncertainty and Dynamic Label Correlations

URL: http://arxiv.org/abs/2412.16521v1
Date: Sat, 21 Dec 2024 07:49:26 GMT
Title: Batch Selection for Multi-Label Classification Guided by Uncertainty and Dynamic Label Correlations
Authors: Ao Zhou, Bin Liu, Jin Wang, Grigorios Tsoumakas
Abstract summary: We propose an uncertainty-based multi-label batch selection algorithm. It assesses uncertainty for each label by considering differences between successive predictions and the confidence of current outputs. Empirical studies demonstrate the effectiveness of our method in improving the performance.
Score: 9.360376286221943
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The accuracy of deep neural networks is significantly influenced by the effectiveness of mini-batch construction during training. In single-label scenarios, such as binary and multi-class classification tasks, it has been demonstrated that batch selection algorithms preferring samples with higher uncertainty achieve better performance than difficulty-based methods. Although there are two batch selection methods tailored for multi-label data, neither of them leverages important uncertainty information. Adapting the concept of uncertainty to multi-label data is not a trivial task, since there are two issues that should be tackled. First, traditional variance or entropy-based uncertainty measures ignore fluctuations of predictions within sliding windows and the importance of the current model state. Second, existing multi-label methods do not explicitly exploit the label correlations, particularly the uncertainty-based label correlations that evolve during the training process. In this paper, we propose an uncertainty-based multi-label batch selection algorithm. It assesses uncertainty for each label by considering differences between successive predictions and the confidence of current outputs, and further leverages dynamic uncertainty-based label correlations to emphasize instances whose uncertainty is synergistically expressed across multiple labels. Empirical studies demonstrate the effectiveness of our method in improving the performance and accelerating the convergence of various multi-label deep learning models.
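
Illustrative sketch (not the authors' implementation): the abstract names three ingredients, namely fluctuation between successive predictions, confidence of the current outputs, and uncertainty-based label correlations, but gives no formulas. The NumPy code below is one possible reading under stated assumptions. The function names, the additive combination of fluctuation and ambiguity, the cosine-similarity correlation, and the quadratic instance score are all hypothetical choices made here for illustration.

```python
import numpy as np


def label_uncertainty(history):
    """Per-label uncertainty for each instance.

    history: array of shape (T, n, q) holding predicted probabilities for
    n instances and q labels over the last T >= 2 epochs (a sliding window).
    The additive combination below is an assumption, not the paper's formula.
    """
    # Fluctuation: mean absolute difference between successive predictions.
    fluctuation = np.abs(np.diff(history, axis=0)).mean(axis=0)  # (n, q)
    # Ambiguity of the current output: 1 at p = 0.5, 0 at p in {0, 1}.
    ambiguity = 1.0 - np.abs(2.0 * history[-1] - 1.0)  # (n, q)
    return fluctuation + ambiguity


def selection_probabilities(history, eps=1e-12):
    """Turn per-label uncertainty into per-instance sampling probabilities."""
    u = label_uncertainty(history)  # (n, q)
    # Assumed correlation: cosine similarity between label uncertainty columns,
    # recomputed from the current window so it changes as training proceeds.
    norms = np.linalg.norm(u, axis=0) + eps
    corr = (u.T @ u) / np.outer(norms, norms)  # (q, q)
    # Instance score u_i^T C u_i: larger when uncertainty appears jointly
    # on labels whose uncertainties are correlated.
    scores = np.einsum("ij,jk,ik->i", u, corr, u) + eps
    return scores / scores.sum()


def sample_batch(history, batch_size, rng):
    """Draw a mini-batch of distinct instance indices, favoring uncertain ones."""
    probs = selection_probabilities(history)
    return rng.choice(len(probs), size=batch_size, replace=False, p=probs)
```

In a training loop, `history` would be refreshed after every epoch with the model's latest sigmoid outputs, so that both the uncertainty values and the correlation matrix track the current model state.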

Related papers

DiCaP: Distribution-Calibrated Pseudo-labeling for Semi-Supervised Multi-Label Learning [83.94574004953346]
Semi-supervised multi-label learning aims to leverage unlabeled data to improve the model's performance. Most existing methods assign equal weights to all pseudo-labels regardless of their quality. We propose Distribution-Calibrated Pseudo-labeling (DiCaP), a correctness-aware framework that estimates posterior precision to calibrate pseudo-label weights.
arXiv Detail & Related papers (2025-11-25T11:55:02Z)
Rethinking Consistent Multi-Label Classification under Inexact Supervision [60.79309683889278]
In partial multi-label learning, each instance is annotated with a candidate label set, among which only some labels are relevant. In complementary multi-label learning, each instance is annotated with complementary labels indicating the classes to which the instance does not belong.
arXiv Detail & Related papers (2025-10-05T08:30:32Z)
Noise-Resistant Label Reconstruction Feature Selection for Partial Multi-Label Learning [3.635311806373203]
The "curse of dimensionality" is prevalent across various data patterns, which increases the risk of model overfitting and leads to a decline in model classification performance. Existing Partial Multi-label Learning (PML) methods addressing this problem are mainly based on the low-rank assumption. In this paper, a PML feature selection method is proposed that considers two important characteristics of the dataset.
arXiv Detail & Related papers (2025-06-05T06:31:04Z)
Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations. Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance. We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
Dynamic Correlation Learning and Regularization for Multi-Label Confidence Calibration [60.95748658638956]
This paper introduces the Multi-Label Confidence task, aiming to provide well-calibrated confidence scores in multi-label scenarios. Existing single-label calibration methods fail to account for category correlations, which are crucial for addressing semantic confusion. We propose the Dynamic Correlation Learning and Regularization algorithm, which leverages multi-grained semantic correlations to better model semantic confusion.
arXiv Detail & Related papers (2024-07-09T13:26:21Z)
Leveraging Ensemble Diversity for Robust Self-Training in the Presence of Sample Selection Bias [5.698050337128548]
Self-training is a well-known approach for semi-supervised learning. It consists of iteratively assigning pseudo-labels to unlabeled data for which the model is confident and treating them as labeled examples. For neural networks, softmax prediction probabilities are often used as a confidence measure, although they are known to be overconfident, even for wrong predictions. We propose a novel confidence measure, called $\mathcal{T}$-similarity, built upon the prediction diversity of an ensemble of linear classifiers.
arXiv Detail & Related papers (2023-10-23T11:30:06Z)
Multi-Label Noise Transition Matrix Estimation with Label Correlations: Theory and Algorithm [73.94839250910977]
Noisy multi-label learning has garnered increasing attention due to the challenges posed by collecting large-scale accurate labels. The introduction of transition matrices can help model multi-label noise and enable the development of statistically consistent algorithms. We propose a novel estimator that leverages label correlations without the need for anchor points or precise fitting of noisy class posteriors.
arXiv Detail & Related papers (2023-09-22T08:35:38Z)
Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data. This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
Aggregating Soft Labels from Crowd Annotations Improves Uncertainty Estimation Under Distribution Shift [43.69579155156202]
This paper provides the first large-scale empirical study on learning from crowd labels in the out-of-domain setting. We propose to aggregate soft-labels via a simple average in order to achieve consistent performance across tasks.
arXiv Detail & Related papers (2022-12-19T12:40:18Z)
Exploiting Completeness and Uncertainty of Pseudo Labels for Weakly Supervised Video Anomaly Detection [149.23913018423022]
Weakly supervised video anomaly detection aims to identify abnormal events in videos using only video-level labels. Two-stage self-training methods have achieved significant improvements by self-generating pseudo labels. We propose an enhancement framework by exploiting completeness and uncertainty properties for effective self-training.
arXiv Detail & Related papers (2022-12-08T05:53:53Z)
Going Beyond One-Hot Encoding in Classification: Can Human Uncertainty Improve Model Performance? [14.610038284393166]
We show that label uncertainty is explicitly embedded into the training process via distributional labels. The incorporation of label uncertainty helps the model to generalize better to unseen data and increases model performance. Similar to existing calibration methods, the distributional labels lead to better-calibrated probabilities, which in turn yield more certain and trustworthy predictions.
arXiv Detail & Related papers (2022-05-30T17:19:11Z)
Multi-class Probabilistic Bounds for Self-learning [13.875239300089861]
Pseudo-labeling is prone to error and runs the risk of adding noisy labels into unlabeled training data. We present a probabilistic framework for analyzing self-learning in the multi-class classification scenario with partially labeled data.
arXiv Detail & Related papers (2021-09-29T13:57:37Z)
Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator. Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples. We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
Mitigating Class Boundary Label Uncertainty to Reduce Both Model Bias and Variance [4.563176550691304]
We investigate a new approach to handle inaccuracy and uncertainty in the training data labels. Our method can reduce both bias and variance by estimating the pointwise label uncertainty of the training set.
arXiv Detail & Related papers (2020-02-23T18:24:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.