Related papers: Gradient-based Label Binning in Multi-label Classification

Gradient-based Label Binning in Multi-label Classification

URL: http://arxiv.org/abs/2106.11690v1
Date: Tue, 22 Jun 2021 11:48:48 GMT
Title: Gradient-based Label Binning in Multi-label Classification
Authors: Michael Rapp, Eneldo Loza Menc\'ia, Johannes F\"urnkranz, Eyke H\"ullermeier
Abstract summary: In multi-label classification, the ability to model dependencies between labels is crucial to effectively optimize non-decomposable evaluation measures. The utilization of second-order derivatives, as used by many recent boosting approaches, helps to guide the minimization of non-decomposable losses. In this work, we address the computational bottleneck of such approach by integrating a novel approximation technique into the boosting procedure.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In multi-label classification, where a single example may be associated with several class labels at the same time, the ability to model dependencies between labels is considered crucial to effectively optimize non-decomposable evaluation measures, such as the Subset 0/1 loss. The gradient boosting framework provides a well-studied foundation for learning models that are specifically tailored to such a loss function and recent research attests the ability to achieve high predictive accuracy in the multi-label setting. The utilization of second-order derivatives, as used by many recent boosting approaches, helps to guide the minimization of non-decomposable losses, due to the information about pairs of labels it incorporates into the optimization process. On the downside, this comes with high computational costs, even if the number of labels is small. In this work, we address the computational bottleneck of such approach -- the need to solve a system of linear equations -- by integrating a novel approximation technique into the boosting procedure. Based on the derivatives computed during training, we dynamically group the labels into a predefined number of bins to impose an upper bound on the dimensionality of the linear system. Our experiments, using an existing rule-based algorithm, suggest that this may boost the speed of training, without any significant loss in predictive performance.

Related papers

Linearly Convergent Mixup Learning [0.0]
We present two novel algorithms that extend to a broader range of binary classification models. Unlike gradient-based approaches, our algorithms do not require hyper parameters like learning rates, simplifying their implementation and optimization. Our algorithms achieve faster convergence to the optimal solution compared to descent gradient approaches, and that mixup data augmentation consistently improves the predictive performance across various loss functions.
arXiv Detail & Related papers (2025-01-14T02:33:40Z)
Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations. Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance. We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
LayerMatch: Do Pseudo-labels Benefit All Layers? [77.59625180366115]
Semi-supervised learning offers a promising solution to mitigate the dependency of labeled data. We develop two layer-specific pseudo-label strategies, termed Grad-ReLU and Avg-Clustering. Our approach consistently demonstrates exceptional performance on standard semi-supervised learning benchmarks.
arXiv Detail & Related papers (2024-06-20T11:25:50Z)
Multi-Label Noise Transition Matrix Estimation with Label Correlations: Theory and Algorithm [73.94839250910977]
Noisy multi-label learning has garnered increasing attention due to the challenges posed by collecting large-scale accurate labels. The introduction of transition matrices can help model multi-label noise and enable the development of statistically consistent algorithms. We propose a novel estimator that leverages label correlations without the need for anchor points or precise fitting of noisy class posteriors.
arXiv Detail & Related papers (2023-09-22T08:35:38Z)
An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel doubly accelerated gradient descent (ADSGD) method for sparsity regularized loss minimization problems. We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
arXiv Detail & Related papers (2022-08-11T22:27:22Z)
Adaptive label thresholding methods for online multi-label classification [4.028101568570768]
Existing online multi-label classification works cannot handle the online label thresholding problem. This paper proposes a novel framework of adaptive label thresholding algorithms for online multi-label classification.
arXiv Detail & Related papers (2021-12-04T10:34:09Z)
Model-Change Active Learning in Graph-Based Semi-Supervised Learning [5.174023161939957]
"Model Change" active learning quantifies the resulting change by introducing the additional label(s) We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution.
arXiv Detail & Related papers (2021-10-14T21:47:10Z)
Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning. We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
Learning Gradient Boosted Multi-label Classification Rules [4.842945656927122]
We propose an algorithm for learning multi-label classification rules that is able to minimize decomposable as well as non-decomposable loss functions. We analyze the abilities and limitations of our approach on synthetic data and evaluate its predictive performance on multi-label benchmarks.
arXiv Detail & Related papers (2020-06-23T21:39:23Z)
Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label. Most existing methods elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data. This paper proposes a novel framework of classifier with flexibility on the model and optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z)

This list is automatically generated from the titles and abstracts of the papers in this site.