Gradient-based Label Binning in Multi-label Classification
- URL: http://arxiv.org/abs/2106.11690v1
- Date: Tue, 22 Jun 2021 11:48:48 GMT
- Title: Gradient-based Label Binning in Multi-label Classification
- Authors: Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz, Eyke Hüllermeier
- Abstract summary: In multi-label classification, the ability to model dependencies between labels is crucial to effectively optimize non-decomposable evaluation measures.
The utilization of second-order derivatives, as used by many recent boosting approaches, helps to guide the minimization of non-decomposable losses.
In this work, we address the computational bottleneck of such an approach by integrating a novel approximation technique into the boosting procedure.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multi-label classification, where a single example may be associated with
several class labels at the same time, the ability to model dependencies
between labels is considered crucial to effectively optimize non-decomposable
evaluation measures, such as the Subset 0/1 loss. The gradient boosting
framework provides a well-studied foundation for learning models that are
specifically tailored to such a loss function, and recent research attests to the
ability to achieve high predictive accuracy in the multi-label setting. The
utilization of second-order derivatives, as used by many recent boosting
approaches, helps to guide the minimization of non-decomposable losses, due to
the information about pairs of labels it incorporates into the optimization
process. On the downside, this comes with high computational costs, even if the
number of labels is small. In this work, we address the computational
bottleneck of such an approach -- the need to solve a system of linear equations
-- by integrating a novel approximation technique into the boosting procedure.
Based on the derivatives computed during training, we dynamically group the
labels into a predefined number of bins to impose an upper bound on the
dimensionality of the linear system. Our experiments, using an existing
rule-based algorithm, suggest that this may boost the speed of training,
without any significant loss in predictive performance.
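
The binning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how labels are assigned to bins, so the per-label criterion `-g / (diag(H) + l2)`, the equal-width binning, the L2 weight `l2`, and the function name `binned_newton_step` are all assumptions made for illustration. Given a gradient vector and Hessian matrix over K labels, the sketch forces all labels in a bin to share one score, which reduces the K x K linear system to one with at most `num_bins` unknowns.

```python
import numpy as np


def binned_newton_step(g, H, num_bins, l2=1.0):
    """Approximate Newton step with labels grouped into bins (illustrative).

    g: (K,) gradients, H: (K, K) symmetric Hessian, one row/column per label.
    Returns a (K,) vector of scores; labels in the same bin share a score.
    """
    K = g.shape[0]
    # Assumed criterion: the step each label would take if treated on its own.
    crit = -g / (np.diag(H) + l2)
    lo, hi = crit.min(), crit.max()
    if hi == lo:
        bins = np.zeros(K, dtype=int)
    else:
        # Equal-width binning of the criterion (an assumption of this sketch).
        raw = ((crit - lo) / (hi - lo) * num_bins).astype(int)
        bins = np.minimum(raw, num_bins - 1)
    # Drop empty bins and renumber the remaining ones consecutively.
    used, bins = np.unique(bins, return_inverse=True)
    B = used.size
    # A[k, b] = 1 if label k belongs to bin b.
    A = np.zeros((K, B))
    A[np.arange(K), bins] = 1.0
    # Aggregate derivatives per bin; the reduced system is B x B, with B <= num_bins.
    g_b = A.T @ g
    H_b = A.T @ H @ A
    counts = A.sum(axis=0)
    # Minimizes g^T p + 0.5 p^T H p + 0.5 * l2 * ||p||^2 subject to p = A s.
    s_b = np.linalg.solve(H_b + l2 * np.diag(counts), -g_b)
    return s_b[bins]


# Small usage example with a random positive semi-definite Hessian.
rng = np.random.default_rng(0)
M = rng.normal(size=(6, 6))
scores = binned_newton_step(rng.normal(size=6), M @ M.T, num_bins=3)
```

The reduced system is the exact minimizer of the regularized quadratic objective under the constraint that scores are constant within each bin, so the approximation error comes only from the grouping itself.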
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the expensive cost of collecting precise multi-label annotations.
Unlike semi-supervised learning, one cannot select the most probable label as the pseudo-label in SSMLL due to multiple semantics contained in an instance.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- LayerMatch: Do Pseudo-labels Benefit All Layers? [77.59625180366115]
Semi-supervised learning offers a promising solution to mitigate the dependency on labeled data.
We develop two layer-specific pseudo-label strategies, termed Grad-ReLU and Avg-Clustering.
Our approach consistently demonstrates exceptional performance on standard semi-supervised learning benchmarks.
arXiv Detail & Related papers (2024-06-20T11:25:50Z)
- Multi-Label Noise Transition Matrix Estimation with Label Correlations: Theory and Algorithm [73.94839250910977]
Noisy multi-label learning has garnered increasing attention due to the challenges posed by collecting large-scale accurate labels.
The introduction of transition matrices can help model multi-label noise and enable the development of statistically consistent algorithms.
We propose a novel estimator that leverages label correlations without the need for anchor points or precise fitting of noisy class posteriors.
arXiv Detail & Related papers (2023-09-22T08:35:38Z)
- An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel doubly accelerated gradient descent (ADSGD) method for sparsity regularized loss minimization problems.
We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
arXiv Detail & Related papers (2022-08-11T22:27:22Z)
- Adaptive label thresholding methods for online multi-label classification [4.028101568570768]
Existing online multi-label classification works cannot handle the online label thresholding problem.
This paper proposes a novel framework of adaptive label thresholding algorithms for online multi-label classification.
arXiv Detail & Related papers (2021-12-04T10:34:09Z)
- Model-Change Active Learning in Graph-Based Semi-Supervised Learning [5.174023161939957]
"Model Change" active learning quantifies the change that results from introducing the additional label(s).
We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution.
arXiv Detail & Related papers (2021-10-14T21:47:10Z)
- Semi-Supervised Learning with Meta-Gradient [123.26748223837802]
We propose a simple yet effective meta-learning algorithm in semi-supervised learning.
We find that the proposed algorithm performs favorably against state-of-the-art methods.
arXiv Detail & Related papers (2020-07-08T08:48:56Z)
- Learning Gradient Boosted Multi-label Classification Rules [4.842945656927122]
We propose an algorithm for learning multi-label classification rules that is able to minimize decomposable as well as non-decomposable loss functions.
We analyze the abilities and limitations of our approach on synthetic data and evaluate its predictive performance on multi-label benchmarks.
arXiv Detail & Related papers (2020-06-23T21:39:23Z)
- Progressive Identification of True Labels for Partial-Label Learning [112.94467491335611]
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is equipped with a set of candidate labels among which only one is the true label.
Most existing methods are elaborately designed as constrained optimizations that must be solved in specific manners, making their computational complexity a bottleneck for scaling up to big data.
This paper proposes a novel classifier framework that is flexible in both the model and the optimization algorithm.
arXiv Detail & Related papers (2020-02-19T08:35:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.