Learning Gradient Boosted Multi-label Classification Rules
- URL: http://arxiv.org/abs/2006.13346v1
- Date: Tue, 23 Jun 2020 21:39:23 GMT
- Title: Learning Gradient Boosted Multi-label Classification Rules
- Authors: Michael Rapp, Eneldo Loza Mencía, Johannes Fürnkranz, Vu-Linh Nguyen, Eyke Hüllermeier
- Abstract summary: We propose an algorithm for learning multi-label classification rules that is able to minimize decomposable as well as non-decomposable loss functions.
We analyze the abilities and limitations of our approach on synthetic data and evaluate its predictive performance on multi-label benchmarks.
- Score: 4.842945656927122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multi-label classification, where the evaluation of predictions is less
straightforward than in single-label classification, various meaningful, though
different, loss functions have been proposed. Ideally, the learning algorithm
should be customizable towards a specific choice of the performance measure.
Modern implementations of boosting, most prominently gradient boosted decision
trees, appear to be appealing from this point of view. However, they are mostly
limited to single-label classification, and hence not amenable to multi-label
losses unless these are label-wise decomposable. In this work, we develop a
generalization of the gradient boosting framework to multi-output problems and
propose an algorithm for learning multi-label classification rules that is able
to minimize decomposable as well as non-decomposable loss functions. Using the
well-known Hamming loss and subset 0/1 loss as representatives, we analyze the
abilities and limitations of our approach on synthetic data and evaluate its
predictive performance on multi-label benchmarks.
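For concreteness, here is a minimal sketch of the two representative losses named in the abstract, following their standard definitions on binary label matrices; the toy data is illustrative:

```python
import numpy as np

def hamming_loss(y_true, y_pred):
    """Fraction of individual label assignments that are wrong.
    Label-wise decomposable: an independent 0/1 error is averaged
    over every (instance, label) entry."""
    return np.mean(y_true != y_pred)

def subset_zero_one_loss(y_true, y_pred):
    """Fraction of instances whose entire label vector is wrong.
    Non-decomposable: one flipped label makes the whole row an error,
    so the labels cannot be optimized independently."""
    return np.mean(np.any(y_true != y_pred, axis=1))

# Toy example: 3 instances, 4 labels.
y_true = np.array([[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1]])
y_pred = np.array([[1, 0, 1, 0], [0, 1, 1, 0], [1, 1, 0, 1]])
print(hamming_loss(y_true, y_pred))          # 1/12 ~ 0.083
print(subset_zero_one_loss(y_true, y_pred))  # 1/3  ~ 0.333
```

The gap between the two numbers is exactly why a learner tuned for Hamming loss can be a poor choice under subset 0/1 loss, and vice versa.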
Related papers
- Dual-Decoupling Learning and Metric-Adaptive Thresholding for Semi-Supervised Multi-Label Learning [81.83013974171364]
Semi-supervised multi-label learning (SSMLL) is a powerful framework for leveraging unlabeled data to reduce the high cost of collecting precise multi-label annotations.
Unlike in semi-supervised single-label learning, one cannot simply select the most probable label as the pseudo-label in SSMLL, since an instance may contain multiple semantics.
We propose a dual-perspective method to generate high-quality pseudo-labels.
arXiv Detail & Related papers (2024-07-26T09:33:53Z)
- Hierarchical classification at multiple operating points [1.520694326234112]
We present an efficient algorithm to produce operating characteristic curves for any method that assigns a score to every class in the hierarchy.
We propose two novel loss functions and show that a soft variant of the structured hinge loss is able to significantly outperform the flat baseline.
arXiv Detail & Related papers (2022-10-19T23:36:16Z)
- PercentMatch: Percentile-based Dynamic Thresholding for Multi-Label Semi-Supervised Classification [64.39761523935613]
We propose a percentile-based threshold adjusting scheme to dynamically alter the score thresholds of positive and negative pseudo-labels for each class during training (see the sketch after this entry).
We achieve strong performance on the Pascal VOC2007 and MS-COCO datasets when compared to recent SSL methods.
arXiv Detail & Related papers (2022-08-30T01:27:48Z)
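A hedged sketch of the percentile-thresholding idea above. The function names, the fixed percentile, and the single-batch setting are simplifications of mine; the paper adjusts the thresholds dynamically over the course of training:

```python
import numpy as np

def percentile_thresholds(scores, q=90.0):
    # scores: (num_unlabeled, num_classes) predicted class probabilities.
    # One threshold per class, taken at the q-th percentile of that
    # class's scores over the unlabeled pool.
    return np.percentile(scores, q, axis=0)

def assign_pseudo_labels(scores, thresholds):
    # A class becomes a positive pseudo-label when its score clears
    # the class-specific threshold.
    return (scores >= thresholds).astype(int)

rng = np.random.default_rng(0)
scores = rng.random((1000, 5))               # mock outputs, 5 classes
thr = percentile_thresholds(scores, q=90.0)
pseudo = assign_pseudo_labels(scores, thr)
print(pseudo.mean(axis=0))                   # ~10% positives per class
```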
- Learning with Proper Partial Labels [87.65718705642819]
Partial-label learning is a kind of weakly-supervised learning with inexact labels.
We show that this proper partial-label learning framework includes many previous partial-label learning settings.
We then derive a unified unbiased estimator of the classification risk.
arXiv Detail & Related papers (2021-12-23T01:37:03Z)
- Unbiased Loss Functions for Multilabel Classification with Missing Labels [2.1549398927094874]
Missing labels are a ubiquitous phenomenon in extreme multi-label classification (XMC) tasks.
This paper derives the unique unbiased estimators for the different multilabel reductions.
arXiv Detail & Related papers (2021-09-23T10:39:02Z)
- sigmoidF1: A Smooth F1 Score Surrogate Loss for Multilabel Classification [42.37189502220329]
We propose a loss function, sigmoidF1, to account for the complexity of multilabel classification evaluation (see the sketch after this entry).
We show that sigmoidF1 outperforms other loss functions on four datasets and several metrics.
arXiv Detail & Related papers (2021-08-24T08:11:33Z)
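A minimal sketch of the idea behind sigmoidF1: replace the hard thresholding inside F1's confusion-matrix counts with a sigmoid so the score becomes differentiable. Treat the exact parameterization of the slope/offset hyperparameters `beta` and `eta` here as an assumption rather than the paper's precise formulation:

```python
import numpy as np

def sigmoid_f1_loss(logits, y_true, beta=1.0, eta=0.0):
    # Soft step function: S(u) = 1 / (1 + exp(-beta * (u + eta))).
    s = 1.0 / (1.0 + np.exp(-beta * (logits + eta)))
    # Soft confusion-matrix counts over all (instance, label) entries.
    tp = np.sum(s * y_true)
    fp = np.sum(s * (1.0 - y_true))
    fn = np.sum((1.0 - s) * y_true)
    soft_f1 = 2.0 * tp / (2.0 * tp + fn + fp + 1e-12)
    # Minimizing 1 - soft F1 maximizes the smoothed F1 score.
    return 1.0 - soft_f1

y = np.array([[1, 0, 1], [0, 1, 0]])
z = np.array([[2.0, -1.5, 0.5], [-0.5, 1.0, -2.0]])
print(sigmoid_f1_loss(z, y))
```

Because every operation is smooth, such a loss can be dropped into any gradient-based trainer.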
- Gradient-based Label Binning in Multi-label Classification [0.0]
In multi-label classification, the ability to model dependencies between labels is crucial to effectively optimize non-decomposable evaluation measures.
Second-order derivatives, as used by many recent boosting approaches, help to guide the minimization of non-decomposable losses.
In this work, we address the computational bottleneck of such approaches by integrating a novel approximation technique into the boosting procedure (see the sketch after this entry).
arXiv Detail & Related papers (2021-06-22T11:48:48Z)
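To make the bottleneck concrete, here is a sketch (not the paper's binning method itself) of the second-order update behind such boosting approaches: a non-decomposable loss couples the labels through a dense Hessian, so each update solves a linear system, whereas a decomposable loss reduces to cheap per-label divisions:

```python
import numpy as np

def newton_update_coupled(grad, hess, l2=1.0):
    # grad: (n_labels,) summed first derivatives of the loss.
    # hess: (n_labels, n_labels) summed second derivatives.
    # Non-decomposable losses couple the labels, so the joint update
    # solves a dense system: O(n_labels^3) per rule. Label binning
    # approximates this by grouping labels with similar derivatives.
    return np.linalg.solve(hess + l2 * np.eye(len(grad)), -grad)

def newton_update_decomposable(grad, hess_diag, l2=1.0):
    # With a label-wise decomposable loss the Hessian is diagonal and
    # every label decouples: O(n_labels) per rule.
    return -grad / (hess_diag + l2)
```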
- Minimax Active Learning [61.729667575374606]
Active learning aims to develop label-efficient algorithms by querying the most representative samples to be labeled by a human annotator.
Current active learning techniques either rely on model uncertainty to select the most uncertain samples or use clustering or reconstruction to choose the most diverse set of unlabeled examples.
We develop a semi-supervised minimax entropy-based active learning algorithm that leverages both uncertainty and diversity in an adversarial manner.
arXiv Detail & Related papers (2020-12-18T19:03:40Z)
- Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View [82.80085730891126]
We provide the first precise high-dimensional analysis of linear multiclass classification.
Our analysis reveals that the classification accuracy is highly distribution-dependent.
The insights gained may pave the way for a precise understanding of other classification algorithms.
arXiv Detail & Related papers (2020-11-16T05:17:29Z)
- Classification with Rejection Based on Cost-sensitive Classification [83.50402803131412]
We propose a novel method of classification with rejection by ensemble learning (the classical rejection rule such methods build on is sketched after this entry).
Experimental results demonstrate the usefulness of our proposed approach in clean, noisy, and positive-unlabeled classification.
arXiv Detail & Related papers (2020-10-22T14:05:05Z)
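For context only, a sketch of the classical confidence-based rejection rule (Chow's rule) that cost-sensitive approaches aim to realize without explicit posterior estimates; the cost value and the -1 reject marker are illustrative, not the paper's method:

```python
import numpy as np

def classify_with_rejection(posteriors, reject_cost=0.3):
    # Chow's rule: with misclassification cost 1 and rejection cost c,
    # the optimal decision rejects whenever max_y P(y | x) < 1 - c.
    conf = posteriors.max(axis=1)
    preds = posteriors.argmax(axis=1)
    return np.where(conf >= 1.0 - reject_cost, preds, -1)  # -1 = reject

p = np.array([[0.90, 0.10], [0.55, 0.45]])
print(classify_with_rejection(p))  # [0 -1]: the uncertain case is rejected
```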