Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification
- URL: http://arxiv.org/abs/2207.09237v2
- Date: Sat, 30 Mar 2024 11:55:26 GMT
- Title: Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification
- Authors: Jurica Levatić, Michelangelo Ceci, Dragi Kocev, Sašo Džeroski,
- Abstract summary: We propose a hierarchical multi-label classification method based on semi-supervised learning of predictive clustering trees.
We also extend the method towards ensemble learning and propose a method based on the random forest approach.
- Score: 2.706328351174805
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Semi-supervised learning (SSL) is a common approach to learning predictive models using not only labeled examples, but also unlabeled examples. While SSL for the simple tasks of classification and regression has received a lot of attention from the research community, this is not properly investigated for complex prediction tasks with structurally dependent variables. This is the case of multi-label classification and hierarchical multi-label classification tasks, which may require additional information, possibly coming from the underlying distribution in the descriptive space provided by unlabeled examples, to better face the challenging task of predicting simultaneously multiple class labels. In this paper, we investigate this aspect and propose a (hierarchical) multi-label classification method based on semi-supervised learning of predictive clustering trees. We also extend the method towards ensemble learning and propose a method based on the random forest approach. Extensive experimental evaluation conducted on 23 datasets shows significant advantages of the proposed method and its extension with respect to their supervised counterparts. Moreover, the method preserves interpretability and reduces the time complexity of classical tree-based models.
Related papers
- Liberating Seen Classes: Boosting Few-Shot and Zero-Shot Text Classification via Anchor Generation and Classification Reframing [38.84431954053434]
Few-shot and zero-shot text classification aim to recognize samples from novel classes with limited labeled samples or no labeled samples at all.
We propose a simple and effective strategy for few-shot and zero-shot text classification.
arXiv Detail & Related papers (2024-05-06T15:38:32Z) - Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
arXiv Detail & Related papers (2024-04-26T06:00:27Z) - Classification Tree-based Active Learning: A Wrapper Approach [4.706932040794696]
This paper proposes a wrapper active learning method for classification, organizing the sampling process into a tree structure.
A classification tree constructed on an initial set of labeled samples is considered to decompose the space into low-entropy regions.
This adaptation proves to be a significant enhancement over existing active learning methods.
arXiv Detail & Related papers (2024-04-15T17:27:00Z) - Parametric Classification for Generalized Category Discovery: A Baseline
Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z) - An Additive Instance-Wise Approach to Multi-class Model Interpretation [53.87578024052922]
Interpretable machine learning offers insights into what factors drive a certain prediction of a black-box system.
Existing methods mainly focus on selecting explanatory input features, which follow either locally additive or instance-wise approaches.
This work exploits the strengths of both methods and proposes a global framework for learning local explanations simultaneously for multiple target classes.
arXiv Detail & Related papers (2022-07-07T06:50:27Z) - Evolving Multi-Label Fuzzy Classifier [5.53329677986653]
Multi-label classification has attracted much attention in the machine learning community to address the problem of assigning single samples to more than one class at the same time.
We propose an evolving multi-label fuzzy classifier (EFC-ML) which is able to self-adapt and self-evolve its structure with new incoming multi-label samples in an incremental, single-pass manner.
arXiv Detail & Related papers (2022-03-29T08:01:03Z) - A Top-down Supervised Learning Approach to Hierarchical Multi-label
Classification in Networks [0.21485350418225244]
This paper presents a general prediction model to hierarchical multi-label classification (HMC), where the attributes to be inferred can be specified as a strict poset.
It is based on a top-down classification approach that addresses hierarchical multi-label classification with supervised learning by building a local classifier per class.
The proposed model is showcased with a case study on the prediction of gene functions for Oryza sativa Japonica, a variety of rice.
arXiv Detail & Related papers (2022-03-23T17:29:17Z) - Self-Training: A Survey [5.772546394254112]
Semi-supervised algorithms aim to learn prediction functions from a small set of labeled observations and a large set of unlabeled observations.
Among the existing techniques, self-training methods have undoubtedly attracted greater attention in recent years.
We present self-training methods for binary and multi-class classification; as well as their variants and two related approaches.
arXiv Detail & Related papers (2022-02-24T11:40:44Z) - Learning Debiased and Disentangled Representations for Semantic
Segmentation [52.35766945827972]
We propose a model-agnostic and training scheme for semantic segmentation.
By randomly eliminating certain class information in each training iteration, we effectively reduce feature dependencies among classes.
Models trained with our approach demonstrate strong results on multiple semantic segmentation benchmarks.
arXiv Detail & Related papers (2021-10-31T16:15:09Z) - Intersection Regularization for Extracting Semantic Attributes [72.53481390411173]
We consider the problem of supervised classification, such that the features that the network extracts match an unseen set of semantic attributes.
For example, when learning to classify images of birds into species, we would like to observe the emergence of features that zoologists use to classify birds.
We propose training a neural network with discrete top-level activations, which is followed by a multi-layered perceptron (MLP) and a parallel decision tree.
arXiv Detail & Related papers (2021-03-22T14:32:44Z) - Binary Classification from Multiple Unlabeled Datasets via Surrogate Set
Classification [94.55805516167369]
We propose a new approach for binary classification from m U-sets for $mge2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC)
arXiv Detail & Related papers (2021-02-01T07:36:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.