Oblique Predictive Clustering Trees
- URL: http://arxiv.org/abs/2007.13617v2
- Date: Thu, 5 Nov 2020 08:35:43 GMT
- Title: Oblique Predictive Clustering Trees
- Authors: Tomaž Stepišnik and Dragi Kocev
- Abstract summary: Predictive clustering trees (PCTs) can be used to solve a variety of predictive modeling tasks, including structured output prediction.
We propose oblique predictive clustering trees, capable of addressing these limitations.
We experimentally evaluate the proposed methods on 60 benchmark datasets for 6 predictive modeling tasks.
- Score: 6.317966126631351
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predictive clustering trees (PCTs) are a well established generalization of
standard decision trees, which can be used to solve a variety of predictive
modeling tasks, including structured output prediction. Combining them into
ensembles yields state-of-the-art performance. Furthermore, the ensembles of
PCTs can be interpreted by calculating feature importance scores from the
learned models. However, their learning time scales poorly with the
dimensionality of the output space. This is often problematic, especially in
(hierarchical) multi-label classification, where the output can consist of
hundreds of potential labels. Also, learning of PCTs cannot exploit the
sparsity of data to improve the computational efficiency, which is common in
both input (molecular fingerprints, bag of words representations) and output
spaces (in multi-label classification, examples are often labeled with only a
fraction of possible labels). In this paper, we propose oblique predictive
clustering trees, capable of addressing these limitations. We design and
implement two methods for learning oblique splits that contain linear
combinations of features in the tests, hence a split corresponds to an
arbitrary hyperplane in the input space. The methods are efficient for high
dimensional data and capable of exploiting sparse data. We experimentally
evaluate the proposed methods on 60 benchmark datasets for 6 predictive
modeling tasks. The results of the experiments show that oblique predictive
clustering trees achieve performance on par with state-of-the-art methods and
are orders of magnitude faster than standard PCTs. We also show that meaningful
feature importance scores can be extracted from the models learned with the
proposed methods.
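To make the core idea concrete, here is a minimal sketch (illustrative only, not code from the paper) contrasting a standard axis-parallel split with an oblique split. An oblique test checks a linear combination of features against a threshold, i.e. which side of an arbitrary hyperplane w·x = b an example falls on; the weight vector `w` and bias `b` below are assumed values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))  # 8 examples, 3 features

# Axis-parallel split: test a single feature against a threshold.
axis_parallel = X[:, 0] <= 0.5

# Oblique split: test a linear combination of all features, i.e. which
# side of the hyperplane w.x = b each example lies on. The weights and
# bias here are arbitrary placeholders, not learned values.
w = np.array([0.7, -0.2, 0.4])
b = 0.1
oblique = X @ w <= b

print(axis_parallel)
print(oblique)
```

Because the oblique test is a single dot product, it remains cheap for high-dimensional inputs and can use sparse matrix products when `X` is sparse, which is the efficiency argument the abstract makes.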
Related papers
- A Closer Look at Deep Learning on Tabular Data [52.50778536274327]
Tabular data is prevalent across various domains in machine learning.
Deep Neural Network (DNN)-based methods have shown promising performance comparable to tree-based ones.
arXiv Detail & Related papers (2024-07-01T04:24:07Z)
- A Unified Approach to Extract Interpretable Rules from Tree Ensembles via Integer Programming [2.1408617023874443]
Tree ensemble methods are known for their effectiveness in supervised classification and regression tasks.
Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model.
arXiv Detail & Related papers (2024-06-30T22:33:47Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Label Learning Method Based on Tensor Projection [82.51786483693206]
We propose a label learning method based on tensor projection (LLMTP).
We extend the matrix projection transformation to tensor projection, so that the spatial structure information between views can be fully utilized.
In addition, we introduce the tensor Schatten $p$-norm regularization to make the clustering label matrices of different views as consistent as possible.
arXiv Detail & Related papers (2024-02-26T13:03:26Z)
- Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
- Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification [2.706328351174805]
We propose a hierarchical multi-label classification method based on semi-supervised learning of predictive clustering trees.
We also extend the method towards ensemble learning and propose a method based on the random forest approach.
arXiv Detail & Related papers (2022-07-19T12:49:00Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
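As a hedged aside on the regularizer named above: the $l_{2,p}$ norm of a matrix $W$ is commonly defined via $\|W\|_{2,p}^p = \sum_i \|w_i\|_2^p$ over the rows $w_i$, so that small $p$ drives entire rows to zero and thereby selects features. A minimal sketch of that quantity (the matrix and function name are illustrative, not from the paper):

```python
import numpy as np

def l2p_norm_p(W, p):
    """Return ||W||_{2,p}^p = sum_i ||w_i||_2^p over the rows of W.

    Small p (0 < p < 1) penalizes many nonzero rows, which encourages
    row-sparsity and hence feature selection.
    """
    row_norms = np.linalg.norm(W, axis=1)  # l2 norm of each row
    return np.sum(row_norms ** p)

# Toy matrix: an all-zero row contributes nothing to the penalty.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
print(l2p_norm_p(W, 0.5))
print(l2p_norm_p(W, 1.0))
```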
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Measure Inducing Classification and Regression Trees for Functional Data [0.0]
We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis.
This is achieved by learning a weighted functional $L^2$ space by means of constrained convex optimization.
arXiv Detail & Related papers (2020-10-30T18:49:53Z)
- Expectation propagation on the diluted Bayesian classifier [0.0]
We introduce a statistical mechanics inspired strategy that addresses the problem of sparse feature selection in the context of binary classification.
A computational scheme known as expectation propagation (EP) is used to train a continuous-weights perceptron learning a classification rule.
EP is a robust and competitive algorithm in terms of variable selection properties, estimation accuracy and computational complexity.
arXiv Detail & Related papers (2020-09-20T23:59:44Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.