Oblique Predictive Clustering Trees
- URL: http://arxiv.org/abs/2007.13617v2
- Date: Thu, 5 Nov 2020 08:35:43 GMT
- Title: Oblique Predictive Clustering Trees
- Authors: Tomaž Stepišnik and Dragi Kocev
- Abstract summary: Predictive clustering trees (PCTs) can be used to solve a variety of predictive modeling tasks, including structured output prediction.
We propose oblique predictive clustering trees, capable of addressing these limitations.
We experimentally evaluate the proposed methods on 60 benchmark datasets for 6 predictive modeling tasks.
- Score: 6.317966126631351
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Predictive clustering trees (PCTs) are a well established generalization of
standard decision trees, which can be used to solve a variety of predictive
modeling tasks, including structured output prediction. Combining them into
ensembles yields state-of-the-art performance. Furthermore, the ensembles of
PCTs can be interpreted by calculating feature importance scores from the
learned models. However, their learning time scales poorly with the
dimensionality of the output space. This is often problematic, especially in
(hierarchical) multi-label classification, where the output can consist of
hundreds of potential labels. Also, learning of PCTs cannot exploit the
sparsity of data to improve the computational efficiency, which is common in
both input (molecular fingerprints, bag of words representations) and output
spaces (in multi-label classification, examples are often labeled with only a
fraction of possible labels). In this paper, we propose oblique predictive
clustering trees, capable of addressing these limitations. We design and
implement two methods for learning oblique splits that contain linear
combinations of features in the tests, hence a split corresponds to an
arbitrary hyperplane in the input space. The methods are efficient for high
dimensional data and capable of exploiting sparse data. We experimentally
evaluate the proposed methods on 60 benchmark datasets for 6 predictive
modeling tasks. The results of the experiments show that oblique predictive
clustering trees achieve performance on par with state-of-the-art methods and
are orders of magnitude faster than standard PCTs. We also show that meaningful
feature importance scores can be extracted from the models learned with the
proposed methods.
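To make the core idea concrete, here is a minimal sketch (illustrative only, not code from the paper) contrasting a standard axis-parallel split with an oblique split. An oblique test checks a linear combination of features against a threshold, i.e. which side of an arbitrary hyperplane w·x = b an example falls on; the weight vector `w` and bias `b` below are assumed values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))  # 8 examples, 3 features

# Axis-parallel split: test a single feature against a threshold.
axis_parallel = X[:, 0] <= 0.5

# Oblique split: test a linear combination of all features, i.e. which
# side of the hyperplane w.x = b each example lies on. The weights and
# bias here are arbitrary placeholders, not learned values.
w = np.array([0.7, -0.2, 0.4])
b = 0.1
oblique = X @ w <= b

print(axis_parallel)
print(oblique)
```

Because the oblique test is a single dot product, it remains cheap for high-dimensional inputs and can use sparse matrix products when `X` is sparse, which is the efficiency argument the abstract makes.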
Related papers
- A Closer Look at Deep Learning on Tabular Data [52.50778536274327]
Tabular data is prevalent across various domains in machine learning.
Deep Neural Network (DNN)-based methods have shown promising performance comparable to tree-based ones.
arXiv Detail & Related papers (2024-07-01T04:24:07Z)
- A Unified Approach to Extract Interpretable Rules from Tree Ensembles via Integer Programming [2.1408617023874443]
Tree ensemble methods are known for their effectiveness in supervised classification and regression tasks.
Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model.
arXiv Detail & Related papers (2024-06-30T22:33:47Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Label Learning Method Based on Tensor Projection [82.51786483693206]
We propose a label learning method based on tensor projection (LLMTP).
We extend the matrix projection transformation to tensor projection, so that the spatial structure information between views can be fully utilized.
In addition, we introduce the tensor Schatten $p$-norm regularization to make the clustering label matrices of different views as consistent as possible.
arXiv Detail & Related papers (2024-02-26T13:03:26Z)
- Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
- Semi-supervised Predictive Clustering Trees for (Hierarchical) Multi-label Classification [2.706328351174805]
We propose a hierarchical multi-label classification method based on semi-supervised learning of predictive clustering trees.
We also extend the method towards ensemble learning and propose a method based on the random forest approach.
arXiv Detail & Related papers (2022-07-19T12:49:00Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
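As a hedged aside on the regularizer named above: the $l_{2,p}$ norm of a matrix $W$ is commonly defined via $\|W\|_{2,p}^p = \sum_i \|w_i\|_2^p$ over the rows $w_i$, so that small $p$ drives entire rows to zero and thereby selects features. A minimal sketch of that quantity (the matrix and function name are illustrative, not from the paper):

```python
import numpy as np

def l2p_norm_p(W, p):
    """Return ||W||_{2,p}^p = sum_i ||w_i||_2^p over the rows of W.

    Small p (0 < p < 1) penalizes many nonzero rows, which encourages
    row-sparsity and hence feature selection.
    """
    row_norms = np.linalg.norm(W, axis=1)  # l2 norm of each row
    return np.sum(row_norms ** p)

# Toy matrix: an all-zero row contributes nothing to the penalty.
W = np.array([[3.0, 4.0],
              [0.0, 0.0],
              [1.0, 0.0]])
print(l2p_norm_p(W, 0.5))
print(l2p_norm_p(W, 1.0))
```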
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
- Measure Inducing Classification and Regression Trees for Functional Data [0.0]
We propose a tree-based algorithm for classification and regression problems in the context of functional data analysis.
This is achieved by learning a weighted functional $L^2$ space by means of constrained convex optimization.
arXiv Detail & Related papers (2020-10-30T18:49:53Z)
- Expectation propagation on the diluted Bayesian classifier [0.0]
We introduce a statistical mechanics inspired strategy that addresses the problem of sparse feature selection in the context of binary classification.
A computational scheme known as expectation propagation (EP) is used to train a continuous-weights perceptron learning a classification rule.
EP is a robust and competitive algorithm in terms of variable selection properties, estimation accuracy and computational complexity.
arXiv Detail & Related papers (2020-09-20T23:59:44Z)
- Structured Graph Learning for Clustering and Semi-supervised Classification [74.35376212789132]
We propose a graph learning framework to preserve both the local and global structure of data.
Our method uses the self-expressiveness of samples to capture the global structure and adaptive neighbor approach to respect the local structure.
Our model is equivalent to a combination of kernel k-means and k-means methods under certain conditions.
arXiv Detail & Related papers (2020-08-31T08:41:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.