Permutation Decision Trees
- URL: http://arxiv.org/abs/2306.02617v3
- Date: Fri, 31 May 2024 11:30:31 GMT
- Title: Permutation Decision Trees
- Authors: Harikrishnan N B, Arham Jain, Nithin Nagaraj
- Abstract summary: We propose the adoption of Effort-To-Compress (ETC), a complexity measure, for the first time as an alternative impurity measure.
We conduct a performance comparison between Permutation Decision Trees and classical decision trees across various real-world datasets.
- Score: 3.089408984959925
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision Tree is a well-understood Machine Learning model that is based on minimizing impurities in the internal nodes. The most common impurity measures are Shannon entropy and Gini impurity. These impurity measures are insensitive to the order of training data and hence the final tree obtained is invariant to any permutation of the data. This is a limitation in terms of modeling when there are temporal order dependencies between data instances. In this research, we propose the adoption of Effort-To-Compress (ETC) - a complexity measure, for the first time, as an alternative impurity measure. Unlike Shannon entropy and Gini impurity, structural impurity based on ETC is able to capture order dependencies in the data, thus obtaining potentially different decision trees for different permutations of the same data instances, a concept we term Permutation Decision Trees (PDT). We then introduce the notion of Permutation Bagging achieved using permutation decision trees without the need for random feature selection and sub-sampling. We conduct a performance comparison between Permutation Decision Trees and classical decision trees across various real-world datasets, including Appendicitis, Breast Cancer Wisconsin, Diabetes Pima Indian, Ionosphere, Iris, Sonar, and Wine. Our findings reveal that PDT demonstrates comparable performance to classical decision trees across most datasets. Remarkably, in certain instances, PDT even slightly surpasses the performance of classical decision trees. In comparing Permutation Bagging with Random Forest, we attain comparable performance to Random Forest models consisting of 50 to 1000 trees, using merely 21 trees. This highlights the efficiency and effectiveness of Permutation Bagging in achieving comparable performance outcomes with significantly fewer trees.
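To make the proposed impurity concrete, the sketch below computes an ETC-style structural impurity via Non-Sequential Recursive Pair Substitution (NSRPS), the procedure ETC is usually defined through: repeatedly replace the most frequent pair of adjacent symbols with a fresh symbol and count how many passes are needed before the sequence becomes constant. This is a minimal illustration assuming integer-coded class labels; the function names, the pair-counting details, and the normalization are assumptions for exposition, not the authors' reference implementation.

```python
from collections import Counter

def etc(seq):
    """Effort-To-Compress via Non-Sequential Recursive Pair Substitution:
    repeatedly substitute the most frequent adjacent pair with a fresh symbol
    and count the passes until the sequence is constant or has length 1."""
    seq = list(seq)
    steps = 0
    next_symbol = max(seq, default=0) + 1            # assumes integer-coded labels
    while len(seq) > 1 and len(set(seq)) > 1:
        pairs = Counter(zip(seq, seq[1:]))           # adjacent-pair frequencies
        target = pairs.most_common(1)[0][0]          # most frequent pair
        new_seq, i = [], 0
        while i < len(seq):                          # left-to-right, non-overlapping
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == target:
                new_seq.append(next_symbol)
                i += 2
            else:
                new_seq.append(seq[i])
                i += 1
        seq, next_symbol, steps = new_seq, next_symbol + 1, steps + 1
    return steps

def etc_impurity(labels):
    """ETC normalized to [0, 1]; 0 for a pure (constant) label sequence."""
    return 0.0 if len(labels) <= 1 else etc(labels) / (len(labels) - 1)
```

Because the pair statistics depend on the order in which labels appear, two permutations of the same instances can receive different impurity scores and therefore different splits; this order sensitivity is exactly what Shannon entropy and Gini impurity lack.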
Related papers
- Terminating Differentiable Tree Experts [77.2443883991608]
We propose a neuro-symbolic Differentiable Tree Machine that learns tree operations using a combination of transformers and Representation Products.
We first remove the series of different transformer layers used at every step by introducing a mixture of experts.
We additionally propose a new termination algorithm that lets the model automatically choose how many steps to take.
arXiv Detail & Related papers (2024-07-02T08:45:38Z)
- Learning accurate and interpretable decision trees [27.203303726977616]
We develop approaches to design decision tree learning algorithms given repeated access to data from the same domain.
We study the sample complexity of tuning prior parameters in Bayesian decision tree learning, and extend our results to decision tree regression.
We also study the interpretability of the learned decision trees and introduce a data-driven approach for optimizing the explainability versus accuracy trade-off using decision trees.
arXiv Detail & Related papers (2024-05-24T20:10:10Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- Bayesian Decision Trees via Tractable Priors and Probabilistic Context-Free Grammars [7.259767735431625]
We propose a new criterion for training Bayesian Decision Trees.
BCART-PCFG can efficiently sample decision trees from a posterior distribution across trees given the data.
We find that trees sampled via BCART-PCFG perform comparably to or better than greedily constructed Decision Trees.
arXiv Detail & Related papers (2023-02-15T00:17:41Z)
- Regularized impurity reduction: Accurate decision trees with complexity guarantees [20.170305081348328]
We propose a tree-induction algorithm that gives a logarithmic approximation guarantee on the tree complexity.
The enhanced algorithms strike an excellent balance between predictive accuracy and tree complexity.
arXiv Detail & Related papers (2022-08-23T13:15:19Z)
- Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods [10.289846887751079]
We introduce Hierarchical Shrinkage (HS), a post-hoc algorithm that does not modify the tree structure.
HS substantially increases the predictive performance of decision trees, even when used in conjunction with other regularization techniques.
All code and models are released in a full-fledged package available on GitHub.
arXiv Detail & Related papers (2022-02-02T02:43:23Z)
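For context on how the shrinkage operates, the sketch below follows the Hierarchical Shrinkage formulation of Agarwal et al. (2022) as I understand it: a tree's prediction is rewritten telescopically along the root-to-leaf path, and each child-minus-parent increment is damped by 1 / (1 + lambda / n_parent). The `path` representation (a list of (node mean, node sample count) pairs) is an illustrative assumption, not the released package's API.

```python
def hs_predict(path, lam):
    """Hierarchical Shrinkage along one root-to-leaf path.
    `path` lists (node_mean, node_sample_count) pairs from root to leaf;
    each child-minus-parent increment is shrunk by 1 / (1 + lam / parent_count)."""
    pred = path[0][0]                                # start from the root mean
    for (parent_mean, parent_n), (child_mean, _) in zip(path, path[1:]):
        pred += (child_mean - parent_mean) / (1.0 + lam / parent_n)
    return pred

# Toy path: root mean 0.5 over 100 samples, an internal node, then a leaf.
print(hs_predict([(0.5, 100), (0.7, 40), (0.9, 15)], lam=10.0))
```

Note that only the node predictions are rescaled; the splits themselves are untouched, which is why HS is post-hoc and leaves the tree structure intact.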
- Optimal randomized classification trees [0.0]
Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning.
CARTs are built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold.
This greedy approach trains trees very fast, but, by its nature, their classification accuracy may not be competitive against other state-of-the-art procedures.
arXiv Detail & Related papers (2021-10-19T11:41:12Z)
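Since the greedy induction procedure described above is the scaffolding into which any impurity measure is plugged, a minimal sketch of single-node split search may be useful: every feature/threshold pair is scored by the weighted impurity of the two resulting children, and the lowest-scoring pair is kept. Names and the exhaustive threshold enumeration are illustrative simplifications.

```python
import numpy as np

def gini(y):
    """Gini impurity of a 1-D array of class labels."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y, impurity=gini):
    """Greedy single-node split search: try every feature/threshold pair and
    return the one minimizing the weighted impurity of the two children."""
    n, d = X.shape
    best = (None, None, impurity(y))                 # (feature, threshold, score)
    for j in range(d):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best
```

Swapping `impurity=gini` for an order-sensitive measure such as the `etc_impurity` sketched earlier is, in spirit, what a Permutation Decision Tree split would do, though the paper's exact splitting rule may differ.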
- Robustifying Algorithms of Learning Latent Trees with Vector Variables [92.18777020401484]
We present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG).
We robustify RG, CLRG, Neighbor Joining (NJ) and Spectral NJ (SNJ) by using the truncated inner product.
We derive the first known instance-dependent impossibility result for structure learning of latent trees.
arXiv Detail & Related papers (2021-06-02T01:37:52Z)
- SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples [75.32013242448151]
We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise.
Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure.
We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
arXiv Detail & Related papers (2021-01-22T01:57:35Z)
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a. soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to [1], [3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
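As a generic illustration of soft routing (not necessarily this paper's exact parameterization), each internal node emits a probability of going left through a sigmoid of a linear score, and a sample reaches every leaf with probability equal to the product of routing probabilities along its path; the prediction is the probability-weighted sum of leaf values. The breadth-first array layout of a perfect binary tree below is an assumption made for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, nodes, leaves):
    """Soft routing in a perfect binary tree stored breadth-first:
    `nodes` holds (w, b) per internal node, `leaves` one value per leaf."""
    probs = {0: 1.0}                          # probability of reaching each node
    for i, (w, b) in enumerate(nodes):
        p_left = sigmoid(np.dot(w, x) + b)
        probs[2 * i + 1] = probs[i] * p_left
        probs[2 * i + 2] = probs[i] * (1.0 - p_left)
    n = len(nodes)                            # leaves occupy indices n .. 2n
    return sum(probs[n + k] * leaves[k] for k in range(len(leaves)))

# Depth-2 example: 3 internal nodes sharing one hyperplane, 4 leaf values.
nodes = [(np.array([1.0, -1.0]), 0.0)] * 3
leaves = [0.0, 1.0, 1.0, 0.0]
print(soft_tree_predict(np.array([0.5, 0.2]), nodes, leaves))
```

Because every leaf contributes with a smooth weight, the whole tree is differentiable and can be trained end-to-end, which is what distinguishes soft routing from hard binary splits.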
- Convex Polytope Trees [57.56078843831244]
Convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z)
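One plausible reading of the generalized decision boundary is that an internal node routes a sample according to membership in a convex polytope, i.e. whether the sample satisfies every half-space constraint in a small set, with the usual single-hyperplane (or axis-aligned) split as the one-constraint special case. The sketch below illustrates such a node test under that reading; it is not the authors' exact formulation.

```python
import numpy as np

def polytope_route_left(x, W, b):
    """Convex-polytope node test: go left iff x satisfies every half-space
    constraint W[i] . x + b[i] >= 0, i.e. lies inside the polytope.
    A single-row W recovers an ordinary linear (oblique) split."""
    return bool(np.all(W @ x + b >= 0.0))

# Example polytope: the unit box 0 <= x1, x2 <= 1 as four half-spaces.
W = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.array([0.0, 1.0, 0.0, 1.0])
print(polytope_route_left(np.array([0.3, 0.8]), W, b))   # True: inside the box
print(polytope_route_left(np.array([1.5, 0.2]), W, b))   # False: outside
```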