Regularized impurity reduction: Accurate decision trees with complexity
guarantees
- URL: http://arxiv.org/abs/2208.10949v1
- Date: Tue, 23 Aug 2022 13:15:19 GMT
- Title: Regularized impurity reduction: Accurate decision trees with complexity
guarantees
- Authors: Guangyi Zhang and Aristides Gionis
- Abstract summary: We propose a tree-induction algorithm that gives a logarithmic approximation guarantee on the tree complexity.
The enhanced algorithms strike an excellent balance between predictive accuracy and tree complexity.
- Score: 20.170305081348328
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision trees are popular classification models, providing high accuracy and
intuitive explanations. However, as the tree size grows the model
interpretability deteriorates. Traditional tree-induction algorithms, such as
C4.5 and CART, rely on impurity-reduction functions that promote the
discriminative power of each split. Thus, although these traditional methods
are accurate in practice, there has been no theoretical guarantee that they
will produce small trees. In this paper, we justify the use of a general family
of impurity functions, including the popular functions of entropy and
Gini-index, in scenarios where small trees are desirable, by showing that a
simple enhancement can equip them with complexity guarantees. We consider a
general setting, where objects to be classified are drawn from an arbitrary
probability distribution, classification can be binary or multi-class, and
splitting tests are associated with non-uniform costs. As a measure of tree
complexity, we adopt the expected cost to classify an object drawn from the
input distribution, which, in the uniform-cost case, is the expected number of
tests. We propose a tree-induction algorithm that gives a logarithmic
approximation guarantee on the tree complexity. This approximation factor is
tight up to a constant factor under mild assumptions. The algorithm recursively
selects a test that maximizes a greedy criterion defined as a weighted sum of
three components. The first two components encourage the selection of tests
that improve the balance and the cost-efficiency of the tree, respectively,
while the third impurity-reduction component encourages the selection of more
discriminative tests. As shown in our empirical evaluation, compared to the
original heuristics, the enhanced algorithms strike an excellent balance
between predictive accuracy and tree complexity.
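To make the criterion concrete, here is a minimal Python sketch of a split-selection rule of this form. The abstract does not spell out the exact balance and cost-efficiency components or the weights, so `balance`, `cost_eff`, the entropy-based gain, and the weights `alpha`, `beta`, `gamma` below are illustrative assumptions rather than the authors' formulation.

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label vector (one impurity function; Gini would also fit)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def greedy_score(labels, left_mask, test_cost, alpha=1.0, beta=1.0, gamma=1.0):
    """Hypothetical weighted sum of balance, cost-efficiency, and impurity reduction."""
    labels = np.asarray(labels)
    n, n_left = len(labels), int(left_mask.sum())
    n_right = n - n_left
    if n_left == 0 or n_right == 0:
        return -np.inf  # degenerate split, never selected
    balance = min(n_left, n_right) / n   # favors splits that send mass to both children
    cost_eff = 1.0 / test_cost           # favors cheap tests
    gain = entropy(labels) \
        - (n_left / n) * entropy(labels[left_mask]) \
        - (n_right / n) * entropy(labels[~left_mask])  # impurity reduction
    return alpha * balance + beta * cost_eff + gamma * gain

def choose_test(X, labels, test_costs):
    """Applied recursively at each node: pick the (feature, threshold) test
    with the highest greedy score."""
    X, labels = np.asarray(X, dtype=float), np.asarray(labels)
    best = (None, None, -np.inf)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:  # candidate thresholds
            score = greedy_score(labels, X[:, j] <= t, test_costs[j])
            if score > best[2]:
                best = (j, t, score)
    return best
```

A full induction procedure would call `choose_test` at each node and recurse on the two induced subsets until a node is pure or no useful test remains.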
Related papers
- Separation and Collapse of Equilibria Inequalities on AND-OR Trees without Shape Constraints [0.0]
We investigate the zero-error randomized complexity, which is the least cost against the worst input, of AND-OR tree computation.
Directional algorithms are known to achieve the randomized complexity.
We show that for any AND-OR tree, randomized depth-first algorithms have the same equilibrium as directional algorithms.
arXiv Detail & Related papers (2024-05-30T15:13:46Z) - Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z) - Why do Random Forests Work? Understanding Tree Ensembles as
Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z) - Multi-rules mining algorithm for combinatorially exploded decision trees
with modified Aitchison-Aitken function-based Bayesian optimization [0.0]
"MAABO-MT" and "GS-MRM" algorithms that strategically construct trees with high estimation performance.
Experiments are conducted using several open datasets to analyze the effectiveness of the proposed method.
arXiv Detail & Related papers (2023-10-04T07:55:51Z) - Permutation Decision Trees [3.089408984959925]
Effort-To-Compress (ETC), a complexity measure, is used for the first time as an alternative impurity measure.
We conduct a performance comparison between Permutation Decision Trees and classical decision trees across various real-world datasets.
arXiv Detail & Related papers (2023-06-05T06:31:14Z) - Optimal randomized classification trees [0.0]
Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning.
CARTs are built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold.
This greedy approach trains trees very fast, but, by its nature, the resulting classification accuracy may not be competitive with other state-of-the-art procedures.
arXiv Detail & Related papers (2021-10-19T11:41:12Z) - A cautionary tale on fitting decision trees to data from additive
models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z) - Robustifying Algorithms of Learning Latent Trees with Vector Variables [92.18777020401484]
We present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG).
We robustify RG, CLRG, Neighbor Joining (NJ) and Spectral NJ (SNJ) by using the truncated inner product.
We derive the first known instance-dependent impossibility result for structure learning of latent trees.
arXiv Detail & Related papers (2021-06-02T01:37:52Z) - SGA: A Robust Algorithm for Partial Recovery of Tree-Structured
Graphical Models with Noisy Samples [75.32013242448151]
We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise.
Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure.
We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
arXiv Detail & Related papers (2021-01-22T01:57:35Z) - Convex Polytope Trees [57.56078843831244]
Convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)