The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees
- URL: http://arxiv.org/abs/2006.14118v1
- Date: Thu, 25 Jun 2020 00:47:21 GMT
- Title: The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees
- Authors: Jonathan Bodine and Dorit S. Hochbaum
- Abstract summary: The Max-Cut decision tree introduces novel modifications to a standard baseline model of classification decision tree construction, namely CART Gini.
Our experiments show that this node-based localized PCA can dramatically improve classification while also significantly decreasing computational time compared to the baseline decision tree.
Results are most significant on data sets with higher dimensions or more classes; on the example data set CIFAR-100, the modifications yield a 49% improvement in accuracy while reducing CPU time by 94%.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision trees are a widely used method for classification, both by
themselves and as the building blocks of multiple different ensemble learning
methods. The Max-Cut decision tree involves novel modifications to a standard
baseline model of classification decision tree construction, namely CART Gini.
One modification is an alternative splitting metric, maximum cut, based on
maximizing the distance between all pairs of observations that belong to
separate classes and fall on separate sides of the threshold value. The other
modification is to select the decision feature from a linear combination of the
input features constructed using Principal Component Analysis (PCA) locally at
each node. Our experiments show that this node-based localized PCA, together
with the novel splitting modification, can dramatically improve classification
while also significantly decreasing computational time compared to the baseline
decision tree. Moreover, the results are most significant on data sets with
higher dimensions or more classes; on the example data set CIFAR-100, the
modifications enable a 49% improvement in accuracy while reducing CPU time by
94%. These modifications dramatically advance the capabilities of decision
trees for difficult classification tasks.
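As a rough illustration of the two modifications, the sketch below scores a
candidate threshold with a pairwise max-cut objective and chooses it on the
leading principal component computed locally from the node's own data. The
objective shown (summing distances between differently labeled points that land
on opposite sides of the split) is one plausible reading of the abstract rather
than the paper's exact formulation, and all function names are ours; the paper
also evaluates its metric far more efficiently than this naive O(n^2) loop.

```python
import numpy as np

def max_cut_score(x, y, threshold):
    """Naive sketch of the max-cut splitting metric: sum the distances
    between all pairs of points with different labels that fall on
    opposite sides of the threshold. O(n^2); the paper computes its
    criterion much faster."""
    left = x <= threshold
    score = 0.0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            if y[i] != y[j] and left[i] != left[j]:
                score += abs(x[i] - x[j])
    return score

def local_pca_feature(X):
    """Second modification: project a node's observations onto their
    leading principal component, computed from this node's data only."""
    Xc = X - X.mean(axis=0)                 # center the node-local data
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[0]                       # scores along top component

def best_max_cut_split(X, y):
    """Pick the threshold on the node-local PCA feature that maximizes
    the max-cut score; candidates are midpoints of consecutive values."""
    z = local_pca_feature(X)
    zs = np.unique(z)                       # unique values, sorted
    candidates = (zs[:-1] + zs[1:]) / 2.0
    if candidates.size == 0:                # node is constant: no split
        return None, 0.0
    scores = [max_cut_score(z, y, t) for t in candidates]
    best = int(np.argmax(scores))
    return candidates[best], scores[best]

# Tiny usage example on synthetic two-class data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(3, 1, (20, 5))])
y = np.array([0] * 20 + [1] * 20)
t, s = best_max_cut_split(X, y)
print(f"best threshold {t:.3f} with cut score {s:.1f}")
```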
Related papers
- FoLDTree: A ULDA-Based Decision Tree Framework for Efficient Oblique Splits and Feature Selection
LDATree and FoLDTree integrate Uncorrelated Linear Discriminant Analysis (ULDA) and Forward ULDA into a decision tree structure.
LDATree and FoLDTree consistently outperform axis-orthogonal and other oblique decision tree methods.
arXiv Detail & Related papers (2024-10-30T16:03:51Z)
- Learning accurate and interpretable decision trees
We develop approaches to design decision tree learning algorithms given repeated access to data from the same domain.
We study the sample complexity of tuning prior parameters in Bayesian decision tree learning, and extend our results to decision tree regression.
We also study the interpretability of the learned decision trees and introduce a data-driven approach for optimizing the explainability versus accuracy trade-off using decision trees.
arXiv Detail & Related papers (2024-05-24T20:10:10Z)
- Learning a Decision Tree Algorithm with Transformers
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- bsnsing: A decision tree induction method based on recursive optimal Boolean rule composition
This paper proposes a new mixed-integer programming (MIP) formulation to optimize split rule selection in the decision tree induction process.
It develops an efficient search solver that is able to solve practical instances faster than commercial solvers.
arXiv Detail & Related papers (2022-05-30T17:13:57Z)
- Optimal Decision Diagrams for Classification
We study the training of optimal decision diagrams from a mathematical programming perspective.
We introduce a novel mixed-integer linear programming model for training.
We show how this model can be easily extended for fairness, parsimony, and stability notions.
arXiv Detail & Related papers (2022-05-28T18:31:23Z)
- Optimal randomized classification trees
Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning.
CARTs are built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold.
This greedy approach trains trees very fast, but by its nature the resulting classification accuracy may not be competitive with other state-of-the-art procedures; a minimal sketch of the greedy Gini criterion appears after this list.
arXiv Detail & Related papers (2021-10-19T11:41:12Z)
- Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning
We propose a generic model transfer scheme to make Convolutional Neural Networks (CNNs) interpretable.
We achieve this by building a differentiable decision forest on top of CNNs.
We name the transferred model deep Dynamic Sequential Decision Forest (dDSDF).
arXiv Detail & Related papers (2021-06-05T07:41:18Z)
- Convex Polytope Trees
Convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z)
- Succinct Explanations With Cascading Decision Trees
We propose a new decision tree model that we call Cascading Decision Trees.
Our key insight is to separate the notion of a decision path and an explanation path.
Applying cascading decision trees to new samples results in a significantly shorter and more succinct explanation.
arXiv Detail & Related papers (2020-10-13T18:48:39Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Generalized and Scalable Optimal Sparse Decision Trees
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
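For contrast with the max-cut criterion sketched above, here is a minimal
sketch of the greedy Gini threshold search that CART-style trees use at each
node: the baseline the Max-Cut paper modifies, and the greedy procedure the
optimal randomized classification trees entry describes. Midpoint thresholds
and weighted child impurity follow standard CART conventions; the function
names are ours.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label array."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_gini_split(x, y):
    """Greedy CART-style search on one feature: try each midpoint
    threshold and keep the one minimizing weighted child impurity."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_t, best_imp = None, np.inf
    for i in range(1, len(xs)):
        if xs[i] == xs[i - 1]:
            continue                        # ties admit no threshold here
        t = (xs[i] + xs[i - 1]) / 2.0
        imp = (i * gini(ys[:i]) + (len(ys) - i) * gini(ys[i:])) / len(ys)
        if imp < best_imp:
            best_t, best_imp = t, imp
    return best_t, best_imp
```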