Succinct Explanations With Cascading Decision Trees
- URL: http://arxiv.org/abs/2010.06631v1
- Date: Tue, 13 Oct 2020 18:48:39 GMT
- Title: Succinct Explanations With Cascading Decision Trees
- Authors: Jialu Zhang, Mark Santolucito and Ruzica Piskac
- Abstract summary: We propose a new decision tree model that we call Cascading Decision Trees.
Our key insight is to separate the notion of a decision path from that of an explanation path.
Applying cascading decision trees to new samples yields significantly shorter, more succinct explanations.
- Score: 5.877164140116815
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Classic decision tree learning is a binary classification algorithm that
constructs models with first-class transparency: every classification has a
directly derivable explanation. However, learning decision trees on modern
datasets generates large trees, which in turn generate decision paths of
excessive depth, obscuring the explanation of classifications. To improve the
comprehensibility of classifications, we propose a new decision tree model that
we call Cascading Decision Trees. Cascading Decision Trees shorten the size of
explanations of classifications, without sacrificing model performance overall.
Our key insight is to separate the notion of a decision path from that of an
explanation path. Utilizing this insight, instead of having one monolithic
decision tree,
we build several smaller decision subtrees and cascade them in sequence. Our
cascading decision subtrees are designed to specifically target explanations
for positive classifications. This way each subtree identifies the smallest set
of features that can classify as many positive samples as possible, without
misclassifying any negative samples. Applying cascading decision trees to new
samples results in a significantly shorter and more succinct explanation when
one of the subtrees detects a positive classification: in that case, we
immediately stop and report only the decision path of the current subtree to
the user as the explanation for the classification. We evaluate our algorithm
on standard datasets as well as new real-world applications, and find that our
model
shortens the explanation depth by over 40.8% for positive classifications
compared to the classic decision tree model.
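The cascading procedure described in the abstract can be made concrete with a short sketch. The following is a minimal illustration of the idea, not the authors' implementation; the class name CascadingTrees, the use of scikit-learn CART subtrees, binary labels in {0, 1} with 1 as the positive class, the fixed number of stages, and the plain decision tree used as a fallback are all illustrative assumptions.

```python
# Minimal sketch of cascading decision trees as described in the abstract
# (illustrative only, not the authors' implementation).
import numpy as np
from sklearn.tree import DecisionTreeClassifier


class CascadingTrees:
    def __init__(self, n_subtrees=3, max_depth=3):
        self.n_subtrees = n_subtrees
        self.max_depth = max_depth
        self.stages = []        # (subtree, ids of leaves holding only positives)
        self.fallback = None    # handles samples rejected by every subtree

    def fit(self, X, y):
        X_rem, y_rem = np.asarray(X), np.asarray(y)
        for _ in range(self.n_subtrees):
            if len(np.unique(y_rem)) < 2:
                break
            sub = DecisionTreeClassifier(max_depth=self.max_depth,
                                         random_state=0).fit(X_rem, y_rem)
            leaves = sub.apply(X_rem)
            # keep only leaves with no negative training samples: predicting
            # "positive" there cannot misclassify a known negative
            pure_pos = {l for l in np.unique(leaves)
                        if np.all(y_rem[leaves == l] == 1)}
            self.stages.append((sub, pure_pos))
            # positives already explained by this subtree are removed, so the
            # next subtree focuses on the remaining, harder positives
            covered = np.isin(leaves, list(pure_pos)) & (y_rem == 1)
            X_rem, y_rem = X_rem[~covered], y_rem[~covered]
        # samples no subtree accepts fall through to an ordinary decision tree
        self.fallback = DecisionTreeClassifier(random_state=0).fit(X_rem, y_rem)
        return self

    def predict_explain(self, x):
        """Return (label, explanation); the explanation is the decision path
        of the first subtree that accepts x, which is typically short."""
        x = np.asarray(x).reshape(1, -1)
        for i, (sub, pure_pos) in enumerate(self.stages):
            if sub.apply(x)[0] in pure_pos:
                return 1, ("subtree", i, sub.decision_path(x))
        return int(self.fallback.predict(x)[0]), ("fallback", self.fallback.decision_path(x))
```

On a new sample, predict_explain reports the decision path of only the first subtree whose pure-positive leaf accepts the sample; that is the sense in which the explanation path is shorter than the decision path of one monolithic tree. Only samples rejected by every subtree fall through to the potentially deeper fallback tree.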
Related papers
- Decision Trees for Interpretable Clusters in Mixture Models and Deep Representations [5.65604054654671]
We introduce the notion of an explainability-to-noise ratio for mixture models.
We propose an algorithm that takes as input a mixture model and constructs a suitable tree in data-independent time.
We prove upper and lower bounds on the error rate of the resulting decision tree.
arXiv Detail & Related papers (2024-11-03T14:00:20Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- Optimal randomized classification trees [0.0]
Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning.
CARTs are built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold.
This greedy approach trains trees very fast, but, by its nature, their classification accuracy may not be competitive against other state-of-the-art procedures.
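To make the greedy procedure concrete, here is a textbook sketch of how CART selects a single split with the Gini criterion; this is standard background rather than the randomized formulation proposed in that paper, and the function and variable names are illustrative.

```python
# Textbook greedy CART split selection with Gini impurity (illustrative sketch).
import numpy as np


def gini(y):
    """Gini impurity of an integer label vector."""
    if len(y) == 0:
        return 0.0
    p = np.bincount(y) / len(y)
    return 1.0 - np.sum(p ** 2)


def best_split(X, y):
    """Greedy CART step: pick the (feature, threshold) pair that minimizes the
    weighted impurity of the two children, ignoring all deeper splits."""
    n, d = X.shape
    best_feature, best_threshold, best_score = None, None, np.inf
    for j in range(d):
        for t in np.unique(X[:, j])[:-1]:   # the largest value cannot separate samples
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best_feature, best_threshold, best_score = j, t, score
    return best_feature, best_threshold
```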
arXiv Detail & Related papers (2021-10-19T11:41:12Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- Tree in Tree: from Decision Trees to Decision Graphs [2.2336243882030025]
Tree in Tree decision graph (TnT) is a framework that extends the conventional decision tree to a more generic and powerful directed acyclic graph.
Our proposed model is a novel, more efficient, and accurate alternative to widely used decision trees.
arXiv Detail & Related papers (2021-10-01T13:20:05Z)
- Making CNNs Interpretable by Building Dynamic Sequential Decision Forests with Top-down Hierarchy Learning [62.82046926149371]
We propose a generic model transfer scheme to make Convolutional Neural Networks (CNNs) interpretable.
We achieve this by building a differentiable decision forest on top of CNNs.
We name the transferred model deep Dynamic Sequential Decision Forest (dDSDF).
arXiv Detail & Related papers (2021-06-05T07:41:18Z)
- Robustifying Algorithms of Learning Latent Trees with Vector Variables [92.18777020401484]
We present the sample complexities of Recursive Grouping (RG) and Chow-Liu Recursive Grouping (CLRG).
We robustify RG, CLRG, Neighbor Joining (NJ) and Spectral NJ (SNJ) by using the truncated inner product.
We derive the first known instance-dependent impossibility result for structure learning of latent trees.
arXiv Detail & Related papers (2021-06-02T01:37:52Z)
- Rectified Decision Trees: Exploring the Landscape of Interpretable and Effective Machine Learning [66.01622034708319]
We propose a knowledge distillation-based extension of decision trees, dubbed rectified decision trees (ReDT).
We extend the splitting criteria and the ending condition of the standard decision trees, which allows training with soft labels.
We then train the ReDT based on the soft label distilled from a well-trained teacher model through a novel jackknife-based method.
arXiv Detail & Related papers (2020-08-21T10:45:25Z)
- Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes.
We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution.
Our method, termed Forest R-CNN, can serve as a plug-and-play module applied to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- The Max-Cut Decision Tree: Improving on the Accuracy and Running Time of Decision Trees [0.0]
The Max-Cut decision tree introduces novel modifications to a standard baseline model of classification decision tree construction, namely CART with the Gini criterion.
Our experiments show that this node-based localized PCA can dramatically improve classification, while also significantly decreasing computational time compared to the baseline decision tree.
Results are most significant on datasets with higher dimensionality or more classes; on the CIFAR-100 dataset, for example, the method achieves a 49% improvement in accuracy while reducing CPU time by 94%.
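One plausible reading of the "node-based localized PCA" idea is an oblique split computed per node: project the node's samples onto their leading local principal component and threshold that one-dimensional projection. The sketch below is only that illustrative reading, not the paper's Max-Cut construction; the function names and the choice of Gini impurity are assumptions.

```python
# Illustrative node-local PCA (oblique) split; not the paper's Max-Cut algorithm.
import numpy as np
from sklearn.decomposition import PCA


def _gini(y):
    """Gini impurity of an integer label vector."""
    p = np.bincount(y) / len(y)
    return 1.0 - np.sum(p ** 2)


def pca_split(X_node, y_node):
    """Oblique split at one node: project the node's samples onto their leading
    local principal component, then choose the best threshold on that 1-D axis."""
    direction = PCA(n_components=1).fit(X_node).components_[0]
    proj = X_node @ direction
    best_t, best_score = None, np.inf
    for t in np.unique(proj)[:-1]:          # the largest value cannot separate samples
        left, right = y_node[proj <= t], y_node[proj > t]
        score = (len(left) * _gini(left) + len(right) * _gini(right)) / len(y_node)
        if score < best_score:
            best_t, best_score = t, score
    return direction, best_t
```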
arXiv Detail & Related papers (2020-06-25T00:47:21Z)