Dive into Decision Trees and Forests: A Theoretical Demonstration
- URL: http://arxiv.org/abs/2101.08656v1
- Date: Wed, 20 Jan 2021 16:47:59 GMT
- Title: Dive into Decision Trees and Forests: A Theoretical Demonstration
- Authors: Jinxiong Zhang
- Abstract summary: Decision trees use the strategy of "divide-and-conquer" to divide a complex problem on the dependency between input features and labels into smaller ones.
Recent advances have greatly improved their performance in computational advertising, recommender system, information retrieval, etc.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Based on decision trees, many fields have arguably made tremendous progress
in recent years. In simple words, decision trees use the strategy of
"divide-and-conquer" to divide the complex problem on the dependency between
input features and labels into smaller ones. While decision trees have a long
history, recent advances have greatly improved their performance in
computational advertising, recommender system, information retrieval, etc. We
introduce common tree-based models (e.g., Bayesian CART, Bayesian regression
splines) and training techniques (e.g., mixed integer programming, alternating
optimization, gradient descent). Along the way, we highlight probabilistic
characteristics of tree-based models and explain their practical and
theoretical benefits. Except machine learning and data mining, we try to show
theoretical advances on tree-based models from other fields such as statistics
and operation research. We list the reproducible resource at the end of each
method.
Related papers
- Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a novel framework that utilizes large language models (LLMs) to identify effective feature generation rules.
We use decision trees to convey this reasoning information, as they can be easily represented in natural language.
OCTree consistently enhances the performance of various prediction models across diverse benchmarks.
arXiv Detail & Related papers (2024-06-12T08:31:34Z) - Learning accurate and interpretable decision trees [27.203303726977616]
We develop approaches to design decision tree learning algorithms given repeated access to data from the same domain.
We study the sample complexity of tuning prior parameters in Bayesian decision tree learning, and extend our results to decision tree regression.
We also study the interpretability of the learned decision trees and introduce a data-driven approach for optimizing the explainability versus accuracy trade-off using decision trees.
arXiv Detail & Related papers (2024-05-24T20:10:10Z) - Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z) - ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained
Visual Categorization [56.37520969273242]
We introduce ViTree, a novel approach for fine-grained visual categorization.
By traversing the tree paths, ViTree effectively selects patches from transformer-processed features to highlight informative local regions.
This patch and path selectivity enhances model interpretability of ViTree, enabling better insights into the model's inner workings.
arXiv Detail & Related papers (2024-01-30T14:32:25Z) - MAPTree: Beating "Optimal" Decision Trees with Bayesian Decision Trees [2.421336072915701]
We present a Bayesian approach to decision tree induction via maximum a posteriori inference of a posterior distribution over trees.
We propose an AND/OR search algorithm, dubbed MAPTree, which is able to recover the maximum a posteriori tree.
arXiv Detail & Related papers (2023-09-26T23:43:37Z) - Bayesian Decision Trees via Tractable Priors and Probabilistic
Context-Free Grammars [7.259767735431625]
We propose a new criterion for training Bayesian Decision Trees.
BCART-PCFG can efficiently sample decision trees from a posterior distribution across trees given the data.
We find that trees sampled via BCART-PCFG perform comparable to or better than greedily-constructed Decision Trees.
arXiv Detail & Related papers (2023-02-15T00:17:41Z) - Optimal Sparse Regression Trees [24.03491277969824]
This work proposes a dynamic-programming-with-bounds approach to the construction of provably-optimal sparse regression trees.
We leverage a novel lower bound based on an optimal solution to the k-Means clustering algorithm in 1-dimension over the set of labels.
arXiv Detail & Related papers (2022-11-28T00:43:21Z) - Social Interpretable Tree for Pedestrian Trajectory Prediction [75.81745697967608]
We propose a tree-based method, termed as Social Interpretable Tree (SIT), to address this multi-modal prediction task.
A path in the tree from the root to leaf represents an individual possible future trajectory.
Despite the hand-crafted tree, the experimental results on ETH-UCY and Stanford Drone datasets demonstrate that our method is capable of matching or exceeding the performance of state-of-the-art methods.
arXiv Detail & Related papers (2022-05-26T12:18:44Z) - Growing Deep Forests Efficiently with Soft Routing and Learned
Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z) - MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.