Random Planted Forest: a directly interpretable tree ensemble
- URL: http://arxiv.org/abs/2012.14563v3
- Date: Thu, 3 Aug 2023 15:27:10 GMT
- Title: Random Planted Forest: a directly interpretable tree ensemble
- Authors: Munir Hiabu, Enno Mammen, Joseph T. Meyer
- Abstract summary: We introduce a novel interpretable tree-based algorithm for prediction in a regression setting.
In a simulation study we find encouraging prediction and visualisation properties of our random planted forest method.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel interpretable tree-based algorithm for prediction in a
regression setting. Our motivation is to estimate the unknown regression
function from a functional decomposition perspective in which the functional
components correspond to lower order interaction terms. The idea is to modify
the random forest algorithm by keeping certain leaves after they are split
instead of deleting them. This leads to non-binary trees which we refer to as
planted trees. An extension to a forest leads to our random planted forest
algorithm. Additionally, the maximum number of covariates which can interact
within a leaf can be bounded. If we set this interaction bound to one, the
resulting estimator is a sum of one-dimensional functions. In the other extreme
case, if we do not set a limit, the resulting estimator and corresponding model
place no restrictions on the form of the regression function. In a simulation
study we find encouraging prediction and visualisation properties of our random
planted forest method. We also develop theory for an idealized version of
random planted forests in cases where the interaction bound is low. We show
that if it is smaller than three, the idealized version achieves asymptotically
optimal convergence rates up to a logarithmic factor. Code is available on
GitHub https://github.com/PlantedML/randomPlantedForest.
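The core mechanism described above, retaining leaves after they are split and capping how many covariates may interact within a leaf, can be illustrated compactly. Below is a minimal, hypothetical Python sketch of a single planted tree; it is not the authors' implementation (see the GitHub repository above), and the greedy gain criterion, random thresholds, and names such as plant_tree are our own simplifications.
```python
import numpy as np

def plant_tree(X, y, n_splits=25, interaction_bound=1, seed=0):
    """Toy planted tree: leaf regions are KEPT after being split, so the
    tree is non-binary; `interaction_bound` caps how many distinct
    coordinates a leaf's region may constrain."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    inside = lambda lo, hi: np.all((X >= lo) & (X < hi), axis=1)
    leaves = [(np.full(d, -np.inf), np.full(d, np.inf), frozenset())]
    residual = y.astype(float).copy()
    boxes = []                                  # (lower, upper, value) updates
    for _ in range(n_splits):
        best = None
        for lo, hi, coords in leaves:
            mask = inside(lo, hi)
            for j in range(d):
                if j not in coords and len(coords) >= interaction_bound:
                    continue                    # split would exceed the bound
                vals = np.unique(X[mask, j])
                if len(vals) < 2:
                    continue
                t = rng.choice(vals[1:])        # one random candidate threshold
                lm = mask & (X[:, j] < t)
                rm = mask & (X[:, j] >= t)
                gain = (lm.sum() * residual[lm].mean() ** 2
                        + rm.sum() * residual[rm].mean() ** 2)
                if best is None or gain > best[0]:
                    best = (gain, lo, hi, coords, j, t)
        if best is None:
            break
        _, lo, hi, coords, j, t = best
        for a, b in ((lo[j], t), (t, hi[j])):   # add both children; the parent
            clo, chi = lo.copy(), hi.copy()     # leaf is retained ("planted")
            clo[j], chi[j] = a, b
            m = inside(clo, chi)
            v = residual[m].mean()              # fit each child on the residual
            residual[m] -= v
            leaves.append((clo, chi, coords | {j}))
            boxes.append((clo, chi, v))
    return boxes

def predict(boxes, Xnew):
    """The fitted function is the sum of box values over boxes containing x."""
    f = np.zeros(len(Xnew))
    for lo, hi, v in boxes:
        f += v * np.all((Xnew >= lo) & (Xnew < hi), axis=1)
    return f
```
With interaction_bound=1 every box constrains a single coordinate, so predict returns a sum of one-dimensional step functions, matching the additive case described in the abstract; a forest version would average such fits over randomized trees.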
Related papers
- Free Lunch in the Forest: Functionally-Identical Pruning of Boosted Tree Ensembles [45.962492329047215]
We introduce a method to prune a tree ensemble into a reduced version that is "functionally identical" to the original model.
We formalize the problem of functionally identical pruning on ensembles, introduce an exact optimization model, and provide a fast yet highly effective method to prune large ensembles.
arXiv Detail & Related papers (2024-08-28T23:15:46Z)
- Adaptive Split Balancing for Optimal Random Forest [8.916614661563893]
We propose a new random forest algorithm that constructs the trees using a novel adaptive split-balancing method.
Our method achieves optimality in simple, smooth scenarios while adaptively learning the tree structure from the data.
arXiv Detail & Related papers (2024-02-17T09:10:40Z)
- Inference with Mondrian Random Forests [6.97762648094816]
We give precise bias and variance characterizations, along with a Berry-Esseen-type central limit theorem, for the Mondrian random forest regression estimator.
We present valid statistical inference methods for the unknown regression function.
Efficient and implementable algorithms are devised for both batch and online learning settings.
arXiv Detail & Related papers (2023-10-15T01:41:42Z)
- SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search [68.66904039405871]
We introduce SoftTreeMax, a generalization of softmax that takes planning into account.
We show for the first time the role of a tree expansion policy in mitigating the variance of the policy gradient.
Our differentiable tree-based policy leverages all gradients at the tree leaves in each environment step instead of the traditional single-sample-based gradient (a toy sketch follows this entry).
arXiv Detail & Related papers (2023-01-30T19:03:14Z)
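As a loose illustration of the SoftTreeMax entry above: the sketch below turns cumulative rewards gathered over an exhaustive tree expansion into softmax logits, so every leaf contributes to each action's probability rather than a single sampled rollout. The deterministic step model, the log-sum-exp aggregation, and all names are our assumptions, not the paper's exact C-/E-SoftTreeMax definitions.
```python
import numpy as np
from itertools import product

def softtreemax_policy(s, step, n_actions, depth=2, beta=1.0):
    """Logits for action a aggregate rewards over ALL depth-`depth`
    action sequences that begin with a; `step(s, a) -> (s', r)` is an
    assumed deterministic environment model."""
    logits = np.empty(n_actions)
    for a0 in range(n_actions):
        leaf_scores = []
        for tail in product(range(n_actions), repeat=depth - 1):
            s_cur, total = s, 0.0
            for a in (a0, *tail):
                s_cur, r = step(s_cur, a)
                total += r
            leaf_scores.append(beta * total)
        logits[a0] = np.logaddexp.reduce(leaf_scores)  # soft-max over leaves
    p = np.exp(logits - logits.max())                  # numerically stable softmax
    return p / p.sum()

# Tiny usage with a made-up chain MDP: state is an integer, moving right pays.
step = lambda s, a: (s + (1 if a == 1 else -1), float(a))
print(softtreemax_policy(0, step, n_actions=2, depth=3))
```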
- An Approximation Method for Fitted Random Forests [0.0]
We study methods that approximate each fitted tree in the Random Forests model using the multinomial allocation of the data points to the leaves.
Specifically, we begin by studying whether fitting a multinomial logistic regression helps reduce the size while preserving the prediction quality (a toy sketch follows this entry).
arXiv Detail & Related papers (2022-07-05T17:28:52Z)
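A hedged sketch of the approximation idea in the entry above: each fitted tree's partition is replaced by a multinomial logistic model that predicts which leaf a point falls into, and predictions become probability-weighted leaf means. The dataset, model sizes, and this exact compression recipe are our assumptions rather than the paper's procedure.
```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LogisticRegression

X, y = make_regression(n_samples=500, n_features=5, noise=1.0, random_state=0)
rf = RandomForestRegressor(n_estimators=10, max_leaf_nodes=8,
                           random_state=0).fit(X, y)

approx = []
for tree in rf.estimators_:
    leaf_id = tree.apply(X)                    # leaf index of each training point
    clf = LogisticRegression(max_iter=2000).fit(X, leaf_id)
    # mean training response per leaf, aligned with clf.classes_
    leaf_mean = np.array([y[leaf_id == c].mean() for c in clf.classes_])
    approx.append((clf, leaf_mean))

def predict_approx(Xnew):
    """Average over trees of probability-weighted leaf means."""
    return np.mean([clf.predict_proba(Xnew) @ mu for clf, mu in approx], axis=0)

print(predict_approx(X[:3]), rf.predict(X[:3]))
```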
- Explaining random forest prediction through diverse rulesets [0.0]
Local Tree eXtractor (LTreeX) is able to explain the forest prediction for a given test instance with a few diverse rules.
We show that our proposed approach substantially outperforms other explainable methods in terms of predictive performance.
arXiv Detail & Related papers (2022-03-29T12:54:57Z)
- Generalised Boosted Forests [0.9899017174990579]
We start with an MLE-type estimate in the link space and then define generalised residuals from it.
We use these residuals and corresponding weights to fit a base random forest and then repeat the procedure to obtain a boost random forest.
We show with simulated and real data that both random forest steps reduce the test-set log-likelihood, which we treat as our primary metric (a toy sketch follows this entry).
arXiv Detail & Related papers (2021-02-24T21:17:31Z)
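The two-stage procedure in the Generalised Boosted Forests entry above, sketched for a Poisson response with log link; the working-residual formula (y - mu) / mu and the use of scikit-learn forests are our assumptions, not the paper's exact definition of generalised residuals and weights.
```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 3))
y = rng.poisson(np.exp(0.5 + X[:, 0] - 0.7 * X[:, 1]))

eta = np.full(len(y), np.log(y.mean()))   # MLE-type estimate in the link space
forests = []
for _ in range(2):                        # base forest, then the boost forest
    mu = np.exp(eta)
    resid = (y - mu) / mu                 # assumed generalised (working) residuals
    rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, resid)
    forests.append(rf)
    eta = eta + rf.predict(X)             # additive update in the link space

pred_mean = np.exp(eta)                   # fitted Poisson mean on training data
```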
- Growing Deep Forests Efficiently with Soft Routing and Learned Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, known as soft routing, rather than hard binary decisions (a toy sketch follows this entry).
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve performance better than or comparable to that of [1], [3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z)
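A toy version of the soft routing described in the entry above: each internal node routes a sample right with a sigmoid probability, so the prediction blends all leaf values and is differentiable in the node parameters. The heap-ordered layout and parameter shapes are illustrative choices of ours, not the paper's architecture.
```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_predict(x, W, b, leaf_values):
    """Depth-D binary tree, internal nodes in heap order (children of n are
    2n+1, 2n+2); node n routes right with probability sigmoid(W[n]@x + b[n]).
    The output is the probability-weighted average of all 2**D leaf values."""
    depth = int(np.log2(len(leaf_values)))
    out = 0.0
    for leaf in range(len(leaf_values)):
        node, p = 0, 1.0
        for bit in format(leaf, f"0{depth}b"):  # root-to-leaf path bits
            p_right = sigmoid(W[node] @ x + b[node])
            p *= p_right if bit == "1" else 1.0 - p_right
            node = 2 * node + 1 + int(bit)      # heap-style child index
        out += p * leaf_values[leaf]
    return out

# Depth-2 tree: 3 internal nodes, 4 leaves.
W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]])
b = np.zeros(3)
print(soft_tree_predict(np.array([0.5, -1.0]), W, b, np.array([0., 1., 2., 3.])))
```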
- An Efficient Adversarial Attack for Tree Ensembles [91.05779257472675]
We study adversarial attacks on tree-based ensembles such as gradient boosted decision trees (GBDTs) and random forests (RFs).
We show that our method can be thousands of times faster than the previous mixed-integer linear programming (MILP) based approach.
Our code is available at https://github.com/chong-z/tree-ensemble-attack.
arXiv Detail & Related papers (2020-10-22T10:59:49Z)
- Convex Polytope Trees [57.56078843831244]
Convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary (a toy sketch follows this entry).
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z)
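A minimal sketch of the generalized decision boundary named in the Convex Polytope Trees entry above: a node routes a sample by testing membership in an intersection of halfspaces, which collapses to an ordinary axis-aligned split when a single one-hot constraint is used. The function name and example constraints are ours.
```python
import numpy as np

def polytope_route_left(x, W, b):
    """Go left iff x lies in the convex polytope {x : W @ x + b <= 0}."""
    return bool(np.all(W @ x + b <= 0.0))

# Axis-aligned split "x[2] <= 1.5" as the one-halfspace special case:
W = np.array([[0.0, 0.0, 1.0]])
b = np.array([-1.5])
print(polytope_route_left(np.array([0.0, 0.0, 1.0]), W, b))  # True
print(polytope_route_left(np.array([0.0, 0.0, 2.0]), W, b))  # False
```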
- Censored Quantile Regression Forest [81.9098291337097]
We develop a new estimating equation that adapts to censoring and reduces to the quantile score whenever the data do not exhibit censoring.
The proposed procedure, named censored quantile regression forest, allows us to estimate quantiles of time-to-event outcomes without any parametric modeling assumption (the quantile score is sketched after this entry).
arXiv Detail & Related papers (2020-01-08T23:20:23Z)
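For concreteness, the quantile score that the censored estimating equation above reduces to in the absence of censoring is the standard check (pinball) loss; the snippet below is textbook material, not code from the paper.
```python
import numpy as np

def quantile_score(y, q_pred, tau):
    """Check loss: tau*(y-q) when y >= q, (tau-1)*(y-q) otherwise."""
    diff = y - q_pred
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

# The tau=0.5 score is half the mean absolute error:
y = np.array([1.0, 2.0, 4.0])
print(quantile_score(y, np.full(3, 2.0), tau=0.5))  # 0.5 * MAE = 0.5
```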