Random Planted Forest: a directly interpretable tree ensemble
- URL: http://arxiv.org/abs/2012.14563v3
- Date: Thu, 3 Aug 2023 15:27:10 GMT
- Title: Random Planted Forest: a directly interpretable tree ensemble
- Authors: Munir Hiabu, Enno Mammen, Joseph T. Meyer
- Abstract summary: We introduce a novel interpretable tree based algorithm for prediction in a regression setting.
In a simulation study we find encouraging prediction and visualisation properties of our random planted forest method.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a novel interpretable tree based algorithm for prediction in a
regression setting. Our motivation is to estimate the unknown regression
function from a functional decomposition perspective in which the functional
components correspond to lower order interaction terms. The idea is to modify
the random forest algorithm by keeping certain leaves after they are split
instead of deleting them. This leads to non-binary trees which we refer to as
planted trees. An extension to a forest leads to our random planted forest
algorithm. Additionally, the maximum number of covariates which can interact
within a leaf can be bounded. If we set this interaction bound to one, the
resulting estimator is a sum of one-dimensional functions. In the other extreme
case, if we do not set a limit, the resulting estimator and corresponding model
place no restrictions on the form of the regression function. In a simulation
study we find encouraging prediction and visualisation properties of our random
planted forest method. We also develop theory for an idealized version of
random planted forests in cases where the interaction bound is low. We show
that if it is smaller than three, the idealized version achieves asymptotically
optimal convergence rates up to a logarithmic factor. Code is available on
GitHub https://github.com/PlantedML/randomPlantedForest.
Related papers
- A Pseudo-Semantic Loss for Autoregressive Models with Logical
Constraints [87.08677547257733]
Neuro-symbolic AI bridges the gap between purely symbolic and neural approaches to learning.
We show how to maximize the likelihood of a symbolic constraint w.r.t the neural network's output distribution.
We also evaluate our approach on Sudoku and shortest-path prediction cast as autoregressive generation.
arXiv Detail & Related papers (2023-12-06T20:58:07Z) - SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree
Search [68.66904039405871]
We introduce SoftTreeMax, a generalization of softmax that takes planning into account.
We show for the first time the role of a tree expansion policy in mitigating this variance.
Our differentiable tree-based policy leverages all gradients at the tree leaves in each environment step instead of the traditional single-sample-based gradient.
arXiv Detail & Related papers (2023-01-30T19:03:14Z) - Distributional Adaptive Soft Regression Trees [0.0]
This article proposes a new type of a distributional regression tree using a multivariate soft split rule.
One great advantage of the soft split is that smooth high-dimensional functions can be estimated with only one tree.
We show by means of extensive simulation studies that the algorithm has excellent properties and outperforms various benchmark methods.
arXiv Detail & Related papers (2022-10-19T08:59:02Z) - An Approximation Method for Fitted Random Forests [0.0]
We study methods that approximate each fitted tree in the Random Forests model using the multinomial allocation of the data points to the leafs.
Specifically, we begin by studying whether fitting a multinomial logistic regression helps reduce the size while preserving the prediction quality.
arXiv Detail & Related papers (2022-07-05T17:28:52Z) - Explaining random forest prediction through diverse rulesets [0.0]
Local Tree eXtractor (LTreeX) is able to explain the forest prediction for a given test instance with a few diverse rules.
We show that our proposed approach substantially outperforms other explainable methods in terms of predictive performance.
arXiv Detail & Related papers (2022-03-29T12:54:57Z) - Minimax Rates for High-Dimensional Random Tessellation Forests [0.0]
Mondrian forests is the first class of random forests for which minimax rates were obtained in arbitrary dimension.
We show that a large class of random forests with general split directions also achieve minimax optimal convergence rates in arbitrary dimension.
arXiv Detail & Related papers (2021-09-22T06:47:38Z) - Generalised Boosted Forests [0.9899017174990579]
We start with an MLE-type estimate in the link space and then define generalised residuals from it.
We use these residuals and some corresponding weights to fit a base random forest and then repeat the same to obtain a boost random forest.
We show with simulated and real data that both the random forest steps reduces test-set log-likelihood, which we treat as our primary metric.
arXiv Detail & Related papers (2021-02-24T21:17:31Z) - Growing Deep Forests Efficiently with Soft Routing and Learned
Connectivity [79.83903179393164]
This paper further extends the deep forest idea in several important aspects.
We employ a probabilistic tree whose nodes make probabilistic routing decisions, a.k.a., soft routing, rather than hard binary decisions.
Experiments on the MNIST dataset demonstrate that our empowered deep forests can achieve better or comparable performance than [1],[3].
arXiv Detail & Related papers (2020-12-29T18:05:05Z) - An Efficient Adversarial Attack for Tree Ensembles [91.05779257472675]
adversarial attacks on tree based ensembles such as gradient boosting decision trees (DTs) and random forests (RFs)
We show that our method can be thousands of times faster than the previous mixed-integer linear programming (MILP) based approach.
Our code is available at https://chong-z/tree-ensemble-attack.
arXiv Detail & Related papers (2020-10-22T10:59:49Z) - Convex Polytope Trees [57.56078843831244]
convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z) - Censored Quantile Regression Forest [81.9098291337097]
We develop a new estimating equation that adapts to censoring and leads to quantile score whenever the data do not exhibit censoring.
The proposed procedure named it censored quantile regression forest, allows us to estimate quantiles of time-to-event without any parametric modeling assumption.
arXiv Detail & Related papers (2020-01-08T23:20:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.