Prediction Algorithms Achieving Bayesian Decision Theoretical Optimality
Based on Decision Trees as Data Observation Processes
- URL: http://arxiv.org/abs/2306.07060v1
- Date: Mon, 12 Jun 2023 12:14:57 GMT
- Title: Prediction Algorithms Achieving Bayesian Decision Theoretical Optimality
Based on Decision Trees as Data Observation Processes
- Authors: Yuta Nakahara, Shota Saito, Naoki Ichijo, Koki Kazama, Toshiyasu
Matsushima
- Abstract summary: This paper uses trees to represent data observation processes behind given data.
We derive the statistically optimal prediction, which is robust against overfitting.
We solve the resulting intractable summation over trees by a Markov chain Monte Carlo method, whose step size is adaptively tuned according to a posterior distribution for the trees.
- Score: 1.2774526936067927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of decision trees, most previous studies have had
difficulty ensuring the statistical optimality of predictions for new data and
have suffered from overfitting, because trees are usually used only to
represent prediction functions constructed from the given data. In contrast,
some studies, including this paper, use trees to represent stochastic data
observation processes behind the given data. Moreover, they derive the
statistically optimal prediction, which is robust against overfitting, based
on Bayesian decision theory by assuming a prior distribution for the trees.
However, these studies still face a problem in computing this Bayes optimal
prediction, because it involves an infeasible summation over all division
patterns of the feature space, which are represented by the trees and some
parameters. In particular, an open problem is the summation with respect to
combinations of division axes, i.e., the assignments of features to the inner
nodes of the tree. We solve this with a Markov chain Monte Carlo method whose
step size is adaptively tuned according to a posterior distribution for the
trees.
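The open problem above is the sum over assignments of features (division axes)
to the tree's inner nodes. As a concrete illustration, here is a minimal
Metropolis-Hastings sketch of that idea under assumptions that are ours, not
the paper's: binary features and labels, a complete tree of fixed depth whose
levels each test one feature, Beta-Bernoulli leaves (so the marginal
likelihood is closed-form), a uniform prior over axis assignments, and a fixed
single-node proposal in place of the paper's adaptively tuned step size.

```python
# Toy Metropolis-Hastings over assignments of (binary) features to the
# levels of a fixed-depth tree -- a stand-in for the paper's sampler.
import math
import random

def leaf_counts(axes, X, y):
    """Route each sample down the tree; a leaf is the tuple of feature
    values tested along the path. Returns per-leaf label counts."""
    counts = {}
    for xi, yi in zip(X, y):
        c = counts.setdefault(tuple(xi[f] for f in axes), [0, 0])
        c[yi] += 1
    return counts

def log_marginal(axes, X, y):
    """Log marginal likelihood with a Beta(1, 1) prior in every leaf."""
    total = 0.0
    for n0, n1 in leaf_counts(axes, X, y).values():
        total += (math.lgamma(1 + n1) + math.lgamma(1 + n0)
                  - math.lgamma(2 + n0 + n1))
    return total

def mh_sample_axes(X, y, n_features, depth, iters=2000, seed=0):
    """Symmetric proposal: resample the feature used at one random level."""
    rng = random.Random(seed)
    axes = [rng.randrange(n_features) for _ in range(depth)]
    cur = log_marginal(axes, X, y)
    samples = []
    for _ in range(iters):
        prop = list(axes)
        prop[rng.randrange(depth)] = rng.randrange(n_features)
        new = log_marginal(prop, X, y)
        # Uniform prior and symmetric proposal: the ratio reduces to the
        # marginal likelihoods.
        if new >= cur or rng.random() < math.exp(new - cur):
            axes, cur = prop, new
        samples.append(list(axes))
    return samples

def predict_proba(samples, X, y, x_new, burn_in=500):
    """Approximate the Bayes optimal P(y=1 | x_new) by averaging the
    leaf posterior means over the sampled trees."""
    probs = []
    for axes in samples[burn_in:]:
        n0, n1 = leaf_counts(axes, X, y).get(
            tuple(x_new[f] for f in axes), [0, 0])
        probs.append((1 + n1) / (2 + n0 + n1))
    return sum(probs) / len(probs)
```

Averaging the per-tree leaf posteriors in `predict_proba` is the
model-averaging step that makes the Bayes optimal prediction robust against
any single overfit tree.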
Related papers
- Statistical Advantages of Oblique Randomized Decision Trees and Forests [0.0]
Generalization error and convergence rates are obtained for the flexible dimension reduction model class of ridge functions.
A lower bound on the risk of axis-aligned Mondrian trees is obtained, proving that these estimators are suboptimal for such linear dimension reduction models (the two split types are contrasted in the sketch after this entry).
arXiv Detail & Related papers (2024-07-02T17:35:22Z)
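As a point of reference for the entry above, here is a hypothetical two-line
contrast between the split families it compares; the function names and data
are illustrative, not the paper's estimators. Oblique splits threshold a
projection w.x, which suits ridge functions f(w.x), while axis-aligned
(Mondrian/CART-style) splits can only test one coordinate at a time.

```python
# Illustrative contrast of split types (assumed forms, not the paper's).
import numpy as np

def axis_aligned_split(X, j, t):
    """Axis-aligned (Mondrian/CART-style): test a single coordinate."""
    return X[:, j] <= t

def oblique_split(X, w, t):
    """Oblique: threshold a linear projection w.x (a 'ridge' direction)."""
    return X @ w <= t

X = np.random.default_rng(0).normal(size=(5, 3))
print(axis_aligned_split(X, j=0, t=0.0))
print(oblique_split(X, w=np.array([0.6, 0.8, 0.0]), t=0.0))
```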
- Building Trees for Probabilistic Prediction via Scoring Rules [0.0]
We study modifying a tree to produce nonparametric predictive distributions.
We find that the standard method for building trees may not result in good predictive distributions.
We propose changing the splitting criterion for trees to one based on proper scoring rules, as sketched after this entry.
arXiv Detail & Related papers (2024-02-16T20:04:13Z)
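A hedged sketch of the proposal in the entry above, under assumptions not
taken from the paper: Gaussian predictive distributions in each leaf, and the
log score (negative log-likelihood, a proper scoring rule) as the splitting
criterion in place of the usual variance reduction.

```python
# Choose a split by a proper scoring rule instead of impurity reduction.
import numpy as np

def leaf_log_score(y):
    """Negative log-likelihood of a Gaussian fitted to the leaf's labels."""
    if len(y) < 2:
        return np.inf  # forbid (near-)empty leaves
    mu, sigma = y.mean(), y.std() + 1e-9
    ll = -0.5 * np.log(2 * np.pi * sigma**2) - (y - mu) ** 2 / (2 * sigma**2)
    return -ll.sum()

def best_split_by_score(x, y):
    """Scan thresholds on one feature; keep the split whose two child
    leaves have the lowest total log score."""
    best_t, best_s = None, np.inf
    for t in np.unique(x)[:-1]:
        s = leaf_log_score(y[x <= t]) + leaf_log_score(y[x > t])
        if s < best_s:
            best_t, best_s = t, s
    return best_t, best_s
```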
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- Social Interpretable Tree for Pedestrian Trajectory Prediction [75.81745697967608]
We propose a tree-based method, termed Social Interpretable Tree (SIT), to address this multi-modal prediction task.
A path in the tree from the root to a leaf represents an individual possible future trajectory.
Despite using a hand-crafted tree, the experimental results on the ETH-UCY and Stanford Drone datasets demonstrate that our method is capable of matching or exceeding the performance of state-of-the-art methods.
arXiv Detail & Related papers (2022-05-26T12:18:44Z)
- Probability Distribution on Rooted Trees [1.3955252961896318]
The hierarchical expressive capability of rooted trees makes them applicable to representing statistical models in various areas, such as data compression, image processing, and machine learning.
One unified approach is a Bayesian one, in which the rooted tree is regarded as a random variable and a direct loss function can be assumed on the selected model or on the predicted value for a new data point.
In this paper, we propose a generalized probability distribution for rooted trees in which only the maximum number of child nodes and the maximum depth are fixed (a toy sampler for such a distribution follows this entry).
arXiv Detail & Related papers (2022-01-24T05:13:58Z)
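To make the entry above concrete, here is a toy generative process over rooted
trees with only the maximum number of children and the maximum depth fixed.
The parametrization (a single spawn probability g per node) is our assumption
for illustration, not necessarily the paper's distribution.

```python
# Toy prior over rooted trees: a node above the depth limit becomes
# internal with probability g; leaves are empty lists.
import math
import random

def sample_tree(g, max_children, max_depth, depth=0, rng=random):
    if depth >= max_depth or rng.random() >= g:
        return []  # leaf
    return [sample_tree(g, max_children, max_depth, depth + 1, rng)
            for _ in range(max_children)]

def log_prob(tree, g, max_depth, depth=0):
    """Log-probability of a tree under the same generative process:
    log g per internal node, log(1 - g) per leaf above the depth limit."""
    if not tree:
        return 0.0 if depth >= max_depth else math.log(1 - g)
    return math.log(g) + sum(log_prob(c, g, max_depth, depth + 1)
                             for c in tree)

t = sample_tree(g=0.4, max_children=2, max_depth=3)
print(t, log_prob(t, g=0.4, max_depth=3))
```

Priors of this form are convenient because the probability factorizes over
nodes, which is what lets Bayesian tree methods sum or sample over structures
efficiently.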
- Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression [51.770998056563094]
Probabilistic Gradient Boosting Machines (PGBM) is a method to create probabilistic predictions with a single ensemble of decision trees.
We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods (a rough sketch of the mechanism follows this entry).
arXiv Detail & Related papers (2021-06-03T08:32:13Z)
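A rough sketch of the mechanism named in the entry above, under two
assumptions that are ours, not necessarily PGBM's implementation: each leaf
stores a (mean, variance) estimate, and tree outputs are treated as
independent Gaussians, so the boosted sum has an analytic mean and variance.

```python
# Probabilistic prediction from a single boosted ensemble (toy version).
from dataclasses import dataclass

@dataclass
class Stump:
    """A depth-1 'tree': one threshold on one feature, with per-leaf
    (mean, variance) estimates of the leaf values (hypothetical format)."""
    feature: int
    threshold: float
    left: tuple   # (mean, var) when x[feature] <= threshold
    right: tuple  # (mean, var) otherwise

    def leaf(self, x):
        return self.left if x[self.feature] <= self.threshold else self.right

def predict_distribution(stumps, x):
    """Sum of independent Gaussian leaf values is Gaussian: add means
    and variances across the boosted ensemble."""
    mean = sum(s.leaf(x)[0] for s in stumps)
    var = sum(s.leaf(x)[1] for s in stumps)
    return mean, var

# Usage: two toy stumps; a query yields a predictive mean and variance.
ens = [Stump(0, 0.5, (1.0, 0.04), (2.0, 0.09)),
       Stump(1, 0.0, (-0.5, 0.01), (0.5, 0.02))]
print(predict_distribution(ens, [0.3, 0.7]))  # -> (1.5, 0.06)
```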
- Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem [1.5749416770494704]
A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest from available data.
Recent work highlights that, because the predictions from machine learning models are inevitably imperfect, econometric analyses based on the predicted variables are likely to suffer from bias due to measurement error.
We propose a novel approach to mitigate these biases, leveraging the ensemble learning technique known as the random forest.
arXiv Detail & Related papers (2020-12-19T21:48:23Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Handling Missing Data in Decision Trees: A Probabilistic Approach [41.259097100704324]
We tackle the problem of handling missing data in decision trees by taking a probabilistic approach.
We use tractable density estimators to compute the "expected prediction" of our models.
At learning time, we fine-tune parameters of already learned trees by minimizing their "expected prediction loss" (a simplified sketch of the expected prediction follows this entry).
arXiv Detail & Related papers (2020-06-29T19:54:54Z)
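A deliberately simplified version of the "expected prediction" in the entry
above: one binary feature is missing, and we marginalize it under an
independent Bernoulli model, standing in for the paper's tractable density
estimators.

```python
# Expected prediction over a missing binary feature (toy marginal).
def expected_prediction(predict, x, j, p_one):
    """E[f(x)] when feature j is missing and P(x_j = 1) = p_one:
    weight the two completions by their probability."""
    x0, x1 = list(x), list(x)
    x0[j], x1[j] = 0, 1
    return (1 - p_one) * predict(x0) + p_one * predict(x1)

# Usage with a hypothetical stump f(x) = 2 if x[1] == 1 else 0:
f = lambda x: 2.0 if x[1] == 1 else 0.0
print(expected_prediction(f, [1, None], j=1, p_one=0.25))  # -> 0.5
```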
- Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
- ENTMOOT: A Framework for Optimization over Ensemble Tree Models [57.98561336670884]
ENTMOOT is a framework for integrating tree models into larger optimization problems.
We show how ENTMOOT allows a simple integration of tree models into decision-making and black-box optimization (see the sketch after this entry).
arXiv Detail & Related papers (2020-03-10T14:34:07Z)
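ENTMOOT itself encodes ensembles into mixed-integer programs; the toy below
only illustrates why tree models are optimization-friendly. Because a tree's
output is piecewise constant, maximizing it reduces to enumerating leaf boxes.
The dict-based tree format here is hypothetical.

```python
# Optimize over a single tree by enumerating its leaf boxes.
def enumerate_leaves(tree, box):
    """Yield (leaf_value, box) pairs; a box maps feature -> (lo, hi)."""
    if "value" in tree:  # leaf node
        yield tree["value"], box
        return
    f, t = tree["feature"], tree["threshold"]
    lo, hi = box.get(f, (float("-inf"), float("inf")))
    if lo < t:  # left child feasible: x_f <= t
        yield from enumerate_leaves(tree["left"], {**box, f: (lo, min(hi, t))})
    if hi > t:  # right child feasible: x_f > t
        yield from enumerate_leaves(tree["right"], {**box, f: (max(lo, t), hi)})

def maximize_tree(tree):
    """Return the best leaf value and the box of inputs that reach it."""
    return max(enumerate_leaves(tree, {}), key=lambda vb: vb[0])

# Usage: a tiny hand-built tree over features 0 and 1.
tree = {"feature": 0, "threshold": 1.0,
        "left":  {"value": 3.0},
        "right": {"feature": 1, "threshold": 0.0,
                  "left": {"value": 5.0}, "right": {"value": 1.0}}}
print(maximize_tree(tree))  # -> (5.0, {0: (1.0, inf), 1: (-inf, 0.0)})
```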
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.