Prediction Algorithms Achieving Bayesian Decision Theoretical Optimality
Based on Decision Trees as Data Observation Processes
- URL: http://arxiv.org/abs/2306.07060v1
- Date: Mon, 12 Jun 2023 12:14:57 GMT
- Title: Prediction Algorithms Achieving Bayesian Decision Theoretical Optimality
Based on Decision Trees as Data Observation Processes
- Authors: Yuta Nakahara, Shota Saito, Naoki Ichijo, Koki Kazama, Toshiyasu
Matsushima
- Abstract summary: This paper uses trees to represent data observation processes behind given data.
We derive the statistically optimal prediction, which is robust against overfitting.
We solve the resulting intractable summation over trees by a Markov chain Monte Carlo method, whose step size is adaptively tuned according to a posterior distribution for the trees.
- Score: 1.2774526936067927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the field of decision trees, most previous studies have had
difficulty ensuring the statistical optimality of predictions for new data and
have suffered from overfitting, because trees are usually used only to
represent prediction functions constructed from the given data. In contrast,
some studies, including this paper, use trees to represent stochastic data
observation processes behind the given data. Moreover, they derive the
statistically optimal prediction, which is robust against overfitting, based
on Bayesian decision theory by assuming a prior distribution for the trees.
However, these studies still face a problem in computing this Bayes optimal
prediction, because it involves an infeasible summation over all division
patterns of the feature space, which are represented by the trees and some
parameters. In particular, an open problem is the summation with respect to
combinations of division axes, i.e., the assignments of features to the inner
nodes of the tree. We solve this with a Markov chain Monte Carlo method whose
step size is adaptively tuned according to a posterior distribution for the
trees.
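The open problem above is the sum over assignments of features (division axes)
to the tree's inner nodes. As a concrete illustration, here is a minimal
Metropolis-Hastings sketch of that idea under assumptions that are ours, not
the paper's: binary features and labels, a complete tree of fixed depth whose
levels each test one feature, Beta-Bernoulli leaves (so the marginal
likelihood is closed-form), a uniform prior over axis assignments, and a fixed
single-node proposal in place of the paper's adaptively tuned step size.

```python
# Toy Metropolis-Hastings over assignments of (binary) features to the
# levels of a fixed-depth tree -- a stand-in for the paper's sampler.
import math
import random

def leaf_counts(axes, X, y):
    """Route each sample down the tree; a leaf is the tuple of feature
    values tested along the path. Returns per-leaf label counts."""
    counts = {}
    for xi, yi in zip(X, y):
        c = counts.setdefault(tuple(xi[f] for f in axes), [0, 0])
        c[yi] += 1
    return counts

def log_marginal(axes, X, y):
    """Log marginal likelihood with a Beta(1, 1) prior in every leaf."""
    total = 0.0
    for n0, n1 in leaf_counts(axes, X, y).values():
        total += (math.lgamma(1 + n1) + math.lgamma(1 + n0)
                  - math.lgamma(2 + n0 + n1))
    return total

def mh_sample_axes(X, y, n_features, depth, iters=2000, seed=0):
    """Symmetric proposal: resample the feature used at one random level."""
    rng = random.Random(seed)
    axes = [rng.randrange(n_features) for _ in range(depth)]
    cur = log_marginal(axes, X, y)
    samples = []
    for _ in range(iters):
        prop = list(axes)
        prop[rng.randrange(depth)] = rng.randrange(n_features)
        new = log_marginal(prop, X, y)
        # Uniform prior and symmetric proposal: the ratio reduces to the
        # marginal likelihoods.
        if new >= cur or rng.random() < math.exp(new - cur):
            axes, cur = prop, new
        samples.append(list(axes))
    return samples

def predict_proba(samples, X, y, x_new, burn_in=500):
    """Approximate the Bayes optimal P(y=1 | x_new) by averaging the
    leaf posterior means over the sampled trees."""
    probs = []
    for axes in samples[burn_in:]:
        n0, n1 = leaf_counts(axes, X, y).get(
            tuple(x_new[f] for f in axes), [0, 0])
        probs.append((1 + n1) / (2 + n0 + n1))
    return sum(probs) / len(probs)
```

Averaging the per-tree leaf posteriors in `predict_proba` is the
model-averaging step that makes the Bayes optimal prediction robust against
any single overfit tree.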
Related papers
- Statistical Advantages of Oblique Randomized Decision Trees and Forests [0.0]
Generalization error and convergence rates are obtained for the flexible dimension reduction model class of ridge functions.
A lower bound on the risk of axis-aligned Mondrian trees is obtained, proving that these estimators are suboptimal for such linear dimension reduction models (the two split types are contrasted in the sketch after this entry).
arXiv Detail & Related papers (2024-07-02T17:35:22Z)
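As a point of reference for the entry above, here is a hypothetical two-line
contrast between the split families it compares; the function names and data
are illustrative, not the paper's estimators. Oblique splits threshold a
projection w.x, which suits ridge functions f(w.x), while axis-aligned
(Mondrian/CART-style) splits can only test one coordinate at a time.

```python
# Illustrative contrast of split types (assumed forms, not the paper's).
import numpy as np

def axis_aligned_split(X, j, t):
    """Axis-aligned (Mondrian/CART-style): test a single coordinate."""
    return X[:, j] <= t

def oblique_split(X, w, t):
    """Oblique: threshold a linear projection w.x (a 'ridge' direction)."""
    return X @ w <= t

X = np.random.default_rng(0).normal(size=(5, 3))
print(axis_aligned_split(X, j=0, t=0.0))
print(oblique_split(X, w=np.array([0.6, 0.8, 0.0]), t=0.0))
```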
- Building Trees for Probabilistic Prediction via Scoring Rules [0.0]
We study modifying a tree to produce nonparametric predictive distributions.
We find that the standard method for building trees may not result in good predictive distributions.
We propose changing the splitting criterion for trees to one based on proper scoring rules, as sketched after this entry.
arXiv Detail & Related papers (2024-02-16T20:04:13Z)
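A hedged sketch of the proposal in the entry above, under assumptions not
taken from the paper: Gaussian predictive distributions in each leaf, and the
log score (negative log-likelihood, a proper scoring rule) as the splitting
criterion in place of the usual variance reduction.

```python
# Choose a split by a proper scoring rule instead of impurity reduction.
import numpy as np

def leaf_log_score(y):
    """Negative log-likelihood of a Gaussian fitted to the leaf's labels."""
    if len(y) < 2:
        return np.inf  # forbid (near-)empty leaves
    mu, sigma = y.mean(), y.std() + 1e-9
    ll = -0.5 * np.log(2 * np.pi * sigma**2) - (y - mu) ** 2 / (2 * sigma**2)
    return -ll.sum()

def best_split_by_score(x, y):
    """Scan thresholds on one feature; keep the split whose two child
    leaves have the lowest total log score."""
    best_t, best_s = None, np.inf
    for t in np.unique(x)[:-1]:
        s = leaf_log_score(y[x <= t]) + leaf_log_score(y[x > t])
        if s < best_s:
            best_t, best_s = t, s
    return best_t, best_s
```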
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- Social Interpretable Tree for Pedestrian Trajectory Prediction [75.81745697967608]
We propose a tree-based method, termed Social Interpretable Tree (SIT), to address this multi-modal prediction task.
A path in the tree from the root to a leaf represents an individual possible future trajectory.
Despite using a hand-crafted tree, the experimental results on the ETH-UCY and Stanford Drone datasets demonstrate that our method is capable of matching or exceeding the performance of state-of-the-art methods.
arXiv Detail & Related papers (2022-05-26T12:18:44Z)
- Probability Distribution on Rooted Trees [1.3955252961896318]
The hierarchical expressive capability of rooted trees makes them applicable to representing statistical models in various areas, such as data compression, image processing, and machine learning.
One unified approach is a Bayesian one, in which the rooted tree is regarded as a random variable and a direct loss function can be assumed on the selected model or on the predicted value for a new data point.
In this paper, we propose a generalized probability distribution for rooted trees in which only the maximum number of child nodes and the maximum depth are fixed (a toy sampler for such a distribution follows this entry).
arXiv Detail & Related papers (2022-01-24T05:13:58Z)
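To make the entry above concrete, here is a toy generative process over rooted
trees with only the maximum number of children and the maximum depth fixed.
The parametrization (a single spawn probability g per node) is our assumption
for illustration, not necessarily the paper's distribution.

```python
# Toy prior over rooted trees: a node above the depth limit becomes
# internal with probability g; leaves are empty lists.
import math
import random

def sample_tree(g, max_children, max_depth, depth=0, rng=random):
    if depth >= max_depth or rng.random() >= g:
        return []  # leaf
    return [sample_tree(g, max_children, max_depth, depth + 1, rng)
            for _ in range(max_children)]

def log_prob(tree, g, max_depth, depth=0):
    """Log-probability of a tree under the same generative process:
    log g per internal node, log(1 - g) per leaf above the depth limit."""
    if not tree:
        return 0.0 if depth >= max_depth else math.log(1 - g)
    return math.log(g) + sum(log_prob(c, g, max_depth, depth + 1)
                             for c in tree)

t = sample_tree(g=0.4, max_children=2, max_depth=3)
print(t, log_prob(t, g=0.4, max_depth=3))
```

Priors of this form are convenient because the probability factorizes over
nodes, which is what lets Bayesian tree methods sum or sample over structures
efficiently.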
- Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression [51.770998056563094]
Probabilistic Gradient Boosting Machines (PGBM) is a method to create probabilistic predictions with a single ensemble of decision trees.
We empirically demonstrate the advantages of PGBM compared to existing state-of-the-art methods (a rough sketch of the mechanism follows this entry).
arXiv Detail & Related papers (2021-06-03T08:32:13Z)
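A rough sketch of the mechanism named in the entry above, under two
assumptions that are ours, not necessarily PGBM's implementation: each leaf
stores a (mean, variance) estimate, and tree outputs are treated as
independent Gaussians, so the boosted sum has an analytic mean and variance.

```python
# Probabilistic prediction from a single boosted ensemble (toy version).
from dataclasses import dataclass

@dataclass
class Stump:
    """A depth-1 'tree': one threshold on one feature, with per-leaf
    (mean, variance) estimates of the leaf values (hypothetical format)."""
    feature: int
    threshold: float
    left: tuple   # (mean, var) when x[feature] <= threshold
    right: tuple  # (mean, var) otherwise

    def leaf(self, x):
        return self.left if x[self.feature] <= self.threshold else self.right

def predict_distribution(stumps, x):
    """Sum of independent Gaussian leaf values is Gaussian: add means
    and variances across the boosted ensemble."""
    mean = sum(s.leaf(x)[0] for s in stumps)
    var = sum(s.leaf(x)[1] for s in stumps)
    return mean, var

# Usage: two toy stumps; a query yields a predictive mean and variance.
ens = [Stump(0, 0.5, (1.0, 0.04), (2.0, 0.09)),
       Stump(1, 0.0, (-0.5, 0.01), (0.5, 0.02))]
print(predict_distribution(ens, [0.3, 0.7]))  # -> (1.5, 0.06)
```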
- Achieving Reliable Causal Inference with Data-Mined Variables: A Random Forest Approach to the Measurement Error Problem [1.5749416770494704]
A common empirical strategy involves the application of predictive modeling techniques to 'mine' variables of interest from available data.
Recent work highlights that, because the predictions from machine learning models are inevitably imperfect, econometric analyses based on the predicted variables are likely to suffer from bias due to measurement error.
We propose a novel approach to mitigate these biases, leveraging the ensemble learning technique known as the random forest.
arXiv Detail & Related papers (2020-12-19T21:48:23Z)
- MurTree: Optimal Classification Trees via Dynamic Programming and Search [61.817059565926336]
We present a novel algorithm for learning optimal classification trees based on dynamic programming and search.
Our approach uses only a fraction of the time required by the state-of-the-art and can handle datasets with tens of thousands of instances.
arXiv Detail & Related papers (2020-07-24T17:06:55Z)
- Handling Missing Data in Decision Trees: A Probabilistic Approach [41.259097100704324]
We tackle the problem of handling missing data in decision trees by taking a probabilistic approach.
We use tractable density estimators to compute the "expected prediction" of our models.
At learning time, we fine-tune parameters of already learned trees by minimizing their "expected prediction loss" (a simplified sketch of the expected prediction follows this entry).
arXiv Detail & Related papers (2020-06-29T19:54:54Z)
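A deliberately simplified version of the "expected prediction" in the entry
above: one binary feature is missing, and we marginalize it under an
independent Bernoulli model, standing in for the paper's tractable density
estimators.

```python
# Expected prediction over a missing binary feature (toy marginal).
def expected_prediction(predict, x, j, p_one):
    """E[f(x)] when feature j is missing and P(x_j = 1) = p_one:
    weight the two completions by their probability."""
    x0, x1 = list(x), list(x)
    x0[j], x1[j] = 0, 1
    return (1 - p_one) * predict(x0) + p_one * predict(x1)

# Usage with a hypothetical stump f(x) = 2 if x[1] == 1 else 0:
f = lambda x: 2.0 if x[1] == 1 else 0.0
print(expected_prediction(f, [1, None], j=1, p_one=0.25))  # -> 0.5
```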
- Generalized and Scalable Optimal Sparse Decision Trees [56.35541305670828]
We present techniques that produce optimal decision trees over a variety of objectives.
We also introduce a scalable algorithm that produces provably optimal results in the presence of continuous variables.
arXiv Detail & Related papers (2020-06-15T19:00:11Z)
- ENTMOOT: A Framework for Optimization over Ensemble Tree Models [57.98561336670884]
ENTMOOT is a framework for integrating tree models into larger optimization problems.
We show how ENTMOOT allows a simple integration of tree models into decision-making and black-box optimization (see the sketch after this entry).
arXiv Detail & Related papers (2020-03-10T14:34:07Z)
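ENTMOOT itself encodes ensembles into mixed-integer programs; the toy below
only illustrates why tree models are optimization-friendly. Because a tree's
output is piecewise constant, maximizing it reduces to enumerating leaf boxes.
The dict-based tree format here is hypothetical.

```python
# Optimize over a single tree by enumerating its leaf boxes.
def enumerate_leaves(tree, box):
    """Yield (leaf_value, box) pairs; a box maps feature -> (lo, hi)."""
    if "value" in tree:  # leaf node
        yield tree["value"], box
        return
    f, t = tree["feature"], tree["threshold"]
    lo, hi = box.get(f, (float("-inf"), float("inf")))
    if lo < t:  # left child feasible: x_f <= t
        yield from enumerate_leaves(tree["left"], {**box, f: (lo, min(hi, t))})
    if hi > t:  # right child feasible: x_f > t
        yield from enumerate_leaves(tree["right"], {**box, f: (max(lo, t), hi)})

def maximize_tree(tree):
    """Return the best leaf value and the box of inputs that reach it."""
    return max(enumerate_leaves(tree, {}), key=lambda vb: vb[0])

# Usage: a tiny hand-built tree over features 0 and 1.
tree = {"feature": 0, "threshold": 1.0,
        "left":  {"value": 3.0},
        "right": {"feature": 1, "threshold": 0.0,
                  "left": {"value": 5.0}, "right": {"value": 1.0}}}
print(maximize_tree(tree))  # -> (5.0, {0: (1.0, inf), 1: (-inf, 0.0)})
```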
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.