SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules
- URL: http://arxiv.org/abs/2501.09900v1
- Date: Fri, 17 Jan 2025 01:13:44 GMT
- Title: SBAMDT: Bayesian Additive Decision Trees with Adaptive Soft Semi-multivariate Split Rules
- Authors: Stamatina Lamprinakou, Huiyan Sang, Bledar A. Konomi, Ligang Lu
- Abstract summary: We introduce a novel probabilistic additive decision tree model that employs a soft split rule.
We demonstrate the utility of the proposed model through comparisons with existing tree-based models on synthetic datasets and a New York City education dataset.
- Score: 7.324728751991983
- Abstract: Bayesian Additive Regression Trees [BART, Chipman et al., 2010] have gained significant popularity due to their remarkable predictive performance and ability to quantify uncertainty. However, standard decision tree models rely on recursive data splits at each decision node, using deterministic decision rules based on a single univariate feature. This approach limits their ability to effectively capture complex decision boundaries, particularly in scenarios involving multiple features, such as spatial domains, or when transitions are either sharp or smoothly varying. In this paper, we introduce a novel probabilistic additive decision tree model that employs a soft split rule. This method enables highly flexible splits that leverage both univariate and multivariate features, while also respecting the geometric properties of the feature domain. Notably, the probabilistic split rule adapts dynamically across decision nodes, allowing the model to account for varying levels of smoothness in the regression function. We demonstrate the utility of the proposed model through comparisons with existing tree-based models on synthetic datasets and a New York City education dataset.
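As a rough illustration of the soft split idea described in the abstract, the sketch below contrasts a probabilistic split with a hard one at a single decision node. It is not the SBAMDT split rule: the logistic gate, the single `bandwidth` parameter, and the function names `soft_split_prob` and `stump_predict` are assumptions made for illustration only. A small bandwidth approximates a sharp, deterministic split, while a large one gives a smooth transition between the two leaf values.

```python
import math


def soft_split_prob(x, cutoff, bandwidth):
    """Probability of routing x to the right child (assumed logistic gate)."""
    z = (x - cutoff) / bandwidth
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def stump_predict(x, cutoff, bandwidth, mu_left, mu_right):
    """Expected output of a one-split tree: leaf values weighted by routing probability."""
    p_right = soft_split_prob(x, cutoff, bandwidth)
    return (1.0 - p_right) * mu_left + p_right * mu_right


if __name__ == "__main__":
    # Hypothetical leaf values 0.0 (left) and 2.0 (right), cutoff at 0.5.
    for bw in (1e-6, 0.05, 0.5):
        preds = [round(stump_predict(x, 0.5, bw, 0.0, 2.0), 3) for x in (0.3, 0.5, 0.7)]
        print(f"bandwidth={bw}: {preds}")
```

Letting the bandwidth differ from node to node is one simple way to picture a split rule that adapts to varying smoothness; how the paper actually parameterizes and samples its split rules is not stated in the abstract.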
Related papers
- Structural Entropy Guided Probabilistic Coding [52.01765333755793]
We propose a novel structural entropy-guided probabilistic coding model, named SEPC.
We incorporate the relationship between latent variables into the optimization by proposing a structural entropy regularization loss.
Experimental results across 12 natural language understanding tasks, including both classification and regression tasks, demonstrate the superior performance of SEPC.
arXiv Detail & Related papers (2024-12-12T00:37:53Z) - SMART: A Flexible Approach to Regression using Spline-Based Multivariate Adaptive Regression Trees [0.0]
Decision trees are powerful for predictive modeling but often suffer from high variance when modeling continuous relationships.
We introduce Spline-based Multivariate Adaptive Regression Trees (SMART), which uses a decision tree to identify subsets of data with distinct continuous relationships.
The native ability of MARS (Multivariate Adaptive Regression Splines) to handle higher-order terms allows the tree to focus solely on identifying discontinuities in the relationship.
arXiv Detail & Related papers (2024-10-08T01:18:08Z) - Statistical Advantages of Oblique Randomized Decision Trees and Forests [0.0]
Generalization error and convergence rates are obtained for the flexible dimension reduction model class of ridge functions.
A lower bound on the risk of axis-aligned Mondrian trees is obtained proving that these estimators are suboptimal for these linear dimension reduction models.
arXiv Detail & Related papers (2024-07-02T17:35:22Z) - Distributional Adaptive Soft Regression Trees [0.0]
This article proposes a new type of distributional regression tree using a multivariate soft split rule.
One great advantage of the soft split is that smooth high-dimensional functions can be estimated with only one tree.
We show by means of extensive simulation studies that the algorithm has excellent properties and outperforms various benchmark methods.
arXiv Detail & Related papers (2022-10-19T08:59:02Z) - TreeFlow: Going beyond Tree-based Gaussian Probabilistic Regression [0.0]
We introduce TreeFlow, a tree-based approach that combines the benefits of using tree ensembles with the capabilities of modeling flexible probability distributions.
We evaluate the proposed method on challenging regression benchmarks with varying volume, feature characteristics, and target dimensionality.
arXiv Detail & Related papers (2022-06-08T20:06:23Z) - Nonparametric Variable Screening with Optimal Decision Stumps [19.493449206135296]
We derive finite sample performance guarantees for variable selection in nonparametric models using a single-level CART decision tree.
Unlike previous marginal screening methods that attempt to directly estimate each marginal projection via a truncated basis expansion, the fitted model used here is a simple, parsimonious decision stump.
arXiv Detail & Related papers (2020-11-05T06:56:12Z) - Probabilistic Circuits for Variational Inference in Discrete Graphical Models [101.28528515775842]
Inference in discrete graphical models with variational methods is difficult.
Many sampling-based methods have been proposed for estimating the Evidence Lower Bound (ELBO).
We propose a new approach that leverages the tractability of probabilistic circuit models, such as Sum Product Networks (SPN).
We show that selective-SPNs are suitable as an expressive variational distribution, and prove that when the log-density of the target model is a polynomial the corresponding ELBO can be computed analytically.
arXiv Detail & Related papers (2020-10-22T05:04:38Z) - Convex Polytope Trees [57.56078843831244]
Convex polytope trees (CPT) are proposed to expand the family of decision trees by an interpretable generalization of their decision boundary.
We develop a greedy method to efficiently construct CPT and scalable end-to-end training algorithms for the tree parameters when the tree structure is given.
arXiv Detail & Related papers (2020-10-21T19:38:57Z) - Identification of Probability weighted ARX models with arbitrary domains [75.91002178647165]
PieceWise Affine models guarantee universal approximation, local linearity and equivalence to other classes of hybrid systems.
In this work, we focus on the identification of PieceWise Auto Regressive with eXogenous input models with arbitrary regions (NPWARX).
The architecture is conceived following the Mixture of Experts concept, developed within the machine learning field.
arXiv Detail & Related papers (2020-09-29T12:50:33Z) - Slice Sampling for General Completely Random Measures [74.24975039689893]
We present a novel Markov chain Monte Carlo algorithm for posterior inference that adaptively sets the truncation level using auxiliary slice variables.
The efficacy of the proposed algorithm is evaluated on several popular nonparametric models.
arXiv Detail & Related papers (2020-06-24T17:53:53Z) - Decision-Making with Auto-Encoding Variational Bayes [71.44735417472043]
We show that a posterior approximation distinct from the variational distribution should be used for making decisions.
Motivated by these theoretical results, we propose learning several approximate proposals for the best model.
In addition to toy examples, we present a full-fledged case study of single-cell RNA sequencing.
arXiv Detail & Related papers (2020-02-17T19:23:36Z)
This list is automatically generated from the titles and abstracts of the papers on this site.