Partition Trees: Conditional Density Estimation over General Outcome Spaces
- URL: http://arxiv.org/abs/2602.04042v1
- Date: Tue, 03 Feb 2026 22:12:30 GMT
- Title: Partition Trees: Conditional Density Estimation over General Outcome Spaces
- Authors: Felipe Angelim, Alessandro Leite
- Abstract summary: We propose Partition Trees, a tree-based framework for conditional density estimation over general outcome spaces. Our approach models conditional distributions as piecewise-constant densities on data-adaptive partitions and learns trees by directly minimizing conditional negative log-likelihood.
- Score: 46.1988967916659
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose Partition Trees, a tree-based framework for conditional density estimation over general outcome spaces, supporting both continuous and categorical variables within a unified formulation. Our approach models conditional distributions as piecewise-constant densities on data-adaptive partitions and learns trees by directly minimizing conditional negative log-likelihood. This yields a scalable, nonparametric alternative to existing probabilistic trees that does not make parametric assumptions about the target distribution. We further introduce Partition Forests, an ensemble extension obtained by averaging conditional densities. Empirically, we demonstrate improved probabilistic prediction over CART-style trees and competitive or superior performance compared to state-of-the-art probabilistic tree methods and Random Forests, along with robustness to redundant features and heteroscedastic noise. A minimal sketch of the core mechanism follows.
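Below is a minimal sketch, not the authors' code, of a depth-one partition tree: each leaf carries a piecewise-constant (histogram) density over a continuous outcome, and the split threshold on a feature is chosen by directly minimizing conditional negative log-likelihood. The bin edges, candidate-threshold grid, smoothing constant alpha, and toy data are all illustrative assumptions.

```python
# Sketch of a depth-one partition tree: histogram leaf densities over y,
# split on x chosen by minimizing conditional negative log-likelihood (NLL).
import numpy as np

def hist_density(y, edges, alpha=1.0):
    """Piecewise-constant density on fixed bins, with additive smoothing."""
    counts, _ = np.histogram(y, bins=edges)
    widths = np.diff(edges)
    probs = (counts + alpha) / (counts.sum() + alpha * len(counts))
    return probs / widths  # density value per bin

def nll(y, edges, dens):
    idx = np.clip(np.searchsorted(edges, y, side="right") - 1, 0, len(dens) - 1)
    return -np.log(dens[idx]).sum()

def best_split(x, y, edges):
    """Greedy search over candidate thresholds, scored by total NLL."""
    best = (np.inf, None)
    for t in np.quantile(x, np.linspace(0.1, 0.9, 17)):
        left, right = y[x <= t], y[x > t]
        if len(left) < 10 or len(right) < 10:
            continue
        score = (nll(left, edges, hist_density(left, edges))
                 + nll(right, edges, hist_density(right, edges)))
        if score < best[0]:
            best = (score, t)
    return best

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 2000)
y = np.where(x < 0.5, rng.normal(0, 1, 2000), rng.normal(3, 0.5, 2000))
edges = np.linspace(y.min(), y.max(), 31)
score, t = best_split(x, y, edges)
print(f"chosen threshold ~ {t:.2f}, total NLL = {score:.1f}")
```

A full implementation would recurse on each child and, for Partition Forests, average the resulting leaf densities across trees.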
Related papers
- TreeGrad-Ranker: Feature Ranking via $O(L)$-Time Gradients for Decision Trees [73.0940890296463]
Probabilistic values are used to rank features for explaining local predicted values of decision trees. TreeGrad computes the gradients of the multilinear extension of the joint objective in $O(L)$ time for decision trees with $L$ leaves. TreeGrad-Ranker aggregates the gradients while optimizing the joint objective to produce feature rankings. TreeGrad-Shap is a numerically stable algorithm for computing Beta Shapley values with integral parameters. A naive reference sketch of a multilinear-extension gradient appears below.
arXiv Detail & Related papers (2026-02-12T06:17:12Z)
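For intuition only, here is a naive, exponential-time reference computation of the gradient of the multilinear extension of a set function, the quantity TreeGrad evaluates in $O(L)$ time for trees. The toy set function f is a hypothetical assumption; this is not the paper's algorithm.

```python
# Naive gradient of the multilinear extension
# F(p) = sum_S f(S) * prod_{j in S} p_j * prod_{j not in S} (1 - p_j):
# dF/dp_i = E_{S ~ p, over the other coordinates}[f(S u {i}) - f(S)].
from itertools import chain, combinations

def subsets(items):
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

def ml_gradient(f, n, p, i):
    others = [j for j in range(n) if j != i]
    grad = 0.0
    for S in subsets(others):
        weight = 1.0
        for j in others:
            weight *= p[j] if j in S else (1.0 - p[j])
        grad += weight * (f(set(S) | {i}) - f(set(S)))
    return grad

# Toy coalition value function (hypothetical numbers for illustration).
f = lambda S: len(S) ** 2 + (2.0 if {0, 1} <= S else 0.0)
print(ml_gradient(f, n=3, p=[0.5, 0.5, 0.5], i=0))  # -> 4.0
```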
- Consistency of Honest Decision Trees and Random Forests [0.0]
We study various types of consistency of honest decision trees and random forests in the regression setting. We establish weak and almost sure convergence of honest trees and honest forest averages to the true regression function. A minimal illustration of honest fitting appears below.
arXiv Detail & Related papers (2026-01-21T13:40:36Z)
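Honesty here refers to sample splitting: the tree structure is chosen on one half of the data while leaf values are estimated on the held-out half, decoupling split selection from leaf estimation. A minimal sklearn-based sketch follows; the data, depth, and 50/50 split are toy choices.

```python
# Honest tree: splits learned on one half, leaf means re-estimated on the other.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (1000, 3))
y = np.sin(3 * X[:, 0]) + 0.3 * rng.normal(size=1000)

half = len(X) // 2
X_struct, y_struct = X[:half], y[:half]   # used only to choose splits
X_est, y_est = X[half:], y[half:]         # used only to estimate leaf values

tree = DecisionTreeRegressor(max_depth=4, random_state=0).fit(X_struct, y_struct)

# Recompute each leaf's value as the mean of estimation-half targets in it.
leaf_ids = tree.apply(X_est)
leaf_mean = {leaf: y_est[leaf_ids == leaf].mean()
             for leaf in np.unique(leaf_ids)}

def honest_predict(X_new, fallback=y_est.mean()):
    # Fall back to the global mean for leaves empty on the estimation half.
    return np.array([leaf_mean.get(leaf, fallback) for leaf in tree.apply(X_new)])

print(honest_predict(X[:5]))
```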
- Clustered random forests with correlated data for optimal estimation and inference under potential covariate shift [4.13592995550836]
We develop Clustered Random Forests, a random forest algorithm for clustered data arising from independent groups that exhibit within-cluster dependence. The leaf-wise predictions for each decision tree making up a clustered random forest take the form of a weighted least squares estimator. Clustered random forests are shown, for certain tree-splitting criteria, to be minimax rate optimal for pointwise conditional mean estimation. A sketch of a cluster-weighted leaf estimate follows.
arXiv Detail & Related papers (2025-03-16T20:07:23Z)
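As a toy illustration of a weighted least squares leaf estimate, the sketch below computes a leaf's constant prediction as a GLS mean under an equicorrelated within-cluster working covariance. The correlation value rho and the data are assumptions, and this is not the paper's estimator.

```python
# GLS intercept estimate with within-cluster equicorrelation rho:
# each cluster's mean enters with effective weight n_c / (1 - rho + n_c * rho).
import numpy as np

def gls_leaf_mean(y, clusters, rho=0.3):
    num, den = 0.0, 0.0
    for c in np.unique(clusters):
        yc = y[clusters == c]
        n_c = len(yc)
        w = n_c / (1.0 - rho + n_c * rho)  # effective cluster weight
        num += w * yc.mean()
        den += w
    return num / den

rng = np.random.default_rng(1)
clusters = np.repeat(np.arange(10), 20)
cluster_effect = rng.normal(0, 1, 10)[clusters]  # within-cluster dependence
y = 2.0 + cluster_effect + rng.normal(0, 0.5, 200)
print(gls_leaf_mean(y, clusters), y.mean())
```

Note how observations in large clusters are down-weighted relative to the plain mean, reflecting that correlated points carry less independent information.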
- Learning Decision Trees as Amortized Structure Inference [59.65621207449269]
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
- Variational phylogenetic inference with products over bipartitions [48.2982114295171]
We present a novel variational family based on coalescent times of a single-linkage clustering and derive a closed-form density of the resulting distribution over trees. Our method performs inference over all of tree space, requires no Markov chain Monte Carlo subroutines, and uses a differentiable variational family.
arXiv Detail & Related papers (2025-02-21T00:06:57Z)
- A partial likelihood approach to tree-based density modeling and its application in Bayesian inference [3.401207704599407]
Tree-based priors for probability distributions are usually specified using a predetermined, data-independent collection of candidate partitions of the sample space. To characterize an unknown target density in detail over the entire sample space, candidate partitions must have the capacity to expand deeply into all areas of the sample space with potential non-zero sampling probability. Traditional wisdom suggests that this compromise is inevitable to ensure coherent likelihood-based reasoning in Bayesian inference. We propose a simple strategy to restore coherency while allowing the candidate partitions to be data-dependent, using Cox's partial likelihood.
arXiv Detail & Related papers (2024-12-16T12:10:23Z)
- Ensembles of Probabilistic Regression Trees [46.53457774230618]
Tree-based ensemble methods have been successfully used for regression problems in many applications and research studies. We study ensemble versions of probabilistic regression trees that provide smooth approximations of the objective function by assigning each observation to each region with respect to a probability distribution. A sketch of such a soft assignment follows.
arXiv Detail & Related papers (2024-06-20T06:51:51Z)
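The sketch below illustrates soft region assignment on a depth-one tree: sigmoid membership probabilities replace the hard indicator x <= t, making the squared-error objective smooth in the threshold. The sigmoid gate, bandwidth h, and toy data are illustrative choices, not the paper's exact construction.

```python
# Soft split: each point belongs to both children with sigmoid probabilities,
# so region means and the loss vary smoothly with the threshold t.
import numpy as np

def soft_mse(x, y, t, h=0.1):
    p_left = 1.0 / (1.0 + np.exp((x - t) / h))  # P(region = left | x)
    # Region means weighted by membership probabilities.
    mu_l = (p_left * y).sum() / p_left.sum()
    mu_r = ((1 - p_left) * y).sum() / (1 - p_left).sum()
    pred = p_left * mu_l + (1 - p_left) * mu_r
    return ((y - pred) ** 2).mean()

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 500)
y = np.where(x < 0.4, 1.0, 3.0) + 0.2 * rng.normal(size=500)
ts = np.linspace(0.1, 0.9, 81)
best_t = ts[np.argmin([soft_mse(x, y, t) for t in ts])]
print(f"soft split recovered near 0.4: t = {best_t:.2f}")
```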
- Building Trees for Probabilistic Prediction via Scoring Rules [0.0]
We study modifying a tree to produce nonparametric predictive distributions. We find the standard method for building trees may not result in good predictive distributions. We propose changing the splitting criterion for trees to one based on proper scoring rules; a sketch of a log-score split criterion follows.
arXiv Detail & Related papers (2024-02-16T20:04:13Z)
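A sketch of the idea: score each candidate split by a proper scoring rule, here the log score of a Gaussian fit within each child, instead of squared error. The Gaussian leaf model and the heteroscedastic toy target are assumptions for illustration, not the paper's exact criterion.

```python
# Split selection by log score: a variance-sensitive criterion can detect a
# change in spread even when the conditional mean is flat (where MSE cannot).
import numpy as np
from scipy.stats import norm

def log_score(y):
    """Negative log-likelihood of y under a Gaussian fit to y itself."""
    mu, sigma = y.mean(), y.std() + 1e-9
    return -norm.logpdf(y, mu, sigma).sum()

def split_by_log_score(x, y, min_leaf=20):
    best = (np.inf, None)
    for t in np.unique(x)[min_leaf:-min_leaf]:
        left, right = y[x <= t], y[x > t]
        score = log_score(left) + log_score(right)
        if score < best[0]:
            best = (score, t)
    return best

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0, 1, 600))
# Flat mean, but the noise scale jumps at x = 0.5.
y = rng.normal(0, np.where(x < 0.5, 0.2, 1.5))
print(split_by_log_score(x, y))  # threshold recovered near 0.5
```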
- Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles. We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled. A sketch of a forest viewed as an adaptive smoother appears below.
arXiv Detail & Related papers (2024-02-02T15:36:43Z)
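The smoother view can be made concrete: a fitted forest's prediction equals a weighted average of training targets, with weights given by how often a test point shares a leaf with each training point. The sketch below verifies this identity with standard sklearn calls; bootstrap is disabled so the identity is exact, and the data are a toy assumption.

```python
# Random forest as a smoother: prediction(x) == sum_i w_i(x) * y_i, where the
# weights come from leaf co-membership, normalized per tree then averaged.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (300, 2))
y = X[:, 0] ** 2 + 0.1 * rng.normal(size=300)

# bootstrap=False so leaf values are plain training means; max_features keeps
# the trees diverse.
rf = RandomForestRegressor(n_estimators=50, bootstrap=False,
                           max_features=0.5, random_state=0).fit(X, y)

def smoother_weights(rf, X_train, x_test):
    w = np.zeros(len(X_train))
    train_leaves = rf.apply(X_train)               # (n_train, n_trees)
    test_leaves = rf.apply(x_test.reshape(1, -1))[0]
    for t in range(train_leaves.shape[1]):
        same = train_leaves[:, t] == test_leaves[t]
        w[same] += 1.0 / same.sum()
    return w / train_leaves.shape[1]

x0 = np.array([0.3, -0.2])
w = smoother_weights(rf, X, x0)
print(w @ y, rf.predict(x0.reshape(1, -1))[0])  # identical up to rounding
```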
- Distributional Gradient Boosting Machines [77.34726150561087]
Our framework is based on XGBoost and LightGBM. We show that our framework achieves state-of-the-art forecast accuracy. A from-scratch sketch of distributional gradient boosting with a Gaussian likelihood follows.
arXiv Detail & Related papers (2022-04-02T06:32:19Z)
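As a rough illustration of the distributional idea, the sketch below boosts two additive tree ensembles, one for the mean and one for the log-scale of a Gaussian, by fitting each round's tree to the negative gradient of the Gaussian negative log-likelihood. It mirrors the general recipe only and does not use the paper's XGBoost/LightGBM machinery; the data, learning rate, and tree depth are assumptions.

```python
# Distributional gradient boosting sketch: Gaussian NLL per point is
# s + (y - mu)^2 / (2 * exp(2s)) + const, with s = log(sigma), so
# dNLL/dmu = -(y - mu)/exp(2s) and dNLL/ds = 1 - (y - mu)^2/exp(2s).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, (2000, 1))
y = np.sin(6 * X[:, 0]) + rng.normal(0, 0.1 + 0.5 * X[:, 0])  # heteroscedastic

mu = np.full(len(y), y.mean())
s = np.full(len(y), np.log(y.std()))
lr = 0.1
for _ in range(100):
    var = np.exp(2 * s)
    grad_mu = -(y - mu) / var
    grad_s = 1.0 - (y - mu) ** 2 / var
    # Fit one small tree per parameter to the negative gradients.
    # (A real implementation would also store the trees for prediction.)
    mu += lr * DecisionTreeRegressor(max_depth=3).fit(X, -grad_mu).predict(X)
    s += lr * DecisionTreeRegressor(max_depth=3).fit(X, -grad_s).predict(X)

lo, hi = X[:, 0] < 0.2, X[:, 0] > 0.8
print("avg learned sigma, low x vs high x:",
      np.exp(s[lo]).mean().round(3), np.exp(s[hi]).mean().round(3))
```

The learned scale grows with x, matching the heteroscedastic noise, which is exactly the kind of structure a mean-only booster cannot represent.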