Individualized and Global Feature Attributions for Gradient Boosted
Trees in the Presence of $\ell_2$ Regularization
- URL: http://arxiv.org/abs/2211.04409v1
- Date: Tue, 8 Nov 2022 17:56:22 GMT
- Title: Individualized and Global Feature Attributions for Gradient Boosted
Trees in the Presence of $\ell_2$ Regularization
- Authors: Qingyao Sun (University of Chicago)
- Abstract summary: We propose Prediction Decomposition (PreDecomp), a novel individualized feature attribution for boosted trees when they are trained with $\ell_2$ regularization.
We also propose TreeInner, a family of debiased global feature attributions defined in terms of the inner product between any individualized feature attribution and labels on out-sample data for each tree.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While $\ell_2$ regularization is widely used in training gradient boosted
trees, popular individualized feature attribution methods for trees such as
Saabas and TreeSHAP overlook the training procedure. We propose Prediction
Decomposition Attribution (PreDecomp), a novel individualized feature
attribution for gradient boosted trees when they are trained with $\ell_2$
regularization. Theoretical analysis shows that the inner product between
PreDecomp and labels on in-sample data is essentially the total gain of a tree,
and that it can faithfully recover additive models in the population case when
features are independent. Inspired by the connection between PreDecomp and
total gain, we also propose TreeInner, a family of debiased global feature
attributions defined in terms of the inner product between any individualized
feature attribution and labels on out-sample data for each tree. Numerical
experiments on a simulated dataset and a genomic ChIP dataset show that
TreeInner has state-of-the-art feature selection performance. Code reproducing
experiments is available at https://github.com/nalzok/TreeInner .
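
The abstract describes TreeInner as an inner product, computed per tree, between an individualized feature attribution and the labels on out-sample data. The snippet below is only a minimal sketch of that idea, not the authors' implementation (see the linked repository for that). The function name `tree_inner_importance`, the input layout (one `(n_samples, n_features)` attribution matrix per tree), the summation across trees, and the absence of any normalization are all assumptions made for illustration. The abstract also does not say whether "labels" means raw targets or some per-tree target, so the sketch simply uses whatever vector the caller passes in.

```python
import numpy as np


def tree_inner_importance(attributions_per_tree, y):
    """Hypothetical helper: sum, over trees, of the inner product between
    each feature's attribution column and the held-out labels.

    attributions_per_tree: iterable of arrays, each (n_samples, n_features),
        holding any individualized attribution evaluated on held-out data.
    y: array of shape (n_samples,), the held-out labels (assumed layout).
    Returns an array of shape (n_features,).
    """
    y = np.asarray(y, dtype=float)
    total = None
    for attr in attributions_per_tree:
        attr = np.asarray(attr, dtype=float)
        if attr.ndim != 2 or attr.shape[0] != y.shape[0]:
            raise ValueError("each attribution matrix must be (n_samples, n_features)")
        # Inner product of every feature column with the labels, for this tree.
        per_tree = attr.T @ y
        total = per_tree if total is None else total + per_tree
    if total is None:
        raise ValueError("at least one tree is required")
    return total


# Toy usage with made-up numbers: two trees, three samples, two features.
tree_1 = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
tree_2 = np.array([[0.0, 2.0], [1.0, 0.0], [0.0, 0.0]])
labels = np.array([1.0, 2.0, 3.0])
scores = tree_inner_importance([tree_1, tree_2], labels)
```

Because the attribution matrices are passed in rather than computed, the same sketch works with any individualized attribution (for example Saabas, TreeSHAP, or PreDecomp values), which matches the abstract's description of TreeInner as a family of attributions.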
Related papers
- Forecasting with Hyper-Trees [50.72190208487953]
Hyper-Trees are designed to learn the parameters of time series models.
By relating the parameters of a target time series model to features, Hyper-Trees also address the issue of parameter non-stationarity.
In this novel approach, the trees first generate informative representations from the input features, which a shallow network then maps to the target model parameters.
arXiv Detail & Related papers (2024-05-13T15:22:15Z)
- Learning a Decision Tree Algorithm with Transformers [75.96920867382859]
We introduce MetaTree, a transformer-based model trained via meta-learning to directly produce strong decision trees.
We fit both greedy decision trees and globally optimized decision trees on a large number of datasets, and train MetaTree to produce only the trees that achieve strong generalization performance.
arXiv Detail & Related papers (2024-02-06T07:40:53Z)
- Tree Variational Autoencoders [5.992683455757179]
We propose a new generative hierarchical clustering model that learns a flexible tree-based posterior distribution over latent variables.
TreeVAE hierarchically divides samples according to their intrinsic characteristics, shedding light on hidden structures in the data.
arXiv Detail & Related papers (2023-06-15T09:25:04Z)
- SoftTreeMax: Exponential Variance Reduction in Policy Gradient via Tree Search [68.66904039405871]
We introduce SoftTreeMax, a generalization of softmax that takes planning into account.
We show for the first time the role of a tree expansion policy in mitigating this variance.
Our differentiable tree-based policy leverages all gradients at the tree leaves in each environment step instead of the traditional single-sample-based gradient.
arXiv Detail & Related papers (2023-01-30T19:03:14Z)
- SETAR-Tree: A Novel and Accurate Tree Algorithm for Global Time Series Forecasting [7.206754802573034]
In this paper, we explore the close connections between TAR models and regression trees.
We introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves.
In our evaluation, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms.
arXiv Detail & Related papers (2022-11-16T04:30:42Z)
- Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods [10.289846887751079]
We introduce Hierarchical Shrinkage (HS), a post-hoc algorithm that does not modify the tree structure.
HS substantially increases the predictive performance of decision trees, even when used in conjunction with other regularization techniques.
All code and models are released in a full-fledged package available on Github.
arXiv Detail & Related papers (2022-02-02T02:43:23Z)
- Spectral Top-Down Recovery of Latent Tree Models [13.681975313065477]
Spectral Top-Down Recovery (STDR) is a divide-and-conquer approach for inference of large latent tree models.
STDR's partitioning step is non-random. Instead, it is based on the Fiedler vector of a suitable Laplacian matrix related to the observed nodes.
We prove that STDR is statistically consistent, and bound the number of samples required to accurately recover the tree with high probability.
arXiv Detail & Related papers (2021-02-26T02:47:42Z)
- Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
- SGA: A Robust Algorithm for Partial Recovery of Tree-Structured Graphical Models with Noisy Samples [75.32013242448151]
We consider learning Ising tree models when the observations from the nodes are corrupted by independent but non-identically distributed noise.
Katiyar et al. (2020) showed that although the exact tree structure cannot be recovered, one can recover a partial tree structure.
We propose Symmetrized Geometric Averaging (SGA), a more statistically robust algorithm for partial tree recovery.
arXiv Detail & Related papers (2021-01-22T01:57:35Z)
- Recursive Top-Down Production for Sentence Generation with Latent Trees [77.56794870399288]
We model the production property of context-free grammars for natural and synthetic languages.
We present a dynamic programming algorithm that marginalises over latent binary tree structures with $N$ leaves.
We also present experimental results on German-English translation on the Multi30k dataset.
arXiv Detail & Related papers (2020-10-09T17:47:16Z)
- FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features [2.00191482700544]
FREEtree is a tree-based method for high dimensional longitudinal data with correlated features.
It exploits the network structure of the features by first clustering them using weighted correlation network analysis.
It then conducts a screening step within each cluster of features and a selection step among the surviving features.
arXiv Detail & Related papers (2020-06-17T07:28:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.