Fast Interpretable Greedy-Tree Sums
- URL: http://arxiv.org/abs/2201.11931v3
- Date: Sat, 8 Jul 2023 16:18:03 GMT
- Title: Fast Interpretable Greedy-Tree Sums
- Authors: Yan Shuo Tan, Chandan Singh, Keyan Nasseri, Abhineet Agarwal, James Duncan, Omer Ronen, Matthew Epland, Aaron Kornblith, Bin Yu
- Abstract summary: Fast Interpretable Greedy-Tree Sums (FIGS) generalizes the CART algorithm to grow a flexible number of trees in summation.
G-FIGS derives CDIs that reflect domain knowledge and enjoy improved specificity (by up to 20% over CART) without sacrificing sensitivity or interpretability.
Bagging-FIGS enjoys competitive performance with random forests and XGBoost on real-world datasets.
- Score: 8.268938983372452
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern machine learning has achieved impressive prediction performance, but
often sacrifices interpretability, a critical consideration in high-stakes
domains such as medicine. In such settings, practitioners often use highly
interpretable decision tree models, but these suffer from inductive bias
against additive structure. To overcome this bias, we propose Fast
Interpretable Greedy-Tree Sums (FIGS), which generalizes the CART algorithm to
simultaneously grow a flexible number of trees in summation. By combining
logical rules with addition, FIGS is able to adapt to additive structure while
remaining highly interpretable. Extensive experiments on real-world datasets
show that FIGS achieves state-of-the-art prediction performance. To demonstrate
the usefulness of FIGS in high-stakes domains, we adapt FIGS to learn clinical
decision instruments (CDIs), which are tools for guiding clinical
decision-making. Specifically, we introduce a variant of FIGS known as G-FIGS
that accounts for the heterogeneity in medical data. G-FIGS derives CDIs that
reflect domain knowledge and enjoy improved specificity (by up to 20% over
CART) without sacrificing sensitivity or interpretability. To provide further
insight into FIGS, we prove that FIGS learns components of additive models, a
property we refer to as disentanglement. Further, we show (under oracle
conditions) that unconstrained tree-sum models leverage disentanglement to
generalize more efficiently than single decision tree models when fitted to
additive regression functions. Finally, to avoid overfitting with an
unconstrained number of splits, we develop Bagging-FIGS, an ensemble version of
FIGS that borrows the variance reduction techniques of random forests.
Bagging-FIGS enjoys competitive performance with random forests and XGBoost on
real-world datasets.
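The abstract's point about additive structure can be illustrated with a toy example. The sketch below is not the FIGS fitting procedure; it only compares how many splits a sum of trees and a single tree need to represent the same additive function. The target function, the tuple-based tree encoding, and all helper names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical encoding: a tree is either a float leaf value or a tuple
# (feature_index, threshold, left_subtree, right_subtree).

def predict_tree(tree, x):
    """Follow splits until a leaf value is reached."""
    while isinstance(tree, tuple):
        feature, threshold, left, right = tree
        tree = right if x[feature] > threshold else left
    return tree

def predict_tree_sum(trees, x):
    """A tree-sum model predicts the sum of its trees' outputs."""
    return sum(predict_tree(t, x) for t in trees)

def count_splits(tree):
    """Count internal (split) nodes in a tree."""
    if not isinstance(tree, tuple):
        return 0
    _, _, left, right = tree
    return 1 + count_splits(left) + count_splits(right)

def target(x):
    """Illustrative additive target: 1[x0 > 0] + 2 * 1[x1 > 0]."""
    return float(x[0] > 0) + 2.0 * float(x[1] > 0)

# Tree sum: one stump per additive component.
tree_sum = [
    (0, 0.0, 0.0, 1.0),  # contributes 1[x0 > 0]
    (1, 0.0, 0.0, 2.0),  # contributes 2 * 1[x1 > 0]
]

# Single tree: the split on x1 must be repeated under both branches of x0.
single_tree = (0, 0.0,
               (1, 0.0, 0.0, 2.0),
               (1, 0.0, 1.0, 3.0))

# Both models reproduce the target on every sign pattern of (x0, x1).
grid = list(product([-1.0, 1.0], repeat=2))
tree_sum_ok = all(predict_tree_sum(tree_sum, x) == target(x) for x in grid)
single_tree_ok = all(predict_tree(single_tree, x) == target(x) for x in grid)

tree_sum_splits = sum(count_splits(t) for t in tree_sum)
single_tree_splits = count_splits(single_tree)
print(tree_sum_ok, single_tree_ok, tree_sum_splits, single_tree_splits)
```

In this toy case the tree sum uses one split per additive component, while the single tree has to repeat the second split in each branch of the first. With more additive components that repetition can compound, which is one way to read the abstract's remark about the inductive bias of single trees against additive structure.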
Related papers
- From GNNs to Trees: Multi-Granular Interpretability for Graph Neural Networks [29.032055397116217]
Interpretable Graph Neural Networks (GNNs) aim to reveal the underlying reasoning behind model predictions.
Existing subgraph-based interpretable methods suffer from an overemphasis on local structure.
We introduce a novel Tree-like Interpretable Framework (TIF) for graph classification.
arXiv Detail & Related papers (2025-05-01T07:22:51Z)
- Learning Decision Trees as Amortized Structure Inference [59.65621207449269]
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data.
We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
- Inherently Interpretable Tree Ensemble Learning [7.868733904112288]
We show that when shallow decision trees are used as base learners, the ensemble learning algorithms can become inherently interpretable.
An interpretation algorithm is developed that converts the tree ensemble into the functional ANOVA representation with inherent interpretability.
Experiments on simulations and real-world datasets show that our proposed methods offer a better trade-off between model interpretation and predictive performance.
arXiv Detail & Related papers (2024-10-24T18:58:41Z)
- Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z)
- Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
- Explainable Sparse Knowledge Graph Completion via High-order Graph Reasoning Network [111.67744771462873]
This paper proposes a novel explainable model for sparse Knowledge Graphs (KGs).
It combines high-order reasoning into a graph convolutional network, namely HoGRN.
It can not only improve the generalization ability to mitigate the information insufficiency issue but also provide interpretability.
arXiv Detail & Related papers (2022-07-14T10:16:56Z)
- Group Probability-Weighted Tree Sums for Interpretable Modeling of Heterogeneous Data [9.99624617629557]
Group Probability-Weighted Tree Sums (G-FIGS) achieves state-of-the-art prediction performance on important clinical datasets.
G-FIGS increases specificity for identifying cervical spine injury by up to 10% over CART and up to 3% over FIGS alone.
All code, data, and models are released on Github.
arXiv Detail & Related papers (2022-05-30T14:27:19Z)
- Optimal Decision Diagrams for Classification [68.72078059880018]
We study the training of optimal decision diagrams from a mathematical programming perspective.
We introduce a novel mixed-integer linear programming model for training.
We show how this model can be easily extended for fairness, parsimony, and stability notions.
arXiv Detail & Related papers (2022-05-28T18:31:23Z)
- BCD Nets: Scalable Variational Approaches for Bayesian Causal Discovery [97.79015388276483]
A structural equation model (SEM) is an effective framework to reason over causal relationships represented via a directed acyclic graph (DAG).
Recent advances enabled effective maximum-likelihood point estimation of DAGs from observational data.
We propose BCD Nets, a variational framework for estimating a distribution over DAGs characterizing a linear-Gaussian SEM.
arXiv Detail & Related papers (2021-12-06T03:35:21Z)
- Generalizing Graph Neural Networks on Out-Of-Distribution Graphs [51.33152272781324]
Graph Neural Networks (GNNs) are proposed without considering the distribution shifts between training and testing graphs.
In such a setting, GNNs tend to exploit subtle statistical correlations in the training set for predictions, even when those correlations are spurious.
We propose a general causal representation framework, called StableGNN, to eliminate the impact of spurious correlations.
arXiv Detail & Related papers (2021-11-20T18:57:18Z)
- A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models.
This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data.
We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
- Treeging [0.0]
Treeging combines the flexible mean structure of regression trees with the covariance-based prediction strategy of kriging into the base learner of an ensemble prediction algorithm.
We investigate the predictive accuracy of treeging across a thorough and widely varied battery of spatial and space-time simulation scenarios.
arXiv Detail & Related papers (2021-10-03T17:48:18Z)
- Learning compositional structures for semantic graph parsing [81.41592892863979]
We show how AM dependency parsing can be trained directly on a neural latent-variable model.
Our model picks up on several linguistic phenomena on its own and achieves comparable accuracy to supervised training.
arXiv Detail & Related papers (2021-06-08T14:20:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.