Analyze Additive and Interaction Effects via Collaborative Trees
- URL: http://arxiv.org/abs/2405.11477v1
- Date: Sun, 19 May 2024 08:03:13 GMT
- Title: Analyze Additive and Interaction Effects via Collaborative Trees
- Authors: Chien-Ming Chi
- Abstract summary: We present Collaborative Trees, a novel tree model designed for regression prediction, along with its bagging version.
We decompose the mean decrease in impurity from the proposed tree model to analyze the additive and interaction effects of features on the response variable.
We show that Collaborative Trees, built upon a ``sum of trees'' approach with our own innovative tree model regularization, exhibit characteristics akin to matching pursuit.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present Collaborative Trees, a novel tree model designed for regression prediction, along with its bagging version, which aims to analyze complex statistical associations between features and uncover potential patterns inherent in the data. We decompose the mean decrease in impurity from the proposed tree model to analyze the additive and interaction effects of features on the response variable. Additionally, we introduce network diagrams to visually depict how each feature contributes additively to the response and how pairs of features contribute interaction effects. Through a detailed demonstration using an embryo growth dataset, we illustrate how the new statistical tools aid data analysis, both visually and numerically. Moreover, we delve into critical aspects of tree modeling, such as prediction performance, inference stability, and bias in feature importance measures, leveraging real datasets and simulation experiments for comprehensive discussions. On the theory side, we show that Collaborative Trees, built upon a ``sum of trees'' approach with our own innovative tree model regularization, exhibit characteristics akin to matching pursuit, under the assumption of high-dimensional independent binary input features (or one-hot feature groups). This newfound link sheds light on the superior capability of our tree model in estimating additive effects of features, a crucial factor for accurate interaction effect estimation.
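The impurity decrease that the abstract refers to can be illustrated for a single split. The sketch below is the standard variance-based impurity decrease for a regression split on one binary feature, not the paper's Collaborative Trees decomposition into additive and interaction effects. The function names and toy data are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def variance(ys):
    """Population variance, used here as the regression impurity."""
    if not ys:
        return 0.0
    mu = fmean(ys)
    return fmean((y - mu) ** 2 for y in ys)


def impurity_decrease(xs, ys):
    """Weighted drop in variance from splitting on a binary feature xs."""
    left = [y for x, y in zip(xs, ys) if x == 0]
    right = [y for x, y in zip(xs, ys) if x == 1]
    n = len(ys)
    weighted_children = (len(left) / n) * variance(left) + (len(right) / n) * variance(right)
    return variance(ys) - weighted_children


# Hypothetical toy data: the response depends only on x1; x2 is irrelevant.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [0.0, 0.0, 2.0, 2.0]

gain_x1 = impurity_decrease(x1, y)  # splitting on x1 removes all variance
gain_x2 = impurity_decrease(x2, y)  # splitting on x2 removes none
print(gain_x1, gain_x2)
```

In a full tree, a feature's mean decrease in impurity accumulates such gains over every node that splits on it; the paper's contribution is a further decomposition of that quantity, which this sketch does not attempt.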
Related papers
- Forecasting with Hyper-Trees [50.72190208487953]
Hyper-Trees are designed to learn the parameters of time series models.
By relating the parameters of a target time series model to features, Hyper-Trees also address the issue of parameter non-stationarity.
In this novel approach, the trees first generate informative representations from the input features, which a shallow network then maps to the target model parameters.
arXiv Detail & Related papers (2024-05-13T15:22:15Z) - Feature graphs for interpretable unsupervised tree ensembles: centrality, interaction, and application in disease subtyping [0.24578723416255746]
Feature selection assumes a pivotal role in enhancing model interpretability.
The accuracy gained from aggregating decision trees comes at the expense of interpretability.
The study introduces novel methods to construct feature graphs from unsupervised random forests.
arXiv Detail & Related papers (2024-04-27T12:47:37Z) - Why do Random Forests Work? Understanding Tree Ensembles as
Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z) - Beyond TreeSHAP: Efficient Computation of Any-Order Shapley Interactions
for Tree Ensembles [6.664930499708017]
The Shapley value (SV) is a concept in explainable artificial intelligence (XAI) research for quantifying additive feature attributions of predictions.
We present TreeSHAP-IQ, an efficient method to compute any-order additive Shapley interactions for predictions of tree-based models.
arXiv Detail & Related papers (2024-01-22T16:08:41Z) - On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z) - Asymmetric feature interaction for interpreting model predictions [13.934784414106087]
In natural language processing, deep neural networks (DNNs) could model complex interactions between context.
We propose an asymmetric feature interaction attribution model that aims to explore asymmetric higher-order feature interactions.
Experimental results on two sentiment classification datasets show the superiority of our model against the state-of-the-art feature interaction attribution methods.
arXiv Detail & Related papers (2023-05-12T03:31:24Z) - Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approximating the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z) - CausalDialogue: Modeling Utterance-level Causality in Conversations [83.03604651485327]
We have compiled and expanded upon a new dataset called CausalDialogue through crowd-sourcing.
This dataset includes multiple cause-effect pairs within a directed acyclic graph (DAG) structure.
We propose a causality-enhanced method called Exponential Maximum Average Treatment Effect (ExMATE) to enhance the impact of causality at the utterance level in training neural conversation models.
arXiv Detail & Related papers (2022-12-20T18:31:50Z) - Energy Trees: Regression and Classification With Structured and Mixed-Type Covariates [0.0]
Energy trees leverage energy statistics to extend the capabilities of conditional inference trees.
We show the model's competitive performance in terms of variable selection and robustness to overfitting.
We also assess the model's predictive ability through two empirical analyses involving human biological data.
arXiv Detail & Related papers (2022-07-10T10:41:51Z) - An Empirical Investigation of Commonsense Self-Supervision with Knowledge Graphs [67.23285413610243]
Self-supervision based on the information extracted from large knowledge graphs has been shown to improve the generalization of language models.
We study the effect of knowledge sampling strategies and sizes that can be used to generate synthetic data for adapting language models.
arXiv Detail & Related papers (2022-05-21T19:49:04Z) - Causal Discovery in Physical Systems from Videos [123.79211190669821]
Causal discovery is at the core of human cognition.
We consider the task of causal discovery from videos in an end-to-end fashion without supervision on the ground-truth graph structure.
arXiv Detail & Related papers (2020-07-01T17:29:57Z)