FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data
With Correlated Features
- URL: http://arxiv.org/abs/2006.09693v1
- Date: Wed, 17 Jun 2020 07:28:11 GMT
- Title: FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data
With Correlated Features
- Authors: Yuancheng Xu, Athanasse Zafirov, R. Michael Alvarez, Dan Kojis, Min
Tan and Christina M. Ramirez
- Abstract summary: FREEtree is a tree-based method for high dimensional longitudinal data with correlated features.
It exploits the network structure of the features by first clustering them using weighted correlation network analysis.
It then conducts a screening step within each cluster of features and a selection step among the surviving features.
- Score: 2.00191482700544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes FREEtree, a tree-based method for high dimensional
longitudinal data with correlated features. Popular machine learning
approaches, like Random Forests, commonly used for variable selection do not
perform well when there are correlated features and do not account for data
observed over time. FREEtree deals with longitudinal data by using a piecewise
random effects model. It also exploits the network structure of the features by
first clustering them using weighted correlation network analysis, namely
WGCNA. It then conducts a screening step within each cluster of features and a
selection step among the surviving features, that provides a relatively
unbiased way to select features. By using dominant principle components as
regression variables at each leaf and the original features as splitting
variables at splitting nodes, FREEtree maintains its interpretability and
improves its computational efficiency. The simulation results show that
FREEtree outperforms other tree-based methods in terms of prediction accuracy,
feature selection accuracy, as well as the ability to recover the underlying
structure.
Related papers
- Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a new framework based on large language models (LLMs) and decision Tree reasoning (OCTree)
Our key idea is to leverage LLMs' reasoning capabilities to find good feature generation rules without manually specifying the search space.
Our empirical results demonstrate that this simple framework consistently enhances the performance of various prediction models.
arXiv Detail & Related papers (2024-06-12T08:31:34Z) - Forecasting with Hyper-Trees [50.72190208487953]
Hyper-Trees are designed to learn the parameters of time series models.
By relating the parameters of a target time series model to features, Hyper-Trees also address the issue of parameter non-stationarity.
In this novel approach, the trees first generate informative representations from the input features, which a shallow network then maps to the target model parameters.
arXiv Detail & Related papers (2024-05-13T15:22:15Z) - Unboxing Tree Ensembles for interpretability: a hierarchical
visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior.
The proposed model is effective in yielding a shallow interpretable tree approxing the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z) - SETAR-Tree: A Novel and Accurate Tree Algorithm for Global Time Series
Forecasting [7.206754802573034]
In this paper, we explore the close connections between TAR models and regression trees.
We introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves.
In our evaluation, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms.
arXiv Detail & Related papers (2022-11-16T04:30:42Z) - Individualized and Global Feature Attributions for Gradient Boosted
Trees in the Presence of $\ell_2$ Regularization [0.0]
We propose Prediction Decomposition (PreDecomp), a novel individualized feature attribution for boosted trees when they are trained with $ell$ regularization.
We also propose TreeInner, a family of debiased global feature attributions defined in terms of the inner product between any individualized feature attribution and labels on out-sample data for each tree.
arXiv Detail & Related papers (2022-11-08T17:56:22Z) - Neural Eigenfunctions Are Structured Representation Learners [93.53445940137618]
This paper introduces a structured, adaptive-length deep representation called Neural Eigenmap.
We show that, when the eigenfunction is derived from positive relations in a data augmentation setup, applying NeuralEF results in an objective function.
We demonstrate using such representations as adaptive-length codes in image retrieval systems.
arXiv Detail & Related papers (2022-10-23T07:17:55Z) - TreeFlow: Going beyond Tree-based Gaussian Probabilistic Regression [0.0]
We introduce TreeFlow, the tree-based approach that combines the benefits of using tree ensembles with the capabilities of modeling flexible probability distributions.
We evaluate the proposed method on challenging regression benchmarks with varying volume, feature characteristics, and target dimensionality.
arXiv Detail & Related papers (2022-06-08T20:06:23Z) - Hierarchical Shrinkage: improving the accuracy and interpretability of
tree-based methods [10.289846887751079]
We introduce Hierarchical Shrinkage (HS), a post-hoc algorithm that does not modify the tree structure.
HS substantially increases the predictive performance of decision trees, even when used in conjunction with other regularization techniques.
All code and models are released in a full-fledged package available on Github.
arXiv Detail & Related papers (2022-02-02T02:43:23Z) - Visualizing hierarchies in scRNA-seq data using a density tree-biased
autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data.
We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z) - Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision.
We present a reinforcement learning solver for GM i.e. RGM that seeks the node correspondence between pairwise graphs.
Our method differs from the previous deep graph matching model in the sense that they are focused on the front-end feature extraction and affinity function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z) - Rethinking Learnable Tree Filter for Generic Feature Transform [71.77463476808585]
Learnable Tree Filter presents a remarkable approach to model structure-preserving relations for semantic segmentation.
To relax the geometric constraint, we give the analysis by reformulating it as a Markov Random Field and introduce a learnable unary term.
For semantic segmentation, we achieve leading performance (82.1% mIoU) on the Cityscapes benchmark without bells-and-whistles.
arXiv Detail & Related papers (2020-12-07T07:16:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.