Related papers: FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features

FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features

URL: http://arxiv.org/abs/2006.09693v1
Date: Wed, 17 Jun 2020 07:28:11 GMT
Title: FREEtree: A Tree-based Approach for High Dimensional Longitudinal Data With Correlated Features
Authors: Yuancheng Xu, Athanasse Zafirov, R. Michael Alvarez, Dan Kojis, Min Tan and Christina M. Ramirez
Abstract summary: FREEtree is a tree-based method for high dimensional longitudinal data with correlated features. It exploits the network structure of the features by first clustering them using weighted correlation network analysis. It then conducts a screening step within each cluster of features and a selection step among the surviving features.
Score: 2.00191482700544
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: This paper proposes FREEtree, a tree-based method for high dimensional longitudinal data with correlated features. Popular machine learning approaches, like Random Forests, commonly used for variable selection do not perform well when there are correlated features and do not account for data observed over time. FREEtree deals with longitudinal data by using a piecewise random effects model. It also exploits the network structure of the features by first clustering them using weighted correlation network analysis, namely WGCNA. It then conducts a screening step within each cluster of features and a selection step among the surviving features, that provides a relatively unbiased way to select features. By using dominant principle components as regression variables at each leaf and the original features as splitting variables at splitting nodes, FREEtree maintains its interpretability and improves its computational efficiency. The simulation results show that FREEtree outperforms other tree-based methods in terms of prediction accuracy, feature selection accuracy, as well as the ability to recover the underlying structure.

Related papers

Learning Order Forest for Qualitative-Attribute Data Clustering [52.612779710298526]
This paper discovers a tree-like distance structure to flexibly represent the local order relationship among intra-attribute qualitative values.<n>A joint learning mechanism is proposed to iteratively obtain more appropriate tree structures and clusters.<n>Experiments demonstrate that the joint learning adapts the forest to the clustering task to yield accurate results.
arXiv Detail & Related papers (2026-03-03T07:49:50Z)
Hierarchical Quantized Diffusion Based Tree Generation Method for Hierarchical Representation and Lineage Analysis [49.00783841494125]
HDTree captures tree relationships within a hierarchical latent space using a unified hierarchical codebook and quantized diffusion processes.<n> HDTree's effectiveness is demonstrated through comparisons on both general-purpose and single-cell datasets.<n>These contributions provide a new tool for hierarchical lineage analysis, enabling more accurate and efficient modeling of cellular differentiation paths.
arXiv Detail & Related papers (2025-06-29T15:19:13Z)
Soft decision trees for survival analysis [2.1506382989223782]
We propose a new soft survival tree model (SST) with a soft splitting rule at each branch node.<n>SST and the training formulation combine flexibility with interpretability.<n>SSTs outperform three benchmark survival trees in terms of four widely-used discrimination and calibration measures.
arXiv Detail & Related papers (2025-06-20T08:51:33Z)
RO-FIGS: Efficient and Expressive Tree-Based Ensembles for Tabular Data [10.610270769561811]
Tree-based models are robust to uninformative features and can accurately capture non-smooth, complex decision boundaries. We propose Random oblique Fast Interpretable Greedy-Tree Sums (RO-FIGS) RO-FIGS builds on Fast Interpretable Greedy-Tree Sums, and extends it by learning trees with oblique or multivariate splits. We evaluate RO-FIGS on 22 real-world datasets, demonstrating superior performance and much smaller models over other tree- and neural network-based methods.
arXiv Detail & Related papers (2025-04-09T14:35:24Z)
Experiments with Optimal Model Trees [2.8391355909797644]
We show that globally optimal model trees can achieve competitive accuracy with very small trees. We also compare to classic optimal and greedily grown decision trees, random forests, and support vector machines.
arXiv Detail & Related papers (2025-03-17T08:03:47Z)
Learning Decision Trees as Amortized Structure Inference [59.65621207449269]
We propose a hybrid amortized structure inference approach to learn predictive decision tree ensembles given data. We show that our approach, DT-GFN, outperforms state-of-the-art decision tree and deep learning methods on standard classification benchmarks.
arXiv Detail & Related papers (2025-03-10T07:05:07Z)
ReTreever: Tree-based Coarse-to-Fine Representations for Retrieval [64.44265315244579]
We propose a tree-based method for organizing and representing reference documents at various granular levels. Our method, called ReTreever, jointly learns a routing function per internal node of a binary tree such that query and reference documents are assigned to similar tree branches. Our evaluations show that ReTreever generally preserves full representation accuracy.
arXiv Detail & Related papers (2025-02-11T21:35:13Z)
Optimized Feature Generation for Tabular Data via LLMs with Decision Tree Reasoning [53.241569810013836]
We propose a novel framework that utilizes large language models (LLMs) to identify effective feature generation rules. We use decision trees to convey this reasoning information, as they can be easily represented in natural language. OCTree consistently enhances the performance of various prediction models across diverse benchmarks.
arXiv Detail & Related papers (2024-06-12T08:31:34Z)
Forecasting with Hyper-Trees [50.72190208487953]
Hyper-Trees are designed to learn the parameters of time series models. By relating the parameters of a target time series model to features, Hyper-Trees also address the issue of parameter non-stationarity. In this novel approach, the trees first generate informative representations from the input features, which a shallow network then maps to the target model parameters.
arXiv Detail & Related papers (2024-05-13T15:22:15Z)
Unboxing Tree Ensembles for interpretability: a hierarchical visualization tool and a multivariate optimal re-built tree [0.34530027457862006]
We develop an interpretable representation of a tree-ensemble model that can provide valuable insights into its behavior. The proposed model is effective in yielding a shallow interpretable tree approxing the tree-ensemble decision function.
arXiv Detail & Related papers (2023-02-15T10:43:31Z)
SETAR-Tree: A Novel and Accurate Tree Algorithm for Global Time Series Forecasting [7.206754802573034]
In this paper, we explore the close connections between TAR models and regression trees. We introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves. In our evaluation, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms.
arXiv Detail & Related papers (2022-11-16T04:30:42Z)
Individualized and Global Feature Attributions for Gradient Boosted Trees in the Presence of $\ell_2$ Regularization [0.0]
We propose Prediction Decomposition (PreDecomp), a novel individualized feature attribution for boosted trees when they are trained with $ell$ regularization. We also propose TreeInner, a family of debiased global feature attributions defined in terms of the inner product between any individualized feature attribution and labels on out-sample data for each tree.
arXiv Detail & Related papers (2022-11-08T17:56:22Z)
Neural Eigenfunctions Are Structured Representation Learners [93.53445940137618]
This paper introduces a structured, adaptive-length deep representation called Neural Eigenmap. We show that, when the eigenfunction is derived from positive relations in a data augmentation setup, applying NeuralEF results in an objective function. We demonstrate using such representations as adaptive-length codes in image retrieval systems.
arXiv Detail & Related papers (2022-10-23T07:17:55Z)
TreeFlow: Going beyond Tree-based Gaussian Probabilistic Regression [0.0]
We introduce TreeFlow, the tree-based approach that combines the benefits of using tree ensembles with the capabilities of modeling flexible probability distributions. We evaluate the proposed method on challenging regression benchmarks with varying volume, feature characteristics, and target dimensionality.
arXiv Detail & Related papers (2022-06-08T20:06:23Z)
Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods [10.289846887751079]
We introduce Hierarchical Shrinkage (HS), a post-hoc algorithm that does not modify the tree structure. HS substantially increases the predictive performance of decision trees, even when used in conjunction with other regularization techniques. All code and models are released in a full-fledged package available on Github.
arXiv Detail & Related papers (2022-02-02T02:43:23Z)
Visualizing hierarchies in scRNA-seq data using a density tree-biased autoencoder [50.591267188664666]
We propose an approach for identifying a meaningful tree structure from high-dimensional scRNA-seq data. We then introduce DTAE, a tree-biased autoencoder that emphasizes the tree structure of the data in low dimensional space.
arXiv Detail & Related papers (2021-02-11T08:48:48Z)
Deep Reinforcement Learning of Graph Matching [63.469961545293756]
Graph matching (GM) under node and pairwise constraints has been a building block in areas from optimization to computer vision. We present a reinforcement learning solver for GM i.e. RGM that seeks the node correspondence between pairwise graphs. Our method differs from the previous deep graph matching model in the sense that they are focused on the front-end feature extraction and affinity function learning.
arXiv Detail & Related papers (2020-12-16T13:48:48Z)
Rethinking Learnable Tree Filter for Generic Feature Transform [71.77463476808585]
Learnable Tree Filter presents a remarkable approach to model structure-preserving relations for semantic segmentation. To relax the geometric constraint, we give the analysis by reformulating it as a Markov Random Field and introduce a learnable unary term. For semantic segmentation, we achieve leading performance (82.1% mIoU) on the Cityscapes benchmark without bells-and-whistles.
arXiv Detail & Related papers (2020-12-07T07:16:47Z)

This list is automatically generated from the titles and abstracts of the papers in this site.