Factor-augmented tree ensembles
- URL: http://arxiv.org/abs/2111.14000v6
- Date: Mon, 12 Jun 2023 19:37:37 GMT
- Title: Factor-augmented tree ensembles
- Authors: Filippo Pellegrino
- Abstract summary: This manuscript proposes to extend the information set of time-series regression trees with latent stationary factors extracted via state-space methods.
It allows to handle predictors that exhibit measurement error, non-stationary trends, seasonality and/or irregularities such as missing observations.
Empirically, ensembles of these factor-augmented trees provide a reliable approach for macro-finance problems.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This manuscript proposes to extend the information set of time-series
regression trees with latent stationary factors extracted via state-space
methods. In doing so, this approach generalises time-series regression trees on
two dimensions. First, it allows to handle predictors that exhibit measurement
error, non-stationary trends, seasonality and/or irregularities such as missing
observations. Second, it gives a transparent way for using domain-specific
theory to inform time-series regression trees. Empirically, ensembles of these
factor-augmented trees provide a reliable approach for macro-finance problems.
This article highlights it focussing on the lead-lag effect between equity
volatility and the business cycle in the United States.
Related papers
- Ensembles of Probabilistic Regression Trees [46.53457774230618]
Tree-based ensemble methods have been successfully used for regression problems in many applications and research studies.
We study ensemble versions of probabilisticregression trees that provide smooth approximations of the objective function by assigningeach observation to each region with respect to a probability distribution.
arXiv Detail & Related papers (2024-06-20T06:51:51Z) - A Study of Posterior Stability for Time-Series Latent Diffusion [65.95306174480034]
We first explain that posterior collapse reduces latent diffusion to a VAE, making it less expressive.
We introduce the notion of dependency measures, showing that the latent variable sampled from the diffusion model loses control of the generation process.
We also analyze the causes of posterior collapse and introduce a new framework based on this analysis, which addresses the problem and supports a more expressive prior distribution.
arXiv Detail & Related papers (2024-05-22T21:54:12Z) - Why do Random Forests Work? Understanding Tree Ensembles as
Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z) - Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse
Actions, Interventions and Sparse Temporal Dependencies [58.179981892921056]
This work introduces a novel principle for disentanglement we call mechanism sparsity regularization.
We propose a representation learning method that induces disentanglement by simultaneously learning the latent factors.
We show that the latent factors can be recovered by regularizing the learned causal graph to be sparse.
arXiv Detail & Related papers (2024-01-10T02:38:21Z) - A U-turn on Double Descent: Rethinking Parameter Counting in Statistical
Learning [68.76846801719095]
We show that double descent appears exactly when and where it occurs, and that its location is not inherently tied to the threshold p=n.
This provides a resolution to tensions between double descent and statistical intuition.
arXiv Detail & Related papers (2023-10-29T12:05:39Z) - Prediction Algorithms Achieving Bayesian Decision Theoretical Optimality
Based on Decision Trees as Data Observation Processes [1.2774526936067927]
This paper uses trees to represent data observation processes behind given data.
We derive the statistically optimal prediction, which is robust against overfitting.
We solve this by a Markov chain Monte Carlo method, whose step size is adaptively tuned according to a posterior distribution for the trees.
arXiv Detail & Related papers (2023-06-12T12:14:57Z) - SETAR-Tree: A Novel and Accurate Tree Algorithm for Global Time Series
Forecasting [7.206754802573034]
In this paper, we explore the close connections between TAR models and regression trees.
We introduce a new forecasting-specific tree algorithm that trains global Pooled Regression (PR) models in the leaves.
In our evaluation, the proposed tree and forest models are able to achieve significantly higher accuracy than a set of state-of-the-art tree-based algorithms.
arXiv Detail & Related papers (2022-11-16T04:30:42Z) - Lassoed Tree Boosting [53.56229983630983]
We prove that a gradient boosted tree algorithm with early stopping faster than $n-1/4$ L2 convergence in the large nonparametric space of cadlag functions of bounded sectional variation.
Our convergence proofs are based on a novel, general theorem on early stopping with empirical loss minimizers of nested Donsker classes.
arXiv Detail & Related papers (2022-05-22T00:34:41Z) - Data-driven advice for interpreting local and global model predictions
in bioinformatics problems [17.685881417954782]
Conditional feature contributions (CFCs) provide textitlocal, case-by-case explanations of a prediction.
We compare the explanations computed by both methods on a set of 164 publicly available classification problems.
For random forests, we find extremely high similarities and correlations of both local and global SHAP values and CFC scores.
arXiv Detail & Related papers (2021-08-13T12:41:39Z) - Large Scale Prediction with Decision Trees [9.917147243076645]
This paper shows that decision trees constructed with Classification and Regression Trees (CART) and C4.5 methodology are consistent for regression and classification tasks.
A key step in the analysis is the establishment of an oracle inequality, which allows for a precise characterization of the goodness-of-fit and complexity tradeoff for a mis-specified model.
arXiv Detail & Related papers (2021-04-28T16:59:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.