Random Forests for time-fixed and time-dependent predictors: The DynForest R package
- URL: http://arxiv.org/abs/2302.02670v2
- Date: Thu, 11 Apr 2024 08:14:05 GMT
- Title: Random Forests for time-fixed and time-dependent predictors: The DynForest R package
- Authors: Anthony Devaux, Cécile Proust-Lima, Robin Genuer
- Abstract summary: DynForest implements random forests for predicting a time-to-event outcome.
Time-dependent predictors can be endogenous (i.e., impacted by the outcome process).
DynForest computes variable importance and minimal depth to inform on the most predictive variables or groups of variables.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The R package DynForest implements random forests for predicting a continuous, a categorical or a (multiple causes) time-to-event outcome based on time-fixed and time-dependent predictors. The main originality of DynForest is that it handles time-dependent predictors that can be endogenous (i.e., impacted by the outcome process), measured with error and measured at subject-specific times. At each recursive step of the tree building process, the time-dependent predictors are internally summarized into individual features on which the split can be done. This is achieved using flexible linear mixed models (fitted with the R package lcmm) whose specification is defined by the user. DynForest returns the mean for a continuous outcome, the majority-vote category for a categorical outcome, or the cumulative incidence function over time for a survival outcome. DynForest also computes variable importance and minimal depth to inform on the most predictive variables or groups of variables. This paper aims to guide the user with step-by-step examples for fitting random forests using DynForest.
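The abstract describes two mechanisms that a small example may make more concrete: summarizing each subject's irregularly timed measurements into a few individual features, and aggregating tree predictions by mean or majority vote. The Python sketch below is illustrative only and is not the DynForest API. It assumes a per-subject least-squares line (intercept and slope) as a simplified stand-in for the linear mixed models the package fits, and all function names are hypothetical.

```python
from collections import Counter, defaultdict
from statistics import mean


def summarize_trajectories(records):
    """Summarize (subject_id, time, value) rows into per-subject features.

    ASSUMPTION: a per-subject least-squares line stands in for the mixed
    model; a real mixed model would pool information across subjects.
    Returns {subject_id: (intercept, slope)}.
    """
    by_subject = defaultdict(list)
    for subject_id, time, value in records:
        by_subject[subject_id].append((time, value))

    features = {}
    for subject_id, obs in by_subject.items():
        t_mean = mean(t for t, _ in obs)
        y_mean = mean(y for _, y in obs)
        sxx = sum((t - t_mean) ** 2 for t, _ in obs)
        sxy = sum((t - t_mean) * (y - y_mean) for t, y in obs)
        # With a single visit (or identical times) the slope is undefined;
        # this sketch falls back to a flat trajectory.
        slope = sxy / sxx if sxx > 0 else 0.0
        features[subject_id] = (y_mean - slope * t_mean, slope)
    return features


def aggregate_continuous(tree_predictions):
    """Forest prediction for a continuous outcome: mean over trees."""
    return mean(tree_predictions)


def aggregate_categorical(tree_predictions):
    """Forest prediction for a categorical outcome: majority vote."""
    return Counter(tree_predictions).most_common(1)[0][0]


# Subject "a" has three visits, subject "b" only one, at different times.
records = [("a", 0.0, 1.0), ("a", 1.0, 3.0), ("a", 2.0, 5.0), ("b", 0.5, 2.0)]
features = summarize_trajectories(records)
```

Once each subject is reduced to a fixed-length feature tuple, a tree can split on those features like on any time-fixed predictor, which is the role the abstract assigns to the internal summaries.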
Related papers
- Timer-XL: Long-Context Transformers for Unified Time Series Forecasting [67.83502953961505]
We present Timer-XL, a causal Transformer for unified time series forecasting.
Based on large-scale pre-training, Timer-XL achieves state-of-the-art zero-shot performance.
arXiv Detail & Related papers (2024-10-07T07:27:39Z) - missForestPredict -- Missing data imputation for prediction settings [2.8461446020965435]
missForestPredict is a fast and user-friendly adaptation of the missForest imputation algorithm.
missForestPredict offers extended error monitoring and control over variables used in the imputation.
missForestPredict provides competitive results in prediction settings within short computation times.
arXiv Detail & Related papers (2024-07-02T17:45:46Z) - Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - Variational Deep Survival Machines: Survival Regression with Censored Outcomes [11.82370259688716]
Survival regression aims to predict the time when an event of interest will take place, typically a death or a failure.
We present a novel method to predict the survival time by better clustering the survival data and combining primitive distributions.
arXiv Detail & Related papers (2024-04-24T02:16:00Z) - TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables [75.83318701911274]
TimeXer ingests external information to enhance the forecasting of endogenous variables.
TimeXer achieves consistent state-of-the-art performance on twelve real-world forecasting benchmarks.
arXiv Detail & Related papers (2024-02-29T11:54:35Z) - Random survival forests for competing risks with multivariate longitudinal endogenous covariates [0.0]
We propose an innovative solution to predict an event probability using a possibly large number of longitudinal predictors.
DynForest is an extension of random survival forests for competing risks that handles endogenous longitudinal predictors.
arXiv Detail & Related papers (2022-08-11T12:58:11Z) - TACTiS: Transformer-Attentional Copulas for Time Series [76.71406465526454]
Estimation of time-varying quantities is a fundamental component of decision making in fields such as healthcare and finance.
We propose a versatile method that estimates joint distributions using an attention-based decoder.
We show that our model produces state-of-the-art predictions on several real-world datasets.
arXiv Detail & Related papers (2022-02-07T21:37:29Z) - RFpredInterval: An R Package for Prediction Intervals with Random Forests and Boosted Forests [0.0]
We have developed a comprehensive R package, RFpredInterval, that integrates 16 methods to build prediction intervals with random forests and boosted forests.
The methods implemented in the package are a new method to build prediction intervals with boosted forests (PIBF) and 15 different variants to produce prediction intervals with random forests proposed by Roy and Larocque (2020).
The results show that the proposed method is very competitive and, globally, it outperforms the competing methods.
arXiv Detail & Related papers (2021-06-15T15:27:50Z) - Multivariate Probabilistic Regression with Natural Gradient Boosting [63.58097881421937]
We propose a Natural Gradient Boosting (NGBoost) approach based on nonparametrically modeling the conditional parameters of the multivariate predictive distribution.
Our method is robust, works out-of-the-box without extensive tuning, is modular with respect to the assumed target distribution, and performs competitively in comparison to existing approaches.
arXiv Detail & Related papers (2021-06-07T17:44:49Z) - Improving Event Duration Prediction via Time-aware Pre-training [90.74988936678723]
We introduce two effective models for duration prediction.
One model predicts the range/unit in which the duration value falls (R-pred), and the other predicts the exact duration value (E-pred).
Our best model, E-pred, substantially outperforms previous work and captures duration information more accurately than R-pred.
arXiv Detail & Related papers (2020-11-05T01:52:11Z) - Time-series Imputation and Prediction with Bi-Directional Generative Adversarial Networks [0.3162999570707049]
We present a model for the combined task of imputing and predicting values for irregularly observed and varying length time-series data with missing entries.
Our model learns how to impute missing elements in-between (imputation) or outside of the input time steps (prediction), hence working as an effective any-time prediction tool for time-series data.
arXiv Detail & Related papers (2020-09-18T15:47:51Z) - On the Discrepancy between Density Estimation and Sequence Generation [92.70116082182076]
Log-likelihood is highly correlated with BLEU when we consider models within the same family.
We observe no correlation between rankings of models across different families.
arXiv Detail & Related papers (2020-02-17T20:13:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.