Related papers: Online learning techniques for prediction of temporal tabular datasets with regime changes

URL: http://arxiv.org/abs/2301.00790v4
Date: Thu, 10 Aug 2023 14:26:00 GMT
Title: Online learning techniques for prediction of temporal tabular datasets with regime changes
Authors: Thomas Wong and Mauricio Barahona
Abstract summary: We propose a modular machine learning pipeline for ranking predictions on temporal panel datasets. The modularity of the pipeline allows the use of different models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks. Online learning techniques, which require no retraining of models, can be used post-prediction to enhance the results.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The application of deep learning to non-stationary temporal datasets can lead to overfitted models that underperform under regime changes. In this work, we propose a modular machine learning pipeline for ranking predictions on temporal panel datasets which is robust under regime changes. The modularity of the pipeline allows the use of different models, including Gradient Boosting Decision Trees (GBDTs) and Neural Networks, with and without feature engineering. We evaluate our framework on financial data for stock portfolio prediction, and find that GBDT models with dropout display high performance, robustness and generalisability with reduced complexity and computational cost. We then demonstrate how online learning techniques, which require no retraining of models, can be used post-prediction to enhance the results. First, we show that dynamic feature projection improves robustness by reducing drawdown in regime changes. Second, we demonstrate that dynamical model ensembling based on selection of models with good recent performance leads to improved Sharpe and Calmar ratios of out-of-sample predictions. We also evaluate the robustness of our pipeline across different data splits and random seeds with good reproducibility.
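The post-prediction ensembling step described in the abstract, selecting models with good recent performance and scoring the result by Sharpe and Calmar ratios, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `window` and `top_k` parameters, the per-period score matrix, and the simplified un-annualised ratio definitions are all assumptions made for illustration.

```python
import numpy as np


def sharpe_ratio(returns):
    """Mean divided by standard deviation of per-period returns (not annualised)."""
    returns = np.asarray(returns, dtype=float)
    sd = returns.std(ddof=0)
    return returns.mean() / sd if sd > 0 else 0.0


def max_drawdown(returns):
    """Largest peak-to-trough drop of the additive cumulative return curve."""
    curve = np.concatenate(([0.0], np.cumsum(np.asarray(returns, dtype=float))))
    running_peak = np.maximum.accumulate(curve)
    return float((running_peak - curve).max())


def calmar_ratio(returns):
    """Simplified Calmar ratio: mean per-period return over maximum drawdown."""
    drawdown = max_drawdown(returns)
    return float(np.mean(returns)) / drawdown if drawdown > 0 else float("inf")


def select_recent_best(scores, window, top_k):
    """For each period t >= window, pick the top_k models ranked by their mean
    score over the previous `window` periods (strictly earlier than t, so no
    look-ahead). `scores` is a hypothetical (periods x models) array."""
    scores = np.asarray(scores, dtype=float)
    n_periods, _ = scores.shape
    selections = {}
    for t in range(window, n_periods):
        recent_mean = scores[t - window:t].mean(axis=0)
        order = np.argsort(-recent_mean, kind="stable")
        selections[t] = sorted(order[:top_k].tolist())
    return selections


# Toy data (assumed): model 0 does well early, model 2 does well late.
toy_scores = np.array([
    [0.05, 0.01, -0.02],
    [0.04, 0.00, -0.01],
    [0.03, 0.01, 0.00],
    [-0.02, 0.01, 0.04],
    [-0.03, 0.00, 0.05],
    [-0.01, 0.01, 0.06],
])
toy_selection = select_recent_best(toy_scores, window=2, top_k=1)
```

On the toy data the selection switches from model 0 to model 2 once the rolling window reflects the change in regime. No model is retrained; only the choice of which existing predictions to average changes over time. In practice the predictions of the selected models would be averaged for each period before scoring.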

Related papers

Exploring Patterns Behind Sports [3.2838877620203935]
This paper presents a comprehensive framework for time series prediction using a hybrid model that combines ARIMA and LSTM. The model incorporates feature engineering techniques, including embedding and PCA, to transform raw data into a lower-dimensional representation.
arXiv Detail & Related papers (2025-02-11T11:51:07Z)
Supervised Score-Based Modeling by Gradient Boosting [49.556736252628745]
We propose a Supervised Score-based Model (SSM) which can be viewed as a gradient boosting algorithm combining score matching. We provide a theoretical analysis of learning and sampling for SSM to balance inference time and prediction accuracy. Our model outperforms existing models in both accuracy and inference time.
arXiv Detail & Related papers (2024-11-02T07:06:53Z)
Graph-enabled Reinforcement Learning for Time Series Forecasting with Adaptive Intelligence [11.249626785206003]
We propose a novel approach for predicting time-series data using a Graphical neural network (GNN) and monitoring with Reinforcement Learning (RL). GNNs are able to explicitly incorporate the graph structure of the data into the model, allowing them to capture temporal dependencies in a more natural way. This approach allows for more accurate predictions in complex temporal structures, such as those found in healthcare, traffic and weather forecasting.
arXiv Detail & Related papers (2023-09-18T22:25:12Z)
SHAPNN: Shapley Value Regularized Tabular Neural Network [4.587122314291091]
We present SHAPNN, a novel deep data modeling architecture designed for supervised learning. Our neural network is trained using standard backward propagation optimization methods, and is regularized with Shapley values estimated in real time. We evaluate our method on various publicly available datasets and compare it with state-of-the-art deep neural network models.
arXiv Detail & Related papers (2023-09-15T22:45:05Z)
Kalman Filter for Online Classification of Non-Stationary Data [101.26838049872651]
In Online Continual Learning (OCL) a learning system receives a stream of data and sequentially performs prediction and training steps. We introduce a probabilistic Bayesian online learning model by using a neural representation and a state space model over the linear predictor weights. In experiments in multi-class classification we demonstrate the predictive ability of the model and its flexibility to capture non-stationarity.
arXiv Detail & Related papers (2023-06-14T11:41:42Z)
Transfer Learning in Deep Learning Models for Building Load Forecasting: Case of Limited Data [0.0]
This paper proposes a Building-to-Building Transfer Learning framework to overcome the problem of limited data and enhance the performance of Deep Learning models. The proposed approach improved the forecasting accuracy by 56.8% compared to the case of conventional deep learning where training from scratch is used.
arXiv Detail & Related papers (2023-01-25T16:05:47Z)
Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers. We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task. The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature. We formulate a novel problem and a neural framework, Back2Future, that aims to refine a given model's predictions in real time.
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn. We then show that distillation performs strongly for low churn training against a number of recent baselines.
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
Anomaly Detection of Time Series with Smoothness-Inducing Sequential Variational Auto-Encoder [59.69303945834122]
We present a Smoothness-Inducing Sequential Variational Auto-Encoder (SISVAE) model for robust estimation and anomaly detection of time series. Our model parameterizes mean and variance for each time-stamp with flexible neural networks. We show the effectiveness of our model on both synthetic datasets and public real-world benchmarks.
arXiv Detail & Related papers (2021-02-02T06:15:15Z)
Reinforcement Learning based dynamic weighing of Ensemble Models for Time Series Forecasting [0.8399688944263843]
It is known that if the models selected for data modelling are distinct (linear/non-linear, static/dynamic) and independent (minimally correlated), the accuracy of the predictions is improved. Various approaches suggested in the literature to weigh the ensemble models use a static set of weights. To address this issue, a Reinforcement Learning (RL) approach is proposed to dynamically assign and update the weights of each of the models at different time instants.
arXiv Detail & Related papers (2020-08-20T10:40:42Z)
Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose. We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
Online Tensor-Based Learning for Multi-Way Data [1.0953917735844645]
A new efficient tensor-based feature extraction, named NeSGD, is proposed for online CANDECOMP/PARAFAC decomposition. Results show that the proposed methods significantly improved the classification error rates, were able to assimilate the changes in the positive data distribution over time, and maintained a high predictive accuracy in all case studies.
arXiv Detail & Related papers (2020-03-10T02:04:08Z)

This list is automatically generated from the titles and abstracts of the papers on this site.