StockBot 2.0: Vanilla LSTMs Outperform Transformer-based Forecasting for Stock Prices
- URL: http://arxiv.org/abs/2601.00197v1
- Date: Thu, 01 Jan 2026 04:09:51 GMT
- Title: StockBot 2.0: Vanilla LSTMs Outperform Transformer-based Forecasting for Stock Prices
- Authors: Shaswat Mohanty
- Abstract summary: We present an enhanced StockBot architecture that systematically evaluates modern attention-based, convolutional, and recurrent time-series forecasting models. A carefully constructed vanilla LSTM consistently achieves superior predictive accuracy and more stable buy/sell decision-making.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurate forecasting of financial markets remains a long-standing challenge due to complex temporal and often latent dependencies, non-linear dynamics, and high volatility. Building on our earlier recurrent neural network framework, we present an enhanced StockBot architecture that systematically evaluates modern attention-based, convolutional, and recurrent time-series forecasting models within a unified experimental setting. While attention-based and transformer-inspired models offer increased modeling flexibility, extensive empirical evaluation reveals that a carefully constructed vanilla LSTM consistently achieves superior predictive accuracy and more stable buy/sell decision-making when trained under a common set of default hyperparameters. These results highlight the robustness and data efficiency of recurrent sequence models for financial time-series forecasting, particularly when extensive hyperparameter tuning is impractical or data are scarce, as when prices are discretized to single-day intervals. They also underscore the importance of architectural inductive bias in data-limited market prediction tasks.
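The paper's exact model is not given here; as a rough illustration of the kind of "vanilla LSTM under default hyperparameters" the abstract describes, below is a minimal PyTorch sketch. The window length, layer sizes, optimizer settings, and the synthetic price series are all assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class VanillaLSTM(nn.Module):
    """Single-layer LSTM mapping a window of past prices to the next price."""
    def __init__(self, n_features=1, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, window, n_features)
        out, _ = self.lstm(x)              # out: (batch, window, hidden)
        return self.head(out[:, -1, :])    # predict from the last time step

def make_windows(series, window=30):
    """Slice a 1-D series into (input window, next value) training pairs."""
    xs = torch.stack([series[i:i + window] for i in range(len(series) - window)])
    return xs.unsqueeze(-1), series[window:].unsqueeze(-1)

prices = torch.randn(500).cumsum(0)        # stand-in for a real price series
prices = (prices - prices.mean()) / prices.std()
X, y = make_windows(prices)

model = VanillaLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)   # "default" settings
loss_fn = nn.MSELoss()
for _ in range(20):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
```

A buy/sell rule of the kind the abstract evaluates would then compare the predicted next price with the current one.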
Related papers
- Financial time series augmentation using transformer based GAN architecture
We show how Generative Adversarial Networks (GANs) can effectively serve as a data augmentation tool to overcome data scarcity in the financial domain. Specifically, we show that training a Long Short-Term Memory (LSTM) forecasting model on a dataset augmented with synthetic data generated by a transformer-based GAN (TTS-GAN) significantly improves forecasting accuracy compared to using real data alone (a minimal augmentation sketch follows this entry).
arXiv Detail & Related papers (2026-02-19T22:02:09Z)
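The augmentation recipe in the entry above amounts to mixing generator samples into the real training set. A minimal sketch, assuming a trained TTS-GAN-style generator is available as a callable; the generator interface and latent size are hypothetical:

```python
import torch

def augment(real_windows, generator, n_synth, latent_dim=128):
    """Concatenate real training windows with generator samples.

    `generator` is a hypothetical stand-in for a trained TTS-GAN: any
    callable mapping latent noise to (n, window, features) series.
    """
    z = torch.randn(n_synth, latent_dim)    # latent size is an assumption
    synthetic = generator(z).detach()       # (n_synth, window, features)
    return torch.cat([real_windows, synthetic], dim=0)

# The downstream LSTM forecaster is then trained on the augmented set
# exactly as it would be on real windows alone.
```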
- Robust Probabilistic Load Forecasting for a Single Household: A Comparative Study from SARIMA to Transformers on the REFIT Dataset
This paper tackles the household load forecasting challenge using the volatile REFIT dataset. We first conduct a rigorous comparative experiment to select a Seasonal Imputation method. We then systematically evaluate a hierarchy of models, progressing from classical baselines to machine learning. Our findings reveal that classical models fail to capture the data's non-linear, regime-switching behavior.
arXiv Detail & Related papers (2025-11-30T12:05:18Z)
- A Unified Frequency Domain Decomposition Framework for Interpretable and Robust Time Series Forecasting
Current approaches for time series forecasting, whether in the time or frequency domain, predominantly use deep learning models based on linear layers or transformers. We propose FIRE, a unified frequency domain decomposition framework that provides a mathematical abstraction for diverse types of time series. FIRE consistently outperforms state-of-the-art models on long-term forecasting benchmarks (a generic frequency-split sketch follows this entry).
arXiv Detail & Related papers (2025-10-11T09:59:25Z)
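FIRE's internals are not given in the summary above; the sketch below shows only the generic frequency-domain split that such decomposition frameworks build on. The fixed cutoff is an arbitrary simplification of what a learned decomposition would do.

```python
import numpy as np

def frequency_split(x, keep=8):
    """Split a series into a smooth low-frequency part and a residual.

    Keeps the `keep` lowest-frequency FFT coefficients as the trend-like
    component; everything else goes to the residual.
    """
    coeffs = np.fft.rfft(x)
    low = coeffs.copy()
    low[keep:] = 0                           # zero out high frequencies
    trend = np.fft.irfft(low, n=len(x))
    return trend, x - trend

t = np.linspace(0, 8 * np.pi, 256)
trend, residual = frequency_split(np.sin(t) + 0.2 * np.random.randn(256))
```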
- Dynamic Lagging for Time-Series Forecasting in E-Commerce Finance: Mitigating Information Loss with A Hybrid ML Architecture
We propose a hybrid forecasting framework that integrates dynamic lagged feature engineering and adaptive rolling-window representations. Our approach explicitly incorporates invoice-level behavioral modeling, structured lag of support data, and custom stability-aware loss functions (a lag-feature sketch follows this entry).
arXiv Detail & Related papers (2025-09-24T15:33:16Z)
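A minimal pandas sketch of the lagged-feature and rolling-window construction described above; the column name, lag set, and window sizes are illustrative rather than the paper's choices, and a truly dynamic scheme would pick them per series:

```python
import pandas as pd

def add_lag_features(df, col="amount", lags=(1, 7, 28), windows=(7, 28)):
    """Attach lagged values and rolling-window statistics to a daily series."""
    out = df.copy()
    for k in lags:
        out[f"{col}_lag{k}"] = out[col].shift(k)
    for w in windows:
        roll = out[col].shift(1).rolling(w)   # shift(1) avoids target leakage
        out[f"{col}_mean{w}"] = roll.mean()
        out[f"{col}_std{w}"] = roll.std()
    return out.dropna()
```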
- Enhancing Transformer-Based Foundation Models for Time Series Forecasting via Bagging, Boosting and Statistical Ensembles
Time series foundation models (TSFMs) have shown strong generalization and zero-shot capabilities for time series forecasting, anomaly detection, classification, and imputation. This paper investigates a suite of statistical and ensemble-based enhancement techniques to improve robustness and accuracy (a bagging sketch follows this entry).
arXiv Detail & Related papers (2025-08-18T04:06:26Z)
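Of the enhancement techniques named above, bagging is the simplest to sketch: train each ensemble member on a bootstrap resample of the history, then average the point forecasts. A minimal NumPy sketch; the `.predict()` interface is a hypothetical stand-in for any fitted forecaster:

```python
import numpy as np

def bootstrap_indices(n_rows, n_members, seed=0):
    """One bootstrap resample of the training rows per ensemble member."""
    rng = np.random.default_rng(seed)
    return [rng.integers(0, n_rows, size=n_rows) for _ in range(n_members)]

def bagged_forecast(models, x):
    """Average the point forecasts of the independently trained members."""
    return np.mean([m.predict(x) for m in models], axis=0)
```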
- Advanced Risk Prediction and Stability Assessment of Banks Using Time Series Transformer Models
This paper proposes a prediction framework based on the Time Series Transformer model. We compare the model with LSTM, GRU, CNN, TCN and RNN-Transformer models. The experimental results show that the Time Series Transformer model outperforms the other models on both the mean squared error (MSE) and mean absolute error (MAE) evaluation indicators (both metrics are defined below).
arXiv Detail & Related papers (2024-12-04T08:15:27Z)
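For reference, the two evaluation indicators used in the comparison above:

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error."""
    return float(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))
```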
- Parsimony or Capability? Decomposition Delivers Both in Long-term Time Series Forecasting
Long-term time series forecasting (LTSF) represents a critical frontier in time series analysis.
Our study demonstrates, through both analytical and empirical evidence, that decomposition is key to containing excessive model inflation.
Remarkably, by tailoring decomposition to the intrinsic dynamics of time series data, our proposed model outperforms existing benchmarks (a generic decomposition sketch follows this entry).
arXiv Detail & Related papers (2024-01-22T13:15:40Z)
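The paper's tailored decomposition is not specified in the summary; below is the generic moving-average trend/seasonal split that decomposition-based forecasters commonly start from (the kernel size is arbitrary and assumed odd):

```python
import numpy as np

def decompose(x, kernel=25):
    """Moving-average decomposition into a trend and a seasonal residual."""
    pad = kernel // 2                        # edge-pad so output keeps length
    padded = np.concatenate([np.full(pad, x[0]), x, np.full(pad, x[-1])])
    trend = np.convolve(padded, np.ones(kernel) / kernel, mode="valid")
    return trend, x - trend
```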
- Towards Long-Term Time-Series Forecasting: Feature, Pattern, and Distribution
Long-term time-series forecasting (LTTF) has become a pressing demand in many applications, such as wind power supply planning.
Transformer models have been adopted for their high prediction capacity, which comes from the computationally expensive self-attention mechanism.
We propose an efficient Transformer-based model, named Conformer, which differentiates itself from existing methods for LTTF in three aspects.
arXiv Detail & Related papers (2023-01-05T13:59:29Z)
- DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions
We propose DeepVol, a model based on Dilated Causal Convolutions that uses high-frequency data to forecast day-ahead volatility.
Our empirical results suggest that the proposed deep learning-based approach effectively learns global features from high-frequency data (a causal-convolution sketch follows this entry).
arXiv Detail & Related papers (2022-09-23T16:13:47Z)
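A minimal PyTorch sketch of the dilated causal convolution underlying DeepVol; the channel counts and depth are assumptions. Left padding keeps every output causal, and doubling the dilation per layer grows the receptive field exponentially, which is what lets long high-frequency windows be summarized compactly:

```python
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """1-D convolution that sees only past samples (via left padding)."""
    def __init__(self, c_in, c_out, kernel=3, dilation=1):
        super().__init__()
        self.pad = (kernel - 1) * dilation
        self.conv = nn.Conv1d(c_in, c_out, kernel, dilation=dilation)

    def forward(self, x):                          # x: (batch, channels, time)
        return self.conv(F.pad(x, (self.pad, 0)))  # pad on the left only

layers = [CausalConv1d(1, 16)]
layers += [CausalConv1d(16, 16, dilation=2 ** i) for i in range(1, 4)]
stack = nn.Sequential(*layers, nn.AdaptiveAvgPool1d(1))  # one summary per channel
```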
- Low-Rank Temporal Attention-Augmented Bilinear Network for financial time-series forecasting
Deep learning models have led to significant performance improvements in many domains, including the prediction of financial time-series data.
The Temporal Attention-Augmented Bilinear network was recently proposed as an efficient and high-performing model for Limit Order Book time-series forecasting.
In this paper, we propose a low-rank tensor approximation of the model to further reduce the number of trainable parameters and increase its speed (the rank-factorization idea is sketched below).
arXiv Detail & Related papers (2021-07-05T10:15:23Z)
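The low-rank idea above can be shown on a single dense layer: factor a d_out-by-d_in weight into two thin matrices, cutting parameters from d_out*d_in to rank*(d_in+d_out). A minimal PyTorch sketch; the rank and sizes are illustrative, and the paper applies the factorization to the bilinear network's weights rather than a plain linear layer:

```python
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Dense layer whose weight is factored into two rank-r maps."""
    def __init__(self, d_in, d_out, rank=4):
        super().__init__()
        self.down = nn.Linear(d_in, rank, bias=False)  # project into rank-r space
        self.up = nn.Linear(rank, d_out)               # expand back out

    def forward(self, x):
        return self.up(self.down(x))
```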
- Stochastically forced ensemble dynamic mode decomposition for forecasting and analysis of near-periodic systems
We introduce a novel load forecasting method in which observed dynamics are modeled as a forced linear system.
We show that its use of intrinsic linear dynamics offers a number of desirable properties in terms of interpretability and parsimony.
Results are presented for a test case using load data from an electrical grid (the underlying forced-linear-system fit is sketched below).
arXiv Detail & Related papers (2020-10-08T20:25:52Z)
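The forced linear system above can be written as x[k+1] = A x[k] + B u[k] and fitted by least squares, in the style of dynamic mode decomposition with control; the stochastic forcing and ensembling the paper adds are not reproduced. A minimal NumPy sketch:

```python
import numpy as np

def fit_forced_linear(X, U):
    """Least-squares fit of x[k+1] = A @ x[k] + B @ u[k].

    X: (n_states, T) state snapshots; U: (n_inputs, T) forcing inputs.
    """
    X0, X1 = X[:, :-1], X[:, 1:]
    G = np.vstack([X0, U[:, :-1]])           # stacked [x; u] regressors
    AB = X1 @ np.linalg.pinv(G)              # [A B] via pseudoinverse
    n = X.shape[0]
    return AB[:, :n], AB[:, n:]              # A, B
```

Forecasts then roll the fitted system forward from the last observed state.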
- Transformer Hawkes Process
We propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies.
THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin.
We provide a concrete example in which THP achieves improved prediction performance for learning multiple point processes by incorporating their relational information.
arXiv Detail & Related papers (2020-02-21T13:48:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.