Multimodal Stock Price Prediction: A Case Study of the Russian Securities Market
- URL: http://arxiv.org/abs/2503.08696v1
- Date: Wed, 05 Mar 2025 21:20:32 GMT
- Title: Multimodal Stock Price Prediction: A Case Study of the Russian Securities Market
- Authors: Kasymkhan Khubiev, Mikhail Semenov
- Abstract summary: This paper addresses the problem of forecasting financial asset prices using the multimodal approach that combines candlestick time series and news flow data. A unique dataset was collected, which includes time series for 176 Russian stocks traded on the Moscow Exchange and 79,555 financial news articles in Russian. Experiments showed that incorporating textual modality reduced the MAPE value by 55%.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Classical asset price forecasting methods primarily rely on numerical data, such as price time series, trading volumes, limit order book data, and technical analysis indicators. However, the news flow plays a significant role in price formation, making the development of multimodal approaches that combine textual and numerical data for improved prediction accuracy highly relevant. This paper addresses the problem of forecasting financial asset prices using the multimodal approach that combines candlestick time series and textual news flow data. A unique dataset was collected for the study, which includes time series for 176 Russian stocks traded on the Moscow Exchange and 79,555 financial news articles in Russian. For processing textual data, pre-trained models RuBERT and Vikhr-Qwen2.5-0.5b-Instruct (a large language model) were used, while time series and vectorized text data were processed using an LSTM recurrent neural network. The experiments compared models based on a single modality (time series only) and two modalities, as well as various methods for aggregating text vector representations. Prediction quality was estimated using two key metrics: Accuracy (direction of price movement prediction: up or down) and Mean Absolute Percentage Error (MAPE), which measures the deviation of the predicted price from the true price. The experiments showed that incorporating textual modality reduced the MAPE value by 55%. The resulting multimodal dataset holds value for the further adaptation of language models in the financial sector. Future research directions include optimizing textual modality parameters, such as the time window, sentiment, and chronological order of news messages.
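The two metrics named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names `mape` and `direction_accuracy` are hypothetical, and the direction of movement is assumed to be measured against the previous closing price, with an unchanged price counted as "not up". The abstract does not specify either detail.

```python
from typing import Sequence


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent.

    Assumes no true price is zero (prices are positive in practice).
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def direction_accuracy(
    prev_close: Sequence[float],
    y_true: Sequence[float],
    y_pred: Sequence[float],
) -> float:
    """Share of steps where the predicted direction matches the actual one.

    Assumption: "up" means strictly above the previous close; everything
    else is treated as "down".
    """
    if not (len(prev_close) == len(y_true) == len(y_pred)) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        (t > c) == (p > c) for c, t, p in zip(prev_close, y_true, y_pred)
    )
    return hits / len(y_true)


if __name__ == "__main__":
    # Made-up prices, purely for illustration.
    prev_close = [100.0, 102.0, 101.0, 105.0]
    actual = [102.0, 101.0, 105.0, 104.0]
    predicted = [101.0, 103.0, 104.0, 103.0]
    print(f"MAPE: {mape(actual, predicted):.2f}%")
    print(f"Direction accuracy: {direction_accuracy(prev_close, actual, predicted):.2f}")
```

The two metrics can disagree: a forecast may sit close to the true price, giving a low MAPE, while landing on the wrong side of the previous close, which is why both are reported.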
Related papers
- Forecasting Cryptocurrency Prices using Contextual ES-adRNN with Exogenous Variables [3.0108936184913295]
We introduce a new approach to multivariate forecasting of cryptocurrency prices using a hybrid contextual model that combines exponential smoothing (ES) and a recurrent neural network (RNN).
The model generates both point daily forecasts and predictive intervals for one-day, one-week and four-week horizons.
We apply our model to forecast prices of 15 cryptocurrencies based on 17 input variables and compare its performance with that of comparative models, including both statistical and ML ones.
arXiv Detail & Related papers (2025-04-11T20:00:03Z)
- FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting [58.70072722290475]
Financial time series (FinTS) record the behavior of human-brain-augmented decision-making. FinTSB is a comprehensive and practical benchmark for financial time series forecasting.
arXiv Detail & Related papers (2025-02-26T05:19:16Z)
- TimeCAP: Learning to Contextualize, Augment, and Predict Time Series Events with Large Language Model Agents [52.13094810313054]
TimeCAP is a time-series processing framework that creatively employs Large Language Models (LLMs) as contextualizers of time series data. TimeCAP incorporates two independent LLM agents: one generates a textual summary capturing the context of the time series, while the other uses this enriched summary to make more informed predictions. Experimental results on real-world datasets demonstrate that TimeCAP outperforms state-of-the-art methods for time series event prediction.
arXiv Detail & Related papers (2025-02-17T04:17:27Z)
- Multimodal Stock Price Prediction [0.0]
It has become increasingly critical to carefully integrate diverse data sources with machine learning for accurate stock price prediction. This paper explores a multimodal machine learning approach for stock price prediction by combining data from diverse sources, including traditional financial metrics, tweets, and news articles.
arXiv Detail & Related papers (2025-01-23T16:38:46Z)
- Context is Key: A Benchmark for Forecasting with Essential Textual Information [87.3175915185287]
"Context is Key" (CiK) is a forecasting benchmark that pairs numerical data with diverse types of carefully crafted textual context. We evaluate a range of approaches, including statistical models, time series foundation models, and LLM-based forecasters. We propose a simple yet effective LLM prompting method that outperforms all other tested methods on our benchmark.
arXiv Detail & Related papers (2024-10-24T17:56:08Z)
- StockTime: A Time Series Specialized Large Language Model Architecture for Stock Price Prediction [13.52020491768311]
We introduce StockTime, a novel LLM-based architecture designed specifically for stock price time series data.
Unlike recent FinLLMs, StockTime is specifically designed for stock price time series data.
By fusing this multimodal data, StockTime effectively predicts stock prices across arbitrary look-back periods.
arXiv Detail & Related papers (2024-08-25T00:50:33Z)
- Text2TimeSeries: Enhancing Financial Forecasting through Time Series Prediction Updates with Event-Driven Insights from Large Language Models [9.991327369572819]
We propose a collaborative modeling framework that incorporates textual information about relevant events for predictions.
We leverage the intuition of large language models about future changes to update real number time series predictions.
arXiv Detail & Related papers (2024-07-04T07:21:38Z)
- Natural Language Processing and Multimodal Stock Price Prediction [0.8702432681310401]
This paper utilizes stock percentage change as training data, in contrast to the traditional use of raw currency values.
The choice of percentage change aims to provide models with context regarding the significance of price fluctuations.
The study employs specialized BERT natural language processing models to predict stock price trends.
arXiv Detail & Related papers (2024-01-03T01:21:30Z)
- Contrastive Difference Predictive Coding [79.74052624853303]
We introduce a temporal difference version of contrastive predictive coding that stitches together pieces of different time series data to decrease the amount of data required to learn predictions of future events.
We apply this representation learning method to derive an off-policy algorithm for goal-conditioned RL.
arXiv Detail & Related papers (2023-10-31T03:16:32Z)
- Diffusion Variational Autoencoder for Tackling Stochasticity in Multi-Step Regression Stock Price Prediction [54.21695754082441]
Multi-step stock price prediction over a long-term horizon is crucial for forecasting its volatility.
Current solutions to multi-step stock price prediction are mostly designed for single-step, classification-based predictions.
We combine a deep hierarchical variational-autoencoder (VAE) and diffusion probabilistic techniques to do seq2seq stock prediction.
Our model is shown to outperform state-of-the-art solutions in terms of its prediction accuracy and variance.
arXiv Detail & Related papers (2023-08-18T16:21:15Z)
- Few-shot learning through contextual data augmentation [74.20290390065475]
Machine translation models need to adapt to new data to maintain their performance over time.
We show that adaptation on the scale of one to five examples is possible.
Our model reports better accuracy scores than a reference system trained with on average 313 parallel examples.
arXiv Detail & Related papers (2021-03-31T09:05:43Z)