A three-step machine learning approach to predict market bubbles with financial news
- URL: http://arxiv.org/abs/2510.16636v1
- Date: Sat, 18 Oct 2025 20:31:31 GMT
- Title: A three-step machine learning approach to predict market bubbles with financial news
- Authors: Abraham Atsiwo
- Abstract summary: This study presents a three-step machine learning framework to predict bubbles in the S&P 500 stock market by combining financial news sentiment with macroeconomic indicators. In the first step, bubble periods in the S&P 500 index are identified using a right-tailed unit root test, a widely recognized real-time bubble detection method. The second step extracts sentiment features from large-scale financial news articles using natural language processing (NLP) techniques. In the final step, ensemble learning methods are applied to predict bubble occurrences based on high sentiment-based and macroeconomic predictors.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This study presents a three-step machine learning framework to predict bubbles in the S&P 500 stock market by combining financial news sentiment with macroeconomic indicators. Building on traditional econometric approaches, the proposed approach predicts bubble formation by integrating textual and quantitative data sources. In the first step, bubble periods in the S&P 500 index are identified using a right-tailed unit root test, a widely recognized real-time bubble detection method. The second step extracts sentiment features from large-scale financial news articles using natural language processing (NLP) techniques, which capture investors' expectations and behavioral patterns. In the final step, ensemble learning methods are applied to predict bubble occurrences based on high sentiment-based and macroeconomic predictors. Model performance is evaluated through k-fold cross-validation and compared against benchmark machine learning algorithms. Empirical results indicate that the proposed three-step ensemble approach significantly improves predictive accuracy and robustness, providing valuable early warning insights for investors, regulators, and policymakers in mitigating systemic financial risks.
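The first step described in the abstract relies on a right-tailed unit root test. The abstract does not say which variant is used, so the following is only a minimal sketch of one common member of that family, a sup ADF statistic computed over expanding windows, under simplifying assumptions: no lagged differences in the regression, simulated data instead of the S&P 500, and no critical values. The function names (`adf_tstat`, `sup_adf`, `simulate`) and all parameter values are hypothetical and chosen for illustration.

```python
import math
import random


def adf_tstat(y):
    """t-statistic on b in the regression dy_t = a + b * y_{t-1} + e_t.

    Simplified on purpose: no lagged differences are included.
    """
    x = y[:-1]                                      # lagged levels y_{t-1}
    d = [y[i + 1] - y[i] for i in range(len(y) - 1)]  # first differences
    n = len(x)
    x_bar = sum(x) / n
    d_bar = sum(d) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxd = sum((xi - x_bar) * (di - d_bar) for xi, di in zip(x, d))
    b = sxd / sxx                                   # OLS slope
    a = d_bar - b * x_bar                           # OLS intercept
    rss = sum((di - a - b * xi) ** 2 for xi, di in zip(x, d))
    se_b = math.sqrt(rss / (n - 2) / sxx)           # standard error of b
    return b / se_b


def sup_adf(y, min_window=20):
    """Largest ADF t-statistic over expanding windows that start at y[0]."""
    return max(adf_tstat(y[:end]) for end in range(min_window, len(y) + 1))


def simulate(rho, n=100, start=100.0, seed=0):
    """AR(1) path y_t = rho * y_{t-1} + noise; rho > 1 is explosive."""
    rng = random.Random(seed)
    y = [start]
    for _ in range(n - 1):
        y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
    return y


random_walk = simulate(rho=1.0)    # unit root, no bubble
explosive = simulate(rho=1.05)     # mildly explosive, bubble-like

stat_rw = sup_adf(random_walk)
stat_explosive = sup_adf(explosive)
print("sup ADF, random walk:", round(stat_rw, 2))
print("sup ADF, explosive:  ", round(stat_explosive, 2))
```

The test is right-tailed because evidence of a bubble comes from large positive values of the statistic, which point to an explosive root rather than a stationary one. In practice the statistic would be compared against simulated critical values, and the windows in which it exceeds them would supply the bubble labels for the later prediction step; neither of those parts is shown here.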
Related papers
- $φ$-DPO: Fairness Direct Preference Optimization Approach to Continual Learning in Large Multimodal Models [58.217707070069885]
This paper presents a novel Fairness Direct Preference Optimization (FaiDPO or $φ$-DPO) framework for continual learning in LMMs. We first propose a new continual learning paradigm based on Direct Preference Optimization (DPO) to mitigate catastrophic forgetting by aligning learning with pairwise preference signals. Extensive experiments and ablation studies show the proposed $φ$-DPO achieves state-of-the-art performance across multiple benchmarks.
arXiv Detail & Related papers (2026-02-26T04:14:33Z)
- Forecasting Future Language: Context Design for Mention Markets [81.25011140991566]
We study how input context should be designed to support accurate prediction in mention markets. We find three insights: (1) richer context consistently improves forecasting performance; (2) market-conditioned prompting (MCP), which treats the market probability as a prior and updates it using textual evidence, yields better-calibrated forecasts; and (3) a mixture of the market probability and MCP (MixMCP) outperforms the market baseline.
arXiv Detail & Related papers (2026-02-04T12:43:31Z)
- Enhancing Forex Forecasting Accuracy: The Impact of Hybrid Variable Sets in Cognitive Algorithmic Trading Systems [0.0]
This paper presents the implementation of an advanced artificial intelligence-based algorithmic trading system specifically designed for the EUR-USD pair. The methodological approach centers on integrating a holistic set of input features. The performance of the resulting algorithm is evaluated using standard machine learning metrics.
arXiv Detail & Related papers (2025-11-20T18:58:22Z)
- Probabilistic Forecasting Cryptocurrencies Volatility: From Point to Quantile Forecasts [1.8352113484137627]
This paper introduces probabilistic forecasting methods that leverage point forecasts from a wide range of base models. To the best of our knowledge, this is the first study in the literature to propose and systematically evaluate probabilistic forecasts of variance in cryptocurrency markets. Our empirical results for Bitcoin demonstrate that the Quantile Estimation through Residual Simulation (QRS) method consistently outperforms more sophisticated alternatives.
arXiv Detail & Related papers (2025-08-21T18:42:11Z)
- FinTSB: A Comprehensive and Practical Benchmark for Financial Time Series Forecasting [58.70072722290475]
Financial time series (FinTS) record the behavior of human-brain-augmented decision-making. FinTSB is a comprehensive and practical benchmark for financial time series forecasting.
arXiv Detail & Related papers (2025-02-26T05:19:16Z)
- Harnessing Earnings Reports for Stock Predictions: A QLoRA-Enhanced LLM Approach [6.112119533910774]
This paper introduces an advanced approach by employing Large Language Models (LLMs) instruction fine-tuned with a novel combination of instruction-based techniques and quantized low-rank adaptation (QLoRA) compression.
Our methodology integrates 'base factors', such as financial metric growth and earnings transcripts, with 'external factors', including recent market indices performances and analyst grades, to create a rich, supervised dataset.
This study not only demonstrates the power of integrating cutting-edge AI with fine-tuned financial data but also paves the way for future research in enhancing AI-driven financial analysis tools.
arXiv Detail & Related papers (2024-08-13T04:53:31Z)
- Diffusion Variational Autoencoder for Tackling Stochasticity in Multi-Step Regression Stock Price Prediction [54.21695754082441]
Multi-step stock price prediction over a long-term horizon is crucial for forecasting its volatility.
Current solutions to multi-step stock price prediction are mostly designed for single-step, classification-based predictions.
We combine a deep hierarchical variational-autoencoder (VAE) and diffusion probabilistic techniques to do seq2seq stock prediction.
Our model is shown to outperform state-of-the-art solutions in terms of its prediction accuracy and variance.
arXiv Detail & Related papers (2023-08-18T16:21:15Z)
- Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines.
We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z)
- Deep learning based Chinese text sentiment mining and stock market correlation research [6.000327333763521]
We explore how to crawl financial forum data such as stock bars and combine them with deep learning models for sentiment analysis.
In this paper, we will use the BERT model to train against the financial corpus and predict the SZSE Component Index.
The obtained sentiment features will be able to reflect the fluctuations in the stock market and help to improve the prediction accuracy effectively.
arXiv Detail & Related papers (2022-05-10T08:35:33Z)
- Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics in Limit-Order Book Markets [84.90242084523565]
Traditional time-series econometric methods often appear incapable of capturing the true complexity of the multi-level interactions driving the price dynamics.
By adopting a state-of-the-art second-order optimization algorithm, we train a Bayesian bilinear neural network with temporal attention.
By addressing the use of predictive distributions to analyze errors and uncertainties associated with the estimated parameters and model forecasts, we thoroughly compare our Bayesian model with traditional ML alternatives.
arXiv Detail & Related papers (2022-03-07T18:59:54Z)
- Interpretable ML-driven Strategy for Automated Trading Pattern Extraction [2.7910505923792646]
We propose a volume-based data pre-processing method for financial time series analysis.
We use a statistical approach for assessing the performance of the method.
Our analysis shows that the proposed method allows successful classification of the financial time series patterns.
arXiv Detail & Related papers (2021-03-23T09:55:46Z)
- A Sentiment Analysis Approach to the Prediction of Market Volatility [62.997667081978825]
We have explored the relationship between sentiment extracted from financial news and tweets and FTSE100 movements.
The sentiment captured from news headlines could be used as a signal to predict market returns; the same does not apply for volatility.
We developed an accurate classifier for the prediction of market volatility in response to the arrival of new information.
arXiv Detail & Related papers (2020-12-10T01:15:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.