News Sentiment Embeddings for Stock Price Forecasting
- URL: http://arxiv.org/abs/2507.01970v1
- Date: Thu, 19 Jun 2025 17:30:07 GMT
- Title: News Sentiment Embeddings for Stock Price Forecasting
- Authors: Ayaan Qayyum
- Abstract summary: The key focus is to use news headlines from the Wall Street Journal to predict the movement of stock prices on a daily timescale. Preliminary results show that headline data embeddings improve stock price prediction by at least 40% compared to training and optimizing a machine learning system without headline data embeddings.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper discusses how headline data can be used to predict stock prices. The stock in question is the SPDR S&P 500 ETF Trust, also known as SPY, which tracks the performance of the 500 largest publicly traded corporations in the United States. A key focus is to use news headlines from the Wall Street Journal (WSJ) to predict the movement of stock prices on a daily timescale. OpenAI-based text embedding models create vector encodings of each headline, and principal component analysis (PCA) extracts the key features. The challenge of this work is to capture the time-dependent and time-independent, nuanced impacts of news on stock prices while handling potential lag effects and market noise. Financial and economic data were collected to improve model performance; such sources include the U.S. Dollar Index (DXY) and Treasury Interest Yields. Over 390 machine-learning inference models were trained. The preliminary results show that headline data embeddings improve stock price prediction by at least 40% compared to training and optimizing a machine learning system without headline data embeddings.
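The PCA step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the random vectors stand in for real headline embeddings, and the function name `pca_reduce`, the 64-dimensional input, and the choice of 8 components are all assumptions made for the example.

```python
import numpy as np


def pca_reduce(embeddings, n_components):
    """Project row vectors onto their top principal components (hypothetical helper)."""
    # Center each dimension so the components describe variance, not the mean.
    centered = embeddings - embeddings.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Fraction of total variance carried by each direction.
    explained = singular_values**2 / np.sum(singular_values**2)
    return centered @ components.T, explained[:n_components]


rng = np.random.default_rng(0)
# Synthetic stand-in for headline embeddings: 200 headlines x 64 dimensions.
headline_embeddings = rng.normal(size=(200, 64))

features, explained_ratio = pca_reduce(headline_embeddings, n_components=8)
print(features.shape)  # (200, 8)
```

In a pipeline like the one described, the reduced `features` would be joined with the daily financial series (for example DXY and Treasury yields) before model training; how the paper aligns headlines to trading days is not specified here.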
Related papers
- Multimodal Stock Price Prediction [0.0]
It has become increasingly critical to carefully integrate diverse data sources with machine learning for accurate stock price prediction.
This paper explores a multimodal machine learning approach for stock price prediction by combining data from diverse sources, including traditional financial metrics, tweets, and news articles.
arXiv Detail & Related papers (2025-01-23T16:38:46Z) - AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework [48.3060010653088]
We release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data.
We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task.
arXiv Detail & Related papers (2024-03-19T09:45:33Z) - American Option Pricing using Self-Attention GRU and Shapley Value
Interpretation [0.0]
We propose a machine learning method for forecasting the prices of SPY (ETF) option based on gated recurrent unit (GRU) and self-attention mechanism.
We built four different machine learning models, including multilayer perceptron (MLP), long short-term memory (LSTM), self-attention LSTM, and self-attention GRU.
arXiv Detail & Related papers (2023-10-19T06:05:46Z) - Diffusion Variational Autoencoder for Tackling Stochasticity in
Multi-Step Regression Stock Price Prediction [54.21695754082441]
Multi-step stock price prediction over a long-term horizon is crucial for forecasting its volatility.
Current solutions to multi-step stock price prediction are mostly designed for single-step, classification-based predictions.
We combine a deep hierarchical variational-autoencoder (VAE) and diffusion probabilistic techniques to do seq2seq stock prediction.
Our model is shown to outperform state-of-the-art solutions in terms of its prediction accuracy and variance.
arXiv Detail & Related papers (2023-08-18T16:21:15Z) - Effects of Daily News Sentiment on Stock Price Forecasting [0.5242869847419834]
This paper presents a robust data collection and preprocessing framework to create a news database for a timeline of around 3.7 years.
We capture the stock price information for this timeline and create multiple time series that include the sentiment scores from various sections of the article.
Based on this, we fit several LSTM models to forecast the stock prices, with and without the sentiment scores as features, and compare their performance.
arXiv Detail & Related papers (2023-08-02T06:42:39Z) - Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines.
We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z) - S&P 500 Stock Price Prediction Using Technical, Fundamental and Text Data [5.420890357732937]
We summarized both common and novel predictive models used for stock price prediction.
We combined them with technical indices, fundamental characteristics and text-based sentiment data to predict S&P stock prices.
We achieved 66.18% accuracy in S&P 500 index directional prediction and 62.09% accuracy in individual stock directional prediction.
arXiv Detail & Related papers (2021-08-24T16:18:52Z) - Graph-Based Learning for Stock Movement Prediction with Textual and Relational Data [0.0]
We propose a new stock movement prediction framework: Multi-Graph Recurrent Network for Stock Forecasting (MGRN).
This architecture combines the textual sentiment from financial news with multiple kinds of relational information extracted from other financial data.
Through an accuracy test and a trading simulation on the stocks in the STOXX Europe 600 index, we demonstrate that our model outperforms other benchmarks.
arXiv Detail & Related papers (2021-07-22T21:57:18Z) - REST: Relational Event-driven Stock Trend Forecasting [76.08435590771357]
We propose a relational event-driven stock trend forecasting (REST) framework, which can address the shortcomings of existing methods.
To remedy the first shortcoming, we propose to model the stock context and learn the effect of event information on the stocks under different contexts.
To address the second shortcoming, we construct a stock graph and design a new propagation layer to propagate the effect of event information from related stocks.
arXiv Detail & Related papers (2021-02-15T07:22:09Z) - A Sentiment Analysis Approach to the Prediction of Market Volatility [62.997667081978825]
We have explored the relationship between sentiment extracted from financial news and tweets and FTSE100 movements.
The sentiment captured from news headlines could be used as a signal to predict market returns; the same does not apply for volatility.
We developed an accurate classifier for the prediction of market volatility in response to the arrival of new information.
arXiv Detail & Related papers (2020-12-10T01:15:48Z) - Evaluating data augmentation for financial time series classification [85.38479579398525]
We evaluate several augmentation methods applied to stock datasets using two state-of-the-art deep learning models.
For a relatively small dataset, augmentation methods achieve up to 400% improvement in risk-adjusted return performance.
For a larger stock dataset, augmentation methods achieve up to 40% improvement.
arXiv Detail & Related papers (2020-10-28T17:53:57Z)