Predicting Day-Ahead Stock Returns using Search Engine Query Volumes: An
Application of Gradient Boosted Decision Trees to the S&P 100
- URL: http://arxiv.org/abs/2205.15853v2
- Date: Wed, 1 Jun 2022 09:16:50 GMT
- Title: Predicting Day-Ahead Stock Returns using Search Engine Query Volumes: An
Application of Gradient Boosted Decision Trees to the S&P 100
- Authors: Christopher Bockel-Rickermann
- Abstract summary: This paper asks whether the behavioral information contained in internet usage data can be used to predict future returns of stocks on financial capital markets.
It implements gradient boosted decision trees to learn relationships between abnormal returns of stocks within the S&P 100 index and lagged predictors derived from historical financial data.
It gives guidance on how to use and transform data on internet usage behavior for financial and economic modeling and forecasting.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The internet has changed the way we live, work and take decisions. As it is
the major modern resource for research, detailed data on internet usage
exhibits vast amounts of behavioral information. This paper aims to answer the
question whether this information can be used to predict future returns
of stocks on financial capital markets. In an empirical analysis it implements
gradient boosted decision trees to learn relationships between abnormal returns
of stocks within the S&P 100 index and lagged predictors derived from
historical financial data, as well as search term query volumes on the internet
search engine Google. Models predict the occurrence of day-ahead stock returns
in excess of the index median. On a time frame from 2005 to 2017, all disparate
datasets exhibit valuable information. Evaluated models have average areas
under the receiver operating characteristic curve (AUC) between 54.2% and 56.7%, clearly
indicating a classification better than random guessing. Implementing a simple
statistical arbitrage strategy, models are used to create daily trading
portfolios of ten stocks and result in annual performances of more than 57%
before transaction costs. With ensembles of different data sets topping the
performance ranking, the results further question the weak form and semi-strong
form efficiency of modern financial capital markets. Even though transaction
costs are not included, the approach adds to the existing literature. It gives
guidance on how to use and transform data on internet usage behavior for
financial and economic modeling and forecasting.
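The target and evaluation described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the tickers, returns, and scores are made-up toy values, the function names are hypothetical, and the gradient boosted model itself is omitted (the `predicted` scores stand in for its output). Selecting the top-scored stocks as a long-only portfolio is also an assumption, since the abstract does not spell out how the ten-stock portfolios are built.

```python
from statistics import median


def median_excess_labels(returns):
    """Label a stock 1 if its return exceeds the cross-sectional median, else 0."""
    med = median(returns.values())
    return {ticker: int(r > med) for ticker, r in returns.items()}


def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_k(scores, k):
    """Return the k tickers with the highest predicted scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy next-day returns for one trading day (hypothetical values).
next_day_returns = {"AAA": 0.012, "BBB": -0.004, "CCC": 0.003, "DDD": -0.010}
# Hypothetical model scores: estimated probability of beating the median.
predicted = {"AAA": 0.61, "BBB": 0.55, "CCC": 0.48, "DDD": 0.40}

labels = median_excess_labels(next_day_returns)
tickers = list(next_day_returns)
auc = roc_auc([labels[t] for t in tickers], [predicted[t] for t in tickers])
portfolio = top_k(predicted, 2)  # the paper uses ten stocks; two here for brevity
```

On this toy day the median return is -0.05%, so AAA and CCC are the positive class, and three of the four positive/negative pairs are ranked correctly. An AUC of 50% corresponds to random ranking, which is the baseline the reported 54.2% to 56.7% should be read against.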
Related papers
- AI in Investment Analysis: LLMs for Equity Stock Ratings [0.2916558661202724]
This paper explores the application of Large Language Models (LLMs) to generate multi-horizon stock ratings.
Our study addresses these issues by leveraging LLMs to improve the accuracy and consistency of stock ratings.
Our results show that our benchmark method outperforms traditional stock rating methods when assessed by forward returns.
arXiv Detail & Related papers (2024-10-30T15:06:57Z)
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improved generalization, where the excess risk decreases as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- Stock Market Sentiment Classification and Backtesting via Fine-tuned BERT [0.0]
Starting from the theory of emotion and taking East Money as an example, this paper crawls user comment title data from the corresponding stock bar.
Based on the above model, the crawled user comment data is labeled with emotional polarity, and the obtained label information is combined with the Alpha191 model.
A regression model is used to predict the average price change over the next five days, and the prediction serves as a signal to guide automatic trading.
arXiv Detail & Related papers (2023-09-21T11:26:36Z)
- HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE [113.47287249524008]
It is still an open question to build a factor model that can conduct stock prediction in an online and adaptive setting.
We propose the first deep learning based online and adaptive factor model, HireVAE, at the core of which is a hierarchical latent space that embeds the relationship between the market situation and stock-wise latent factors.
Across four commonly used real stock market benchmarks, the proposed HireVAE demonstrates superior performance in terms of active returns over previous methods.
arXiv Detail & Related papers (2023-06-05T12:58:13Z)
- DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions [53.37679435230207]
We propose DeepVol, a model based on Dilated Causal Convolutions that uses high-frequency data to forecast day-ahead volatility.
Our empirical results suggest that the proposed deep learning-based approach effectively learns global features from high-frequency data.
arXiv Detail & Related papers (2022-09-23T16:13:47Z)
- Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
arXiv Detail & Related papers (2022-07-23T18:54:10Z)
- Predicting Stock Price Movement after Disclosure of Corporate Annual Reports: A Case Study of 2021 China CSI 300 Stocks [4.5885930040346565]
This work studies the prediction of the stock price trend on the second day after the disclosure of companies' annual reports.
We use a variety of models, including decision trees, logistic regression, random forests, neural networks, and prototypical networks.
We conclude that according to the financial indicators based on the just-released annual report of the company, the predictability of the stock price movement is weak.
arXiv Detail & Related papers (2022-06-25T01:54:53Z)
- S&P 500 Stock Price Prediction Using Technical, Fundamental and Text Data [5.420890357732937]
We summarized both common and novel predictive models used for stock price prediction.
We combined them with technical indices, fundamental characteristics and text-based sentiment data to predict S&P stock prices.
A 66.18% accuracy in S&P 500 index directional prediction and 62.09% accuracy in individual stock directional prediction was achieved.
arXiv Detail & Related papers (2021-08-24T16:18:52Z)
- Feature Learning for Stock Price Prediction Shows a Significant Role of Analyst Rating [0.38073142980733]
A set of 5 technical indicators and 23 fundamental indicators was identified to establish the possibility of generating excess returns on the stock market.
From any given day, we were able to predict the direction of change in price by 1% up to 10 days in the future.
The predictions had an overall accuracy of 83.62% with a precision of 85% for buy signals and a recall of 100% for sell signals.
arXiv Detail & Related papers (2021-03-13T03:56:29Z)
- Evaluating data augmentation for financial time series classification [85.38479579398525]
We evaluate several augmentation methods applied to stocks datasets using two state-of-the-art deep learning models.
For a relatively small dataset, augmentation methods achieve up to 400% improvement in risk-adjusted return performance.
For a larger stock dataset, augmentation methods achieve up to 40% improvement.
arXiv Detail & Related papers (2020-10-28T17:53:57Z)
- Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications [110.54266632357673]
We present the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models.
Our results, validated across two countries, show that these new sources of data are particularly useful for predicting financial behavior in low-wealth and young individuals.
arXiv Detail & Related papers (2020-05-09T01:32:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.