Enhancing Few-Shot Stock Trend Prediction with Large Language Models
- URL: http://arxiv.org/abs/2407.09003v1
- Date: Fri, 12 Jul 2024 05:43:11 GMT
- Title: Enhancing Few-Shot Stock Trend Prediction with Large Language Models
- Authors: Yiqi Deng, Xingwei He, Jiahao Hu, Siu-Ming Yiu
- Abstract summary: Existing methods mostly focus on predicting stock trends with supervised models trained on extensive annotated data.
We propose using Large Language Models (LLMs) in a few-shot setting to overcome the scarcity of labeled data.
Our method achieves 66.59% accuracy in S&P 500, 62.17% in CSI-100, and 61.17% in HK stock prediction.
- Score: 9.439423168290011
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The goal of stock trend prediction is to forecast future market movements for informed investment decisions. Existing methods mostly focus on predicting stock trends with supervised models trained on extensive annotated data. However, human annotation can be resource-intensive and the annotated data are not readily available. Inspired by the impressive few-shot capability of Large Language Models (LLMs), we propose using LLMs in a few-shot setting to overcome the scarcity of labeled data and make prediction more feasible to investors. Previous works typically merge multiple financial news for predicting stock trends, causing two significant problems when using LLMs: (1) Merged news contains noise, and (2) it may exceed LLMs' input limits, leading to performance degradation. To overcome these issues, we propose a two-step method 'denoising-then-voting'. Specifically, we introduce an `Irrelevant' category, and predict stock trends for individual news instead of merged news. Then we aggregate these predictions using majority voting. The proposed method offers two advantages: (1) Classifying noisy news as irrelevant removes its impact on the final prediction. (2) Predicting for individual news mitigates LLMs' input length limits. Our method achieves 66.59% accuracy in S&P 500, 62.17% in CSI-100, and 61.17% in HK stock prediction, outperforming the standard few-shot counterparts by around 7%, 4%, and 4%. Furthermore, our proposed method performs on par with state-of-the-art supervised methods.
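The aggregation step of the paper's 'denoising-then-voting' method can be sketched as follows. This is a minimal illustration, not the authors' implementation: the label names and the function `denoise_then_vote` are assumptions, and the per-news labels would in practice come from few-shot LLM classification of each headline.

```python
from collections import Counter

def denoise_then_vote(per_news_labels):
    """Aggregate per-news LLM predictions into one stock-trend label.

    Labels are assumed to be 'rise', 'fall', or 'irrelevant'. The
    'irrelevant' class performs the denoising: those predictions are
    dropped before the majority vote, so noisy news cannot sway the
    final prediction.
    """
    votes = [label for label in per_news_labels if label != "irrelevant"]
    if not votes:
        return "irrelevant"  # no informative news for this stock/day
    return Counter(votes).most_common(1)[0][0]

# Example: two informative headlines outvote one, noise is discarded
print(denoise_then_vote(["rise", "irrelevant", "rise", "fall", "irrelevant"]))
# -> 'rise'
```

Because each headline is classified on its own, no single prompt has to fit all of a day's news within the LLM's context window.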
Related papers
- An End-to-End Structure with Novel Position Mechanism and Improved EMD for Stock Forecasting [1.7044651160538948]
Existing research mostly focuses on individual stock information but ignores stock market information and high noise in stock data.
We propose a novel method using the attention mechanism in which both stock market information and individual stock information are considered.
arXiv Detail & Related papers (2024-03-25T15:23:22Z)
- Wisdom of the Silicon Crowd: LLM Ensemble Prediction Capabilities Rival Human Crowd Accuracy [1.999925939110439]
We use an ensemble approach consisting of a crowd of twelve large language models (LLMs)
We compare the aggregated LLM predictions on 31 binary questions to that of a crowd of human forecasters from a three-month forecasting tournament.
We find that both models' forecasting accuracy benefits from exposure to the median human prediction as information.
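A crowd-of-models aggregation like the one described above can be sketched with a robust summary statistic. The function name and the probability values are illustrative assumptions; the median is one common choice because it resists outlier models.

```python
import statistics

def aggregate_llm_forecasts(probabilities):
    """Aggregate a crowd of per-model probabilities for one binary question.

    The median is robust to a few badly calibrated models; the mean is a
    common alternative when all models are trusted equally.
    """
    return statistics.median(probabilities)

# Hypothetical probabilities from twelve models for one binary question
crowd = [0.62, 0.55, 0.71, 0.48, 0.66, 0.59, 0.80, 0.52, 0.64, 0.58, 0.70, 0.61]
aggregate = aggregate_llm_forecasts(crowd)
```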
arXiv Detail & Related papers (2024-02-29T17:27:59Z)
- Forecasting Cryptocurrency Prices Using Deep Learning: Integrating Financial, Blockchain, and Text Data [3.8443430569753025]
We analyse the influence of public sentiment on cryptocurrency valuations using advanced deep learning NLP methods.
We compare the performance of various ML models, both with and without NLP data integration.
We discover that pre-trained models, such as Twitter-RoBERTa and BART MNLI, are highly effective in capturing market sentiment.
arXiv Detail & Related papers (2023-11-23T16:14:44Z)
- Adaptation with Self-Evaluation to Improve Selective Prediction in LLMs [56.526095828316386]
We propose a novel framework for adaptation with self-evaluation to improve the selective prediction performance of large language models (LLMs)
We evaluate our method on a variety of question-answering (QA) datasets and show that it outperforms state-of-the-art selective prediction methods.
arXiv Detail & Related papers (2023-10-18T03:34:59Z)
- Diffusion Variational Autoencoder for Tackling Stochasticity in Multi-Step Regression Stock Price Prediction [54.21695754082441]
Multi-step stock price prediction over a long-term horizon is crucial for forecasting its volatility.
Current solutions to multi-step stock price prediction are mostly designed for single-step, classification-based predictions.
We combine a deep hierarchical variational-autoencoder (VAE) and diffusion probabilistic techniques to do seq2seq stock prediction.
Our model is shown to outperform state-of-the-art solutions in terms of its prediction accuracy and variance.
arXiv Detail & Related papers (2023-08-18T16:21:15Z)
- Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [57.70351255180495]
We use ChatGPT to assess whether each headline is good, bad, or neutral for firms' stock prices.
We find that ChatGPT outperforms traditional sentiment analysis methods.
Long-short strategies based on ChatGPT-4 deliver the highest Sharpe ratio.
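A long-short construction of the kind this entry describes can be sketched as below. Everything here is a hedged illustration: the ticker symbols, scores, and the function `long_short_from_sentiment` are invented for the example, and the sentiment values stand in for per-headline LLM labels (good = +1, neutral = 0, bad = -1) averaged per firm.

```python
def long_short_from_sentiment(scores, top_frac=0.1):
    """Sketch: build an equal-weight long-short book from sentiment scores.

    `scores` maps ticker -> average headline sentiment in [-1, 1].
    Long the most positive fraction of names, short the most negative.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, int(len(ranked) * top_frac))
    longs, shorts = ranked[:k], ranked[-k:]
    w = 1.0 / k  # equal weight within each leg
    return {**{t: w for t in longs}, **{t: -w for t in shorts}}

# Hypothetical per-firm sentiment averages
scores = {"AAA": 0.8, "BBB": 0.1, "CCC": -0.6, "DDD": 0.3, "EEE": -0.9}
weights = long_short_from_sentiment(scores, top_frac=0.2)
```

The resulting weights are dollar-neutral by construction: the long and short legs sum to zero.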
arXiv Detail & Related papers (2023-04-15T19:22:37Z)
- nanoLM: an Affordable LLM Pre-training Benchmark via Accurate Loss Prediction across Scales [65.01417261415833]
We present an approach to predict the pre-training loss based on our observations that Maximal Update Parametrization (muP) enables accurate fitting of scaling laws.
With around 14% of the one-time pre-training cost, we can accurately forecast the loss for models up to 52B.
Our goal with nanoLM is to empower researchers with limited resources to reach meaningful conclusions on large models.
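Extrapolating loss across scales rests on fitting a scaling law from small-model runs. A minimal sketch, not the nanoLM method itself: here the law is assumed to take the simple power-law form L = a * N^(-b) and is fit by least squares in log-log space on synthetic, noiseless data.

```python
import math

def fit_power_law(sizes, losses):
    """Fit loss ~ a * N^(-b) by ordinary least squares in log-log space."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return math.exp(intercept), -slope  # a, b

# Synthetic small-model losses generated from L = 5.0 * N^(-0.1)
sizes = [1e7, 1e8, 1e9]
losses = [5.0 * n ** -0.1 for n in sizes]
a, b = fit_power_law(sizes, losses)
big_model_loss = a * (52e9) ** -b  # extrapolate to a 52B-parameter model
```

Real scaling laws typically include an irreducible-loss term and noisy measurements, which make the fit harder than this toy case.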
arXiv Detail & Related papers (2023-04-14T00:45:01Z)
- Taming Overconfident Prediction on Unlabeled Data from Hindsight [50.9088560433925]
Minimizing prediction uncertainty on unlabeled data is a key factor to achieve good performance in semi-supervised learning.
This paper proposes a dual mechanism, named ADaptive Sharpening (ADS), which first applies a soft-threshold to adaptively mask out determinate and negligible predictions.
ADS significantly improves the state-of-the-art SSL methods by making it a plug-in.
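The masking-then-sharpening idea can be illustrated loosely as follows. This is only a sketch under stated assumptions, not the ADS algorithm: the threshold rule, temperature sharpening, and the function `adaptive_sharpen` are simplified stand-ins for the paper's soft-threshold mechanism.

```python
def adaptive_sharpen(probs, threshold=0.9, temperature=0.5):
    """Sketch of mask-then-sharpen on one softmax vector.

    Predictions that are already determinate (max probability above the
    threshold) are masked out and left unchanged; the rest are sharpened
    by temperature scaling and renormalized, pushing probability mass
    toward the leading class.
    """
    if max(probs) >= threshold:
        return probs, True  # determinate: exclude from further sharpening
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled], False

# An uncertain prediction gets sharpened; a confident one is masked
sharpened, masked = adaptive_sharpen([0.6, 0.3, 0.1])
```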
arXiv Detail & Related papers (2021-12-15T15:17:02Z)
- Interpretability in Safety-Critical Financial Trading Systems [15.060749321774136]
In 2020, some of the world's most sophisticated quant hedge funds suffered losses.
We implement a gradient-based approach for precisely stress-testing how a trading model's forecasts can be manipulated.
We find our approach discovers seemingly in-sample input settings that result in large negative shifts in return distributions.
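Gradient-based stress-testing of this kind can be sketched as iterative input perturbation. A toy illustration under assumptions: the model here is a linear stand-in, gradients are estimated by finite differences to keep the sketch dependency-free, and the function names are invented for the example.

```python
def stress_test_inputs(model, x, steps=50, lr=0.1, eps=1e-5):
    """Perturb inputs by gradient descent to drive the forecast return down.

    `model` maps a feature vector to a forecast return; gradients are
    estimated numerically, so any black-box forecaster can be probed.
    """
    x = list(x)
    for _ in range(steps):
        base = model(x)
        grad = []
        for i in range(len(x)):
            bumped = list(x)
            bumped[i] += eps
            grad.append((model(bumped) - base) / eps)
        # step against the gradient: move inputs toward lower returns
        x = [xi - lr * gi for xi, gi in zip(x, grad)]
    return x

# Toy forecaster: return rises with momentum (f[0]), falls with volatility (f[1])
def toy_model(f):
    return 0.5 * f[0] - 0.3 * f[1]

adversarial = stress_test_inputs(toy_model, [1.0, 1.0])
```

Comparing `adversarial` against the starting point shows how far plausible-looking inputs can shift the forecast.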
arXiv Detail & Related papers (2021-09-24T17:05:58Z)
- REST: Relational Event-driven Stock Trend Forecasting [76.08435590771357]
We propose a relational event-driven stock trend forecasting (REST) framework, which can address the shortcoming of existing methods.
To remedy the first shortcoming, we propose to model the stock context and learn the effect of event information on the stocks under different contexts.
To address the second shortcoming, we construct a stock graph and design a new propagation layer to propagate the effect of event information from related stocks.
arXiv Detail & Related papers (2021-02-15T07:22:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.