Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow
- URL: http://arxiv.org/abs/2407.18103v2
- Date: Mon, 5 Aug 2024 11:13:57 GMT
- Title: Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow
- Authors: Tian Guo, Emmanuel Hauptmann,
- Abstract summary: Large language models (LLMs) and their fine-tuning techniques have demonstrated superior performance in various language understanding and generation tasks.
This paper explores fine-tuning LLMs for stock return forecasting with financial newsflow.
- Score: 3.3218982605811114
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large language models (LLMs) and their fine-tuning techniques have demonstrated superior performance in various language understanding and generation tasks. This paper explores fine-tuning LLMs for stock return forecasting with financial newsflow. In quantitative investing, return forecasting is fundamental for subsequent tasks like stock picking, portfolio optimization, etc. We formulate the model to include text representation and forecasting modules. We propose to compare the encoder-only and decoder-only LLMs, considering they generate text representations in distinct ways. The impact of these different representations on forecasting performance remains an open question. Meanwhile, we compare two simple methods of integrating LLMs' token-level representations into the forecasting module. The experiments on real news and investment universes reveal that: (1) aggregated representations from LLMs' token-level embeddings generally produce return predictions that enhance the performance of long-only and long-short portfolios; (2) in the relatively large investment universe, the decoder LLMs-based prediction model leads to stronger portfolios, whereas in the small universes, there are no consistent winners. Among the three LLMs studied (DeBERTa, Mistral, Llama), Mistral performs more robustly across different universes; (3) return predictions derived from LLMs' text representations are a strong signal for portfolio construction, outperforming conventional sentiment scores.
Related papers
- Flipping Knowledge Distillation: Leveraging Small Models' Expertise to Enhance LLMs in Text Matching [16.725632407644884]
We introduce a flipped knowledge distillation paradigm, where a Large Language Model learns from a Smaller Language Model.<n>Specifically, we address the architectural gap between decoder-only LLMs and smaller encoder-based models.<n> Experiments on financial and healthcare benchmarks, as well as real-world applications, confirm its effectiveness.
arXiv Detail & Related papers (2025-07-08T02:54:15Z) - Explainable Multi-modal Time Series Prediction with LLM-in-the-Loop [63.34626300024294]
TimeXL is a multi-modal prediction framework that integrates a prototype-based time series encoder.
It produces more accurate predictions and interpretable explanations.
Empirical evaluations on four real-world datasets demonstrate that TimeXL achieves up to 8.9% improvement in AUC.
arXiv Detail & Related papers (2025-03-02T20:40:53Z) - LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.
LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.
Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z) - Accelerating Multimodal Large Language Models via Dynamic Visual-Token Exit and the Empirical Findings [69.35226485836641]
Excessive use of visual tokens in existing Multimoal Large Language Models (MLLMs) often exhibits obvious redundancy and brings in prohibitively expensive computation.
We propose a simple yet effective method to improve the efficiency of MLLMs, termed dynamic visual-token exit (DyVTE)
DyVTE uses lightweight hyper-networks to perceive the text token status and decide the removal of all visual tokens after a certain layer.
arXiv Detail & Related papers (2024-11-29T11:24:23Z) - BreakGPT: Leveraging Large Language Models for Predicting Asset Price Surges [55.2480439325792]
This paper introduces BreakGPT, a novel large language model (LLM) architecture adapted specifically for time series forecasting and the prediction of sharp upward movements in asset prices.
We showcase BreakGPT as a promising solution for financial forecasting with minimal training and as a strong competitor for capturing both local and global temporal dependencies.
arXiv Detail & Related papers (2024-11-09T05:40:32Z) - A Law of Next-Token Prediction in Large Language Models [30.265295018979078]
We introduce a precise and quantitative law that governs the learning of contextualized token embeddings through intermediate layers in pre-trained large language models.
Our findings reveal that each layer contributes equally to enhancing prediction accuracy, from the lowest to the highest layer.
arXiv Detail & Related papers (2024-08-24T02:48:40Z) - SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning [70.21358720599821]
Large language models (LLMs) hold the promise of solving diverse tasks when provided with appropriate natural language prompts.
We propose SELF-GUIDE, a multi-stage mechanism in which we synthesize task-specific input-output pairs from the student LLM.
We report an absolute improvement of approximately 15% for classification tasks and 18% for generation tasks in the benchmark's metrics.
arXiv Detail & Related papers (2024-07-16T04:41:58Z) - SLMRec: Empowering Small Language Models for Sequential Recommendation [38.51895517016953]
Sequential Recommendation task involves predicting the next item a user is likely to interact with, given their past interactions.
Recent research demonstrates the great impact of LLMs on sequential recommendation systems.
Due to the huge size of LLMs, it is inefficient and impractical to apply a LLM-based model in real-world platforms.
arXiv Detail & Related papers (2024-05-28T07:12:06Z) - LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders [34.421335513040795]
Large decoder-only language models (LLMs) are the state-of-the-art models on most of today's NLP tasks and benchmarks.
We introduce LLM2Vec, a simple unsupervised approach that can transform any decoder-only LLM into a strong text encoder.
arXiv Detail & Related papers (2024-04-09T02:51:05Z) - Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities [46.02234423159257]
Large language models (LLMs) have been applied in many fields and have developed rapidly in recent years.
Recent works treat large language models as emphzero-shot time series reasoners without further fine-tuning.
Our study shows that LLMs perform well in predicting time series with clear patterns and trends, but face challenges with datasets lacking periodicity.
arXiv Detail & Related papers (2024-02-16T17:15:28Z) - Large Language Models: A Survey [69.72787936480394]
Large Language Models (LLMs) have drawn a lot of attention due to their strong performance on a wide range of natural language tasks.
LLMs' ability of general-purpose language understanding and generation is acquired by training billions of model's parameters on massive amounts of text data.
arXiv Detail & Related papers (2024-02-09T05:37:09Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z) - Integrating Stock Features and Global Information via Large Language
Models for Enhanced Stock Return Prediction [5.762650600435391]
We propose a novel framework consisting of two components to surmount the challenges of integrating Large Language Models with existing quantitative models.
We have demonstrated superior performance in Rank Information Coefficient and returns, particularly compared to models relying only on stock features in the China A-share market.
arXiv Detail & Related papers (2023-10-09T11:34:18Z) - Evaluating and Explaining Large Language Models for Code Using Syntactic
Structures [74.93762031957883]
This paper introduces ASTxplainer, an explainability method specific to Large Language Models for code.
At its core, ASTxplainer provides an automated method for aligning token predictions with AST nodes.
We perform an empirical evaluation on 12 popular LLMs for code using a curated dataset of the most popular GitHub projects.
arXiv Detail & Related papers (2023-08-07T18:50:57Z) - Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines.
We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.