Astock: A New Dataset and Automated Stock Trading based on
Stock-specific News Analyzing Model
- URL: http://arxiv.org/abs/2206.06606v1
- Date: Tue, 14 Jun 2022 05:55:23 GMT
- Title: Astock: A New Dataset and Automated Stock Trading based on
Stock-specific News Analyzing Model
- Authors: Jinan Zou, Haiyao Cao, Lingqiao Liu, Yuhao Lin, Ehsan Abbasnejad,
Javen Qinfeng Shi
- Abstract summary: We build a platform to study the NLP-aided stock auto-trading algorithms systematically.
We provide financial news for each specific stock.
We provide various stock factors for each stock.
We evaluate performance using more financially relevant metrics.
- Score: 21.05128751957895
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Natural Language Processing (NLP) demonstrates great potential to support
financial decision-making by analyzing the text from social media or news
outlets. In this work, we build a platform to study the NLP-aided stock
auto-trading algorithms systematically. In contrast to the previous work, our
platform is characterized by three features: (1) We provide financial news for
each specific stock. (2) We provide various stock factors for each stock. (3)
We evaluate performance using more financially relevant metrics. Such a design
allows us to develop and evaluate NLP-aided stock auto-trading algorithms in a
more realistic setting. In addition to designing an evaluation platform and
dataset collection, we also make a technical contribution by proposing a system
to automatically learn a good feature representation from various input
information. The key to our algorithm is a method called Semantic Role Labeling
Pooling (SRLP), which leverages Semantic Role Labeling (SRL) to create a
compact representation of each news paragraph. Based on SRLP, we further
incorporate other stock factors to make the final prediction. In addition, we
propose a self-supervised learning strategy based on SRLP to enhance the
out-of-distribution generalization performance of our system. Through our
experimental study, we show that the proposed method outperforms all the
baselines, as well as the CSI300 and XIN9 indices, in annualized rate of
return and maximum drawdown in real trading. Our Astock dataset and code are
available at
https://github.com/JinanZou/Astock.
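
The abstract names annualized rate of return and maximum drawdown as evaluation metrics. The following is a minimal sketch of how these two quantities are commonly computed from a series of per-period simple returns. It is not taken from the Astock code base: the function names, the geometric compounding convention, and the default of 252 trading periods per year are all assumptions made for illustration.

```python
from typing import Sequence


def annualized_return(returns: Sequence[float], periods_per_year: int = 252) -> float:
    """Geometric annualized return from per-period simple returns.

    periods_per_year=252 is an assumed convention, not a value from the paper.
    """
    if len(returns) == 0:
        raise ValueError("returns must not be empty")
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r  # compound each period's simple return
    return growth ** (periods_per_year / len(returns)) - 1.0


def max_drawdown(returns: Sequence[float]) -> float:
    """Largest peak-to-trough decline of the equity curve, as a positive fraction."""
    equity = 1.0  # start from one unit of capital
    peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)  # running maximum of the equity curve
        worst = max(worst, 1.0 - equity / peak)  # decline from the running peak
    return worst


if __name__ == "__main__":
    sample = [0.10, -0.50, 0.20]  # made-up returns, for illustration only
    print(annualized_return(sample, periods_per_year=3))
    print(max_drawdown(sample))
```

Under these conventions a higher annualized return is better and a smaller maximum drawdown is better; other definitions (for example, arithmetic annualization) exist and may give different numbers.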
Related papers
- AI in Investment Analysis: LLMs for Equity Stock Ratings [0.2916558661202724]
This paper explores the application of Large Language Models (LLMs) to generate multi-horizon stock ratings.
Our study addresses these issues by leveraging LLMs to improve the accuracy and consistency of stock ratings.
Our results show that our benchmark method outperforms traditional stock rating methods when assessed by forward returns.
arXiv Detail & Related papers (2024-10-30T15:06:57Z)
- TradExpert: Revolutionizing Trading with Mixture of Expert LLMs [25.243258134817054]
TradExpert is a novel framework that employs a mix of experts (MoE) approach, using four specialized LLMs.
Our experimental results demonstrate TradExpert's superior performance across all trading scenarios.
arXiv Detail & Related papers (2024-10-16T20:24:16Z)
- Harnessing Earnings Reports for Stock Predictions: A QLoRA-Enhanced LLM Approach [6.112119533910774]
This paper introduces an advanced approach by employing Large Language Models (LLMs) instruction fine-tuned with a novel combination of instruction-based techniques and quantized low-rank adaptation (QLoRA) compression.
Our methodology integrates 'base factors', such as financial metric growth and earnings transcripts, with 'external factors', including recent market indices performances and analyst grades, to create a rich, supervised dataset.
This study not only demonstrates the power of integrating cutting-edge AI with fine-tuned financial data but also paves the way for future research in enhancing AI-driven financial analysis tools.
arXiv Detail & Related papers (2024-08-13T04:53:31Z)
- LLM-Select: Feature Selection with Large Language Models [64.5099482021597]
Large language models (LLMs) are capable of selecting the most predictive features, with performance rivaling the standard tools of data science.
Our findings suggest that LLMs may be useful not only for selecting the best features for training but also for deciding which features to collect in the first place.
arXiv Detail & Related papers (2024-07-02T22:23:40Z)
- Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
- AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework [48.3060010653088]
We release AlphaFin datasets, combining traditional research datasets, real-time financial data, and handwritten chain-of-thought (CoT) data.
We then use AlphaFin datasets to benchmark a state-of-the-art method, called Stock-Chain, for effectively tackling the financial analysis task.
arXiv Detail & Related papers (2024-03-19T09:45:33Z)
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models [52.98743860365194]
We propose a new fine-tuning method called Self-Play fIne-tuNing (SPIN).
At the heart of SPIN lies a self-play mechanism, where the LLM refines its capability by playing against instances of itself.
This sheds light on the promise of self-play, enabling the achievement of human-level performance in LLMs without the need for expert opponents.
arXiv Detail & Related papers (2024-01-02T18:53:13Z)
- Integrating Stock Features and Global Information via Large Language Models for Enhanced Stock Return Prediction [5.762650600435391]
We propose a novel framework consisting of two components to surmount the challenges of integrating Large Language Models with existing quantitative models.
We have demonstrated superior performance in Rank Information Coefficient and returns, particularly compared to models relying only on stock features in the China A-share market.
arXiv Detail & Related papers (2023-10-09T11:34:18Z)
- Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models [51.3422222472898]
We document the capability of large language models (LLMs) like ChatGPT to predict stock price movements using news headlines.
We develop a theoretical model incorporating information capacity constraints, underreaction, limits-to-arbitrage, and LLMs.
arXiv Detail & Related papers (2023-04-15T19:22:37Z)
- Compatible deep neural network framework with financial time series data, including data preprocessor, neural network model and trading strategy [2.347843817145202]
This research introduces a new deep neural network architecture and a novel idea of how to prepare financial data before feeding them to the model.
Three different datasets are used to evaluate this method, where results indicate that this framework can provide us with profitable and robust predictions.
arXiv Detail & Related papers (2022-05-11T20:44:08Z)
- Towards Open-World Feature Extrapolation: An Inductive Graph Learning Approach [80.8446673089281]
We propose a new learning paradigm with graph representation and learning.
Our framework contains two modules: 1) a backbone network (e.g., feedforward neural nets) as a lower model takes features as input and outputs predicted labels; 2) a graph neural network as an upper model learns to extrapolate embeddings for new features via message passing over a feature-data graph built from observed data.
arXiv Detail & Related papers (2021-10-09T09:02:45Z)