Stock Market Sentiment Classification and Backtesting via Fine-tuned
BERT
- URL: http://arxiv.org/abs/2309.11979v1
- Date: Thu, 21 Sep 2023 11:26:36 GMT
- Title: Stock Market Sentiment Classification and Backtesting via Fine-tuned
BERT
- Authors: Jiashu Lou
- Abstract summary: Starting from the theory of emotion, this paper takes East Money as an example and crawls user comment titles from its corresponding stock forum.
The crawled comments are labeled with emotional polarity by the fine-tuned BERT model, and the resulting labels are combined with the Alpha191 factor model.
A regression model is used to predict the average price change over the next five days, and this prediction serves as a signal to guide automatic trading.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the rapid development of big data and computing devices, low-latency automatic trading platforms based on real-time information acquisition have become a main component of the stock trading market, so the topic of quantitative trading has received widespread attention. In markets that are not strongly efficient, human emotions and expectations largely drive market trends and trading decisions. Therefore, starting from the theory of emotion, this paper takes East Money as an example, crawls user comment titles from its corresponding stock forum, and cleans the data. A BERT natural language processing model is then constructed and fine-tuned on existing annotated datasets. Experimental results show that the fine-tuned model improves, to varying degrees, on both the original model and the baseline model. The crawled user comments are then labeled with emotional polarity by this model, and the resulting labels are combined with the Alpha191 factors in a regression, yielding significant regression results. The regression model is then used to predict the average price change over the next five days, and this prediction serves as a signal to guide automatic trading. Experimental results show that incorporating emotional factors increased the rate of return by 73.8% compared to the baseline during the trading period, and by 32.41% compared to the original Alpha191 model. Finally, we discuss the advantages and disadvantages of incorporating emotional factors into quantitative trading and suggest possible directions for future research.
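To make the pipeline concrete, here is a minimal sketch of its main steps: fine-tune BERT on annotated sentiment data, score the crawled comment titles, aggregate a daily sentiment factor, and regress the five-day average price change on that factor together with Alpha191-style factors to obtain a trading signal. It assumes the Hugging Face transformers, PyTorch, pandas, and scikit-learn libraries; the model name, column names, binary polarity labels, linear regression, and the thresholded buy signal are illustrative assumptions, not the paper's exact configuration.

    # Sketch only: fine-tune BERT for comment-title sentiment, build a daily sentiment
    # factor, and regress the 5-day average price change on sentiment + Alpha191-style
    # factors (placeholder data, column names, and hyperparameters throughout).
    import pandas as pd
    import torch
    from torch.utils.data import Dataset, DataLoader
    from transformers import BertTokenizer, BertForSequenceClassification
    from sklearn.linear_model import LinearRegression

    class CommentDataset(Dataset):
        """Tokenized comment titles with polarity labels (0 = negative, 1 = positive)."""
        def __init__(self, texts, labels, tokenizer, max_len=64):
            self.enc = tokenizer(list(texts), truncation=True, padding=True,
                                 max_length=max_len, return_tensors="pt")
            self.labels = torch.tensor(list(labels))
        def __len__(self):
            return len(self.labels)
        def __getitem__(self, i):
            item = {k: v[i] for k, v in self.enc.items()}
            item["labels"] = self.labels[i]
            return item

    def fine_tune(texts, labels, model_name="bert-base-chinese", epochs=2):
        tokenizer = BertTokenizer.from_pretrained(model_name)
        model = BertForSequenceClassification.from_pretrained(model_name, num_labels=2)
        loader = DataLoader(CommentDataset(texts, labels, tokenizer),
                            batch_size=16, shuffle=True)
        optim = torch.optim.AdamW(model.parameters(), lr=2e-5)
        model.train()
        for _ in range(epochs):
            for batch in loader:
                optim.zero_grad()
                out = model(**batch)          # loss computed internally from "labels"
                out.loss.backward()
                optim.step()
        return tokenizer, model

    def daily_sentiment(comments: pd.DataFrame, tokenizer, model):
        """Score crawled comment titles and average the polarity per trading day."""
        model.eval()
        with torch.no_grad():
            enc = tokenizer(list(comments["title"]), truncation=True, padding=True,
                            max_length=64, return_tensors="pt")
            probs = torch.softmax(model(**enc).logits, dim=-1)[:, 1]   # P(positive)
        return comments.assign(sentiment=probs.numpy()).groupby("date")["sentiment"].mean()

    def fit_signal(alpha_factors: pd.DataFrame, sentiment: pd.Series,
                   future_5d_change: pd.Series):
        """Regress the next-5-day average price change on Alpha191-style factors plus
        the sentiment factor, and turn the fitted prediction into a long/flat signal."""
        X = alpha_factors.join(sentiment.rename("sentiment"), how="inner")
        y = future_5d_change.loc[X.index]
        reg = LinearRegression().fit(X, y)
        signal = pd.Series(reg.predict(X), index=X.index) > 0   # buy when predicted change is positive
        return reg, signal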
Related papers
- Trading through Earnings Seasons using Self-Supervised Contrastive Representation Learning [1.6574413179773761]
Contrastive Earnings Transformer (CET) is a self-supervised learning approach rooted in Contrastive Predictive Coding (CPC).
Our research delves deep into the intricacies of stock data, evaluating how various models handle the rapidly changing relevance of earnings data over time and over different sectors.
CET's foundation on CPC allows for a nuanced understanding, facilitating consistent stock predictions even as the earnings data ages.
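As a rough illustration of the CPC ingredient named above, the snippet below implements an InfoNCE-style contrastive objective over paired context/future windows (PyTorch assumed); the encoder, window sizes, and batch construction are placeholders, not CET's actual architecture.

    # InfoNCE-style contrastive objective over paired context/future windows
    # (illustrative only, not the CET architecture): each context window should
    # score its own future window higher than the futures of other batch samples.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WindowEncoder(nn.Module):
        def __init__(self, n_features, dim=64):
            super().__init__()
            self.gru = nn.GRU(n_features, dim, batch_first=True)
        def forward(self, x):                  # x: (batch, time, n_features)
            _, h = self.gru(x)
            return h[-1]                       # (batch, dim)

    def info_nce(context_z, future_z, temperature=0.1):
        context_z = F.normalize(context_z, dim=-1)
        future_z = F.normalize(future_z, dim=-1)
        logits = context_z @ future_z.t() / temperature   # (batch, batch) similarities
        targets = torch.arange(len(logits))               # diagonal entries are positives
        return F.cross_entropy(logits, targets)

    # Toy batch: 32 samples, 20-step context and 5-step future windows of 8 features.
    enc = WindowEncoder(n_features=8)
    loss = info_nce(enc(torch.randn(32, 20, 8)), enc(torch.randn(32, 5, 8)))
    loss.backward()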
arXiv Detail & Related papers (2024-09-25T22:09:59Z)
- When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments [55.19252983108372]
We have developed a multi-agent AI system called StockAgent, driven by LLMs.
The StockAgent allows users to evaluate the impact of different external factors on investor trading.
It avoids the test set leakage issue present in existing trading simulation systems based on AI Agents.
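A minimal sketch of a multi-agent trading simulation loop of the kind described above; the rule-based decide() stands in for an LLM call, and the price-impact model, agent profiles, and external factor are illustrative assumptions, not StockAgent's actual design.

    # Toy multi-agent trading simulation (illustrative; not StockAgent's prompts,
    # LLM backend, or market model). decide() is a stand-in for an LLM call that
    # would condition on the agent's profile, holdings, and external factors.
    import random
    from dataclasses import dataclass

    @dataclass
    class Agent:
        name: str
        cash: float = 10_000.0
        shares: int = 0
        def decide(self, price: float, news_sentiment: float) -> str:
            if news_sentiment > 0.2 and self.cash >= price:
                return "buy"
            if news_sentiment < -0.2 and self.shares > 0:
                return "sell"
            return "hold"

    def simulate(agents, steps=50, price=100.0):
        rng = random.Random(0)
        for _ in range(steps):
            sentiment = rng.uniform(-1, 1)            # an external factor shown to every agent
            orders = [(a, a.decide(price, sentiment)) for a in agents]
            for a, o in orders:
                if o == "buy":
                    a.cash -= price; a.shares += 1
                elif o == "sell":
                    a.cash += price; a.shares -= 1
            net_flow = sum(o == "buy" for _, o in orders) - sum(o == "sell" for _, o in orders)
            price *= 1 + 0.001 * net_flow             # toy price impact of aggregate order flow
        return price, agents

    final_price, agents = simulate([Agent(f"agent{i}") for i in range(10)])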
arXiv Detail & Related papers (2024-07-15T06:49:30Z)
- F-FOMAML: GNN-Enhanced Meta-Learning for Peak Period Demand Forecasting with Proxy Data [65.6499834212641]
We formulate the demand prediction as a meta-learning problem and develop the Feature-based First-Order Model-Agnostic Meta-Learning (F-FOMAML) algorithm.
By considering domain similarities through task-specific metadata, our model improves generalization, with the excess risk decreasing as the number of training tasks increases.
Compared to existing state-of-the-art models, our method demonstrates a notable improvement in demand prediction accuracy, reducing the Mean Absolute Error by 26.24% on an internal vending machine dataset and by 1.04% on the publicly accessible JD.com dataset.
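A minimal first-order MAML sketch, which is the optimization idea behind the "FO" in F-FOMAML; the GNN encoder, task metadata, and demand data of the paper are not reproduced, and the linear tasks below are synthetic.

    # First-order MAML on toy linear regression tasks (NumPy; illustrative only).
    # Each task is adapted with a few inner SGD steps; the meta-update applies the
    # query-set gradient evaluated at the adapted weights (first-order approximation).
    import numpy as np

    def grad_mse(w, X, y):
        return 2.0 * X.T @ (X @ w - y) / len(y)

    def fomaml(tasks, dim, inner_lr=0.01, outer_lr=0.1, inner_steps=5, meta_iters=200):
        w_meta = np.zeros(dim)
        for _ in range(meta_iters):
            outer_grads = []
            for X_sup, y_sup, X_qry, y_qry in tasks:
                w = w_meta.copy()
                for _ in range(inner_steps):                    # task adaptation
                    w -= inner_lr * grad_mse(w, X_sup, y_sup)
                outer_grads.append(grad_mse(w, X_qry, y_qry))   # gradient at adapted weights
            w_meta -= outer_lr * np.mean(outer_grads, axis=0)   # meta-update
        return w_meta

    # Synthetic related tasks: linear problems with nearby coefficient vectors.
    rng = np.random.default_rng(0)
    def make_task():
        true_w = rng.normal(1.0, 0.2, size=3)
        X = rng.normal(size=(40, 3))
        y = X @ true_w + 0.1 * rng.normal(size=40)
        return X[:20], y[:20], X[20:], y[20:]

    w_meta = fomaml([make_task() for _ in range(8)], dim=3)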
arXiv Detail & Related papers (2024-06-23T21:28:50Z)
- NoxTrader: LSTM-Based Stock Return Momentum Prediction for Quantitative Trading [0.0]
NoxTrader is a sophisticated system designed for portfolio construction and trading execution.
The underlying learning process of NoxTrader is rooted in the assimilation of valuable insights derived from historical trading data.
Our rigorous feature engineering and careful selection of prediction targets enable us to generate prediction data with an impressive correlation range between 0.65 and 0.75.
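The summary does not expose NoxTrader's features or targets, so the snippet below is only a generic sketch of the underlying idea: an LSTM maps a window of engineered features to a future-return prediction (PyTorch assumed; data and dimensions are placeholders).

    # Generic LSTM return predictor (illustrative; not NoxTrader's feature set,
    # targets, or trading logic). A window of engineered features maps to one
    # future-return estimate, trained with mean squared error.
    import torch
    import torch.nn as nn

    class ReturnLSTM(nn.Module):
        def __init__(self, n_features, hidden=32):
            super().__init__()
            self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
            self.head = nn.Linear(hidden, 1)
        def forward(self, x):                          # x: (batch, window, n_features)
            out, _ = self.lstm(x)
            return self.head(out[:, -1]).squeeze(-1)   # predict from the final step

    model = ReturnLSTM(n_features=10)
    optim = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    X = torch.randn(256, 30, 10)    # 256 samples, 30-day windows, 10 engineered features
    y = torch.randn(256)            # placeholder future returns
    for _ in range(5):
        optim.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        optim.step()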
arXiv Detail & Related papers (2023-10-01T17:53:23Z)
- HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE [113.47287249524008]
It remains an open question how to build a factor model that can perform stock prediction in an online and adaptive setting.
We propose the first deep learning-based online and adaptive factor model, HireVAE, at the core of which is a hierarchical latent space that embeds the relationship between the market situation and stock-wise latent factors.
Across four commonly used real stock market benchmarks, the proposed HireVAE demonstrates superior performance in terms of active returns over previous methods.
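As a rough sketch of the hierarchical-latent idea (a market-level latent inferred first, stock-level latents conditioned on it), the snippet below builds a two-level VAE in PyTorch; the regime-switching mechanism and the actual HireVAE architecture are not reproduced, and all dimensions and data are placeholders.

    # Two-level latent factor VAE sketch (illustrative; not HireVAE or its
    # regime-switch machinery). A market latent is inferred from market features,
    # stock latents are inferred conditioned on it, and stock returns are
    # reconstructed from both.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    def reparameterize(mu, logvar):
        return mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)

    def kl_to_standard_normal(mu, logvar):
        return -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()

    class HierarchicalFactorVAE(nn.Module):
        def __init__(self, market_dim, stock_dim, z_market=8, z_stock=4):
            super().__init__()
            self.market_enc = nn.Linear(market_dim, 2 * z_market)
            self.stock_enc = nn.Linear(stock_dim + z_market, 2 * z_stock)
            self.dec = nn.Linear(z_market + z_stock, 1)       # reconstruct each stock's return
        def forward(self, market_x, stock_x):
            mu_m, logvar_m = self.market_enc(market_x).chunk(2, dim=-1)
            z_m = reparameterize(mu_m, logvar_m)
            z_m_rep = z_m.unsqueeze(1).expand(-1, stock_x.size(1), -1)   # share across stocks
            mu_s, logvar_s = self.stock_enc(torch.cat([stock_x, z_m_rep], -1)).chunk(2, -1)
            z_s = reparameterize(mu_s, logvar_s)
            recon = self.dec(torch.cat([z_m_rep, z_s], -1)).squeeze(-1)
            kl = kl_to_standard_normal(mu_m, logvar_m) + kl_to_standard_normal(mu_s, logvar_s)
            return recon, kl

    # Toy cross-section: 16 days, 50 stocks, 20 market features, 12 stock features.
    model = HierarchicalFactorVAE(market_dim=20, stock_dim=12)
    recon, kl = model(torch.randn(16, 20), torch.randn(16, 50, 12))
    loss = F.mse_loss(recon, torch.randn(16, 50)) + 1e-3 * kl
    loss.backward()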
arXiv Detail & Related papers (2023-06-05T12:58:13Z)
- Predicting Day-Ahead Stock Returns using Search Engine Query Volumes: An Application of Gradient Boosted Decision Trees to the S&P 100 [0.0]
This paper aims to answer the question of whether search engine query volumes can be used to predict the future returns of stocks on financial capital markets.
It implements gradient boosted decision trees to learn relationships between abnormal returns of stocks within the S&P 100 index and lagged predictors derived from historical financial data.
It gives guidance on how to use and transform data on internet usage behavior for financial and economic modeling and forecasting.
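A minimal sketch of the gradient-boosted-trees setup described above, using scikit-learn; the lagged-return and lagged search-volume features, the synthetic series, and the day-ahead target are illustrative stand-ins for the paper's abnormal-return construction.

    # Gradient boosted trees relating lagged returns and lagged (log-transformed)
    # search-volume changes to the day-ahead return (scikit-learn; synthetic data
    # and simplified target, not the paper's abnormal-return setup on the S&P 100).
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import GradientBoostingRegressor

    def make_features(prices: pd.Series, queries: pd.Series, lags=5):
        ret = np.log(prices).diff()
        dq = np.log(queries + 1).diff()                 # log-change of query volume
        feats = {f"ret_lag{k}": ret.shift(k) for k in range(1, lags + 1)}
        feats.update({f"query_lag{k}": dq.shift(k) for k in range(1, lags + 1)})
        data = pd.DataFrame(feats).assign(target=ret.shift(-1)).dropna()
        return data.drop(columns="target"), data["target"]

    # Synthetic price and search-volume series standing in for a real stock.
    idx = pd.date_range("2020-01-01", periods=400, freq="B")
    rng = np.random.default_rng(1)
    prices = pd.Series(100 * np.exp(np.cumsum(0.01 * rng.normal(size=400))), index=idx)
    queries = pd.Series(rng.poisson(1000, size=400).astype(float), index=idx)

    X, y = make_features(prices, queries)
    model = GradientBoostingRegressor(n_estimators=200, max_depth=3, learning_rate=0.05)
    model.fit(X[:-60], y[:-60])              # train on all but the last 60 days
    pred = model.predict(X[-60:])            # out-of-sample day-ahead predictions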
arXiv Detail & Related papers (2022-05-31T14:58:46Z)
- Back2Future: Leveraging Backfill Dynamics for Improving Real-time Predictions in Future [73.03458424369657]
In real-time forecasting in public health, data collection is a non-trivial and demanding task.
The 'backfill' phenomenon and its effect on model performance have barely been studied in the prior literature.
We formulate a novel problem and neural framework Back2Future that aims to refine a given model's predictions in real-time.
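The framework itself is not described in enough detail here to reproduce, so the snippet below only illustrates the underlying idea in its simplest form: learn a correction that maps a base model's prediction plus the preliminary (not yet backfilled) report to the eventually revised value (scikit-learn; all data is synthetic).

    # Simplest possible "refine predictions using backfill" illustration (synthetic
    # data; not the Back2Future neural framework). The refiner learns how much the
    # initial report tends to be revised and corrects the base model accordingly.
    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(3)
    n = 300
    final_value = rng.normal(50, 10, size=n)                          # fully backfilled value
    initial_report = 0.8 * final_value + rng.normal(0, 2, size=n)     # systematically under-reported
    base_prediction = final_value + rng.normal(0, 5, size=n)          # a given model's forecast

    X = np.column_stack([base_prediction, initial_report])
    refiner = LinearRegression().fit(X[:200], final_value[:200])
    refined = refiner.predict(X[200:])

    base_mse = np.mean((base_prediction[200:] - final_value[200:]) ** 2)
    refined_mse = np.mean((refined - final_value[200:]) ** 2)          # typically lower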
arXiv Detail & Related papers (2021-06-08T14:48:20Z)
- Feature Learning for Stock Price Prediction Shows a Significant Role of Analyst Rating [0.38073142980733]
A set of 5 technical indicators and 23 fundamental indicators was identified to establish the possibility of generating excess returns on the stock market.
From any given day, we were able to predict the direction of a 1% price change up to 10 days into the future.
The predictions had an overall accuracy of 83.62% with a precision of 85% for buy signals and a recall of 100% for sell signals.
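A minimal sketch of a direction-of-change classifier over indicator features (scikit-learn); the 28 placeholder feature columns, synthetic labels, and random-forest choice are illustrative assumptions, not the paper's indicator set or reported metrics.

    # Direction-of-change classifier over indicator features (illustrative; synthetic
    # data stands in for the paper's 5 technical + 23 fundamental indicators and
    # analyst ratings, and the metrics here are not the paper's reported numbers).
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import precision_score, recall_score

    rng = np.random.default_rng(42)
    n = 2000
    X = rng.normal(size=(n, 28))                         # placeholder indicator values
    # 1 = price rises by at least 1% within the horizon, 0 = it falls (synthetic link)
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, shuffle=False)
    clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print("buy precision:", precision_score(y_te, pred))
    print("sell recall:", recall_score(y_te, pred, pos_label=0))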
arXiv Detail & Related papers (2021-03-13T03:56:29Z)
- A Sentiment Analysis Approach to the Prediction of Market Volatility [62.997667081978825]
We have explored the relationship between sentiment extracted from financial news and tweets and FTSE100 movements.
The sentiment captured from news headlines could be used as a signal to predict market returns; the same does not apply for volatility.
We developed an accurate classifier for the prediction of market volatility in response to the arrival of new information.
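A minimal sketch of a volatility-regime classifier driven by aggregated news and tweet sentiment (scikit-learn); the synthetic features and labels are placeholders, not the paper's FTSE100 data or model.

    # High/low next-day volatility classifier from aggregated sentiment features
    # (illustrative; synthetic placeholders, not the paper's FTSE100 dataset).
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(7)
    n = 1500
    # Daily features: mean news sentiment, mean tweet sentiment, news count, tweet count.
    X = np.column_stack([rng.normal(size=n), rng.normal(size=n),
                         rng.poisson(50, size=n), rng.poisson(500, size=n)]).astype(float)
    # Label: 1 if next-day realised volatility is above its median (synthetic link
    # through sentiment magnitude and news volume).
    score = np.abs(X[:, 0]) + 0.01 * X[:, 2] + rng.normal(scale=0.5, size=n)
    y = (score > np.median(score)).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, shuffle=False)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    print("AUC:", roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]))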
arXiv Detail & Related papers (2020-12-10T01:15:48Z)
- Semi-Supervised Models via Data Augmentation for Classifying Interactive Affective Responses [85.04362095899656]
We present semi-supervised models with data augmentation (SMDA), a semi-supervised text classification system to classify interactive affective responses.
For labeled sentences, we performed data augmentation to make the label distributions uniform and computed a supervised loss during the training process.
For unlabeled sentences, we explored self-training by regarding low-entropy predictions over unlabeled sentences as pseudo labels.
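A minimal sketch of the self-training step described above: predictions on unlabeled sentences with low entropy become pseudo-labels and are added to the training set (TF-IDF plus logistic regression stand in for the paper's model; the augmentation and consistency components are omitted, and the toy sentences are invented).

    # Self-training sketch: pseudo-label only the low-entropy unlabeled sentences,
    # then retrain on labeled + pseudo-labeled data (TF-IDF + logistic regression
    # stand in for the paper's model; data augmentation is omitted).
    import numpy as np
    import scipy.sparse as sp
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    labeled_texts = ["great support from everyone", "this made me really anxious",
                     "feeling hopeful today", "completely exhausted and sad"]
    labels = np.array([1, 0, 1, 0])          # 1 = positive affect, 0 = negative (toy labels)
    unlabeled_texts = ["so grateful for the help", "cannot stop worrying lately",
                       "things are looking up", "everything feels overwhelming"]

    vec = TfidfVectorizer()
    X_lab = vec.fit_transform(labeled_texts)
    X_unl = vec.transform(unlabeled_texts)
    clf = LogisticRegression(max_iter=1000).fit(X_lab, labels)

    probs = clf.predict_proba(X_unl)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1)
    keep = np.flatnonzero(entropy < 0.5)     # low-entropy predictions become pseudo-labels
    pseudo = probs.argmax(axis=1)

    X_all = sp.vstack([X_lab, X_unl[keep]])
    y_all = np.concatenate([labels, pseudo[keep]])
    clf = LogisticRegression(max_iter=1000).fit(X_all, y_all)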
arXiv Detail & Related papers (2020-04-23T05:02:31Z)