Related papers: Reinforcement Learning for Stock Transactions

URL: http://arxiv.org/abs/2505.16099v2
Date: Sat, 24 May 2025 01:44:46 GMT
Title: Reinforcement Learning for Stock Transactions
Authors: Ziyi Zhou, Nicholas Stern, Julien Laasri
Abstract summary: We train a series of agents using Q-Learning, Q-Learning with linear function approximation, and deep Q-Learning. We try to predict the stock prices using machine learning regression and classification models.
Score: 1.9578448731837585
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Much research has been done to analyze the stock market. After all, if one can determine a pattern in the chaotic frenzy of transactions, then they could make a hefty profit from capitalizing on these insights. As such, the goal of our project was to apply reinforcement learning (RL) to determine the best time to buy a stock within a given time frame. With only a few adjustments, our model can be extended to identify the best time to sell a stock as well. In order to use the format of free, real-world data to train the model, we define our own Markov Decision Process (MDP) problem. These two papers [5] [6] helped us in formulating the state space and the reward system of our MDP problem. We train a series of agents using Q-Learning, Q-Learning with linear function approximation, and deep Q-Learning. In addition, we try to predict the stock prices using machine learning regression and classification models. We then compare our agents to see if they converge on a policy, and if so, which one learned the best policy to maximize profit on the stock market.
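To make the buy-timing idea in the abstract concrete, here is a minimal tabular Q-Learning sketch. It is not the authors' MDP: the abstract does not specify their state space or reward, so everything below is an assumption for illustration. The hypothetical setup draws an independent price from `PRICES` at each step, uses the state `(time step, price)`, offers the actions `WAIT` and `BUY`, pays a reward of minus the purchase price, and forces a purchase at the final step.

```python
import random

PRICES = (1, 2, 3)   # hypothetical discrete price levels (assumption)
WAIT, BUY = 0, 1     # the two actions of this toy MDP


def greedy_action(q, state):
    """Return the action with the higher Q-value (ties go to WAIT)."""
    return WAIT if q[state][WAIT] >= q[state][BUY] else BUY


def train_buy_timing(episodes=20000, horizon=3, alpha=0.1, gamma=1.0,
                     epsilon=0.1, seed=0):
    """Tabular Q-Learning on a toy 'when to buy' problem.

    State: (time step, current price). Reward: -price when buying, 0 when
    waiting. The agent must buy at the last step of the horizon.
    """
    rng = random.Random(seed)
    q = {(t, p): [0.0, 0.0] for t in range(horizon) for p in PRICES}

    def state_value(state):
        # At the deadline only BUY is available, so its value is Q(s, BUY).
        if state[0] == horizon - 1:
            return q[state][BUY]
        return max(q[state])

    for _ in range(episodes):
        price = rng.choice(PRICES)
        for t in range(horizon):
            state = (t, price)
            if t == horizon - 1:
                action = BUY                      # forced purchase
            elif rng.random() < epsilon:
                action = rng.choice((WAIT, BUY))  # explore
            else:
                action = greedy_action(q, state)  # exploit

            if action == BUY:
                target = -price                   # terminal reward
            else:
                next_price = rng.choice(PRICES)
                target = gamma * state_value((t + 1, next_price))

            # Standard Q-Learning update toward the bootstrapped target.
            q[state][action] += alpha * (target - q[state][action])

            if action == BUY:
                break
            price = next_price
    return q


q_table = train_buy_timing()
for p in PRICES:
    print(p, "BUY" if greedy_action(q_table, (0, p)) == BUY else "WAIT")
```

In this toy setting the learned policy at the first step should buy at the lowest price and wait at the highest one, since waiting costs about 5/3 in expectation. Real price data is neither independent nor this coarsely discretized, which is one reason the abstract also mentions linear function approximation and deep Q-Learning.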

Related papers

On Evaluating Loss Functions for Stock Ranking: An Empirical Analysis With Transformer Model [0.0]
Transformer models are promising for understanding financial time series. But how different training loss functions affect their ability to rank stocks well is not yet fully understood.
arXiv Detail & Related papers (2025-10-15T23:06:02Z)
Self-Evolving Curriculum for LLM Reasoning [108.23021254812258]
Self-Evolving Curriculum (SEC) is an automatic curriculum learning method that learns a curriculum policy concurrently with the RL fine-tuning process. Our experiments demonstrate that SEC significantly improves models' reasoning capabilities, enabling better generalization to harder, out-of-distribution test problems.
arXiv Detail & Related papers (2025-05-20T23:17:15Z)
Ranked from Within: Ranking Large Multimodal Models Without Labels [73.96543593298426]
We show that uncertainty scores derived from softmax distributions provide a robust basis for ranking models across various tasks. This facilitates the ranking of LMMs on unlabeled data, providing a practical approach for selecting models for diverse target domains without requiring manual annotation.
arXiv Detail & Related papers (2024-12-09T13:05:43Z)
Dynamic Uncertainty Ranking: Enhancing Retrieval-Augmented In-Context Learning for Long-Tail Knowledge in LLMs [50.29035873837]
Large language models (LLMs) can learn vast amounts of knowledge from diverse domains during pre-training. Long-tail knowledge from specialized domains is often scarce and underrepresented, rarely appearing in the models' memorization. We propose a reinforcement learning-based dynamic uncertainty ranking method for ICL that accounts for the varying impact of each retrieved sample on LLM predictions.
arXiv Detail & Related papers (2024-10-31T03:42:17Z)
Value-Distributional Model-Based Reinforcement Learning [59.758009422067]
Quantifying uncertainty about a policy's long-term performance is important to solve sequential decision-making tasks. We study the problem from a model-based Bayesian reinforcement learning perspective. We propose Epistemic Quantile-Regression (EQR), a model-based algorithm that learns a value distribution function.
arXiv Detail & Related papers (2023-08-12T14:59:19Z)
Actions Speak What You Want: Provably Sample-Efficient Reinforcement Learning of the Quantal Stackelberg Equilibrium from Strategic Feedbacks [94.07688076435818]
We study reinforcement learning for learning a Quantal Stackelberg Equilibrium (QSE) in an episodic Markov game with a leader-follower structure. Our algorithms are based on (i) learning the quantal response model via maximum likelihood estimation and (ii) model-free or model-based RL for solving the leader's decision making problem.
arXiv Detail & Related papers (2023-07-26T10:24:17Z)
Short-Term Stock Price Forecasting using exogenous variables and Machine Learning Algorithms [3.2732602885346576]
This research paper compares four machine learning models and their accuracy in forecasting three well-known stocks traded in the NYSE from March 2020 to May 2022. We deploy, develop, and tune XGBoost, Random Forest, Multi-layer Perceptron, and Support Vector Regression models. Using a training data set of 240 trading days, we find that XGBoost gives the highest accuracy despite running longer.
arXiv Detail & Related papers (2023-05-17T07:04:32Z)
MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes. This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people. We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z)
A Deep Reinforcement Learning Trader without Offline Training [0.0]
We use Double Deep $Q$-learning in the episodic setting with Fast Learning Networks approximating the expected reward $Q$. We define the possible terminal states of an episode in such a way as to introduce a mechanism to conserve some of the money in the trading pool when market conditions are seen as unfavourable.
arXiv Detail & Related papers (2023-03-01T09:34:52Z)
Deep Reinforcement Learning Approach for Trading Automation in The Stock Market [0.0]
This paper presents a model to generate profitable trades in the stock market using Deep Reinforcement Learning (DRL) algorithms. We formulate the trading problem as a Partially Observed Markov Decision Process (POMDP) model, considering the constraints imposed by the stock market. We then solve the formulated POMDP problem using the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm, reporting a 2.68 Sharpe Ratio on an unseen data set.
arXiv Detail & Related papers (2022-07-05T11:34:29Z)
Comparative Study of Machine Learning Models for Stock Price Prediction [0.0]
We apply machine learning techniques to historical stock prices to forecast future prices. We quantify the results by computing the error of the predicted values versus the historical values of each stock. This method could be used to automate portfolio generation for a target return rate.
arXiv Detail & Related papers (2022-01-31T17:16:27Z)
Combining Machine Learning Classifiers for Stock Trading with Effective Feature Extraction [0.4199844472131921]
A machine learning model can make a significant profit in the US stock market by performing live trading. Our work showcased that mixtures of weighted classifiers perform better than any individual predictor at making trading decisions in the stock market.
arXiv Detail & Related papers (2021-07-28T03:22:58Z)
Learning Multiple Stock Trading Patterns with Temporal Routing Adaptor and Optimal Transport [8.617532047238461]
We propose a novel architecture, Temporal Routing Adaptor (TRA), to empower existing stock prediction models with the ability to model multiple stock trading patterns. TRA is a lightweight module that consists of a set of independent predictors for learning multiple patterns as well as a router to dispatch samples to different predictors. We show that the proposed method can improve information coefficient (IC) from 0.053 to 0.059 and 0.051 to 0.056 respectively.
arXiv Detail & Related papers (2021-06-24T12:19:45Z)
Deep Stock Predictions [58.720142291102135]
We consider the design of a trading strategy that performs portfolio optimization using Long Short Term Memory (LSTM) neural networks. We then customize the loss function used to train the LSTM to increase the profit earned. We find the LSTM model with the customized loss function to have an improved performance in the trading bot over a regressive baseline such as ARIMA.
arXiv Detail & Related papers (2020-06-08T23:37:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.