Related papers: Quantum Reinforcement Learning Trading Agent for Sector Rotation in the Taiwan Stock Market

URL: http://arxiv.org/abs/2506.20930v1
Date: Thu, 26 Jun 2025 01:29:19 GMT
Title: Quantum Reinforcement Learning Trading Agent for Sector Rotation in the Taiwan Stock Market
Authors: Chi-Sheng Chen, Xinyu Zhang, Ya-Chuan Chen
Abstract summary: We propose a hybrid quantum-classical reinforcement learning framework for sector rotation in the Taiwan stock market. Although quantum-enhanced models consistently achieve higher training rewards, they underperform classical models in real-world investment metrics. This discrepancy highlights a core challenge in applying reinforcement learning to financial domains.
Score: 7.360168388085351
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose a hybrid quantum-classical reinforcement learning framework for sector rotation in the Taiwan stock market. Our system employs Proximal Policy Optimization (PPO) as the backbone algorithm and integrates both classical architectures (LSTM, Transformer) and quantum-enhanced models (QNN, QRWKV, QASA) as policy and value networks. An automated feature engineering pipeline extracts financial indicators from capital share data to ensure consistent model input across all configurations. Empirical backtesting reveals a key finding: although quantum-enhanced models consistently achieve higher training rewards, they underperform classical models in real-world investment metrics such as cumulative return and Sharpe ratio. This discrepancy highlights a core challenge in applying reinforcement learning to financial domains -- namely, the mismatch between proxy reward signals and true investment objectives. Our analysis suggests that current reward designs may incentivize overfitting to short-term volatility rather than optimizing risk-adjusted returns. This issue is compounded by the inherent expressiveness and optimization instability of quantum circuits under Noisy Intermediate-Scale Quantum (NISQ) constraints. We discuss the implications of this reward-performance gap and propose directions for future improvement, including reward shaping, model regularization, and validation-based early stopping. Our work offers a reproducible benchmark and critical insights into the practical challenges of deploying quantum reinforcement learning in real-world finance.
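
The abstract contrasts training reward with two investment metrics, cumulative return and Sharpe ratio. As a rough illustration of what those metrics measure, the following is a minimal sketch, not the authors' evaluation code. The function names, the zero risk-free rate, the 252-period annualization factor, and the sample returns are all assumptions made for this example.

```python
import math
from statistics import mean, stdev


def cumulative_return(returns):
    """Compound per-period simple returns into a single total return."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio using the sample standard deviation.

    risk_free is a per-period rate; periods_per_year=252 assumes daily data.
    Both defaults are illustrative assumptions, not values from the paper.
    """
    excess = [r - risk_free for r in returns]
    sd = stdev(excess)
    if sd == 0.0:
        return float("nan")  # undefined when returns have no variation
    return mean(excess) / sd * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Hypothetical daily returns, used only to exercise the functions.
    daily = [0.01, -0.005, 0.002, 0.007, -0.003]
    print("cumulative return:", cumulative_return(daily))
    print("annualized Sharpe:", sharpe_ratio(daily))
```

A policy can collect a high per-step training reward while scoring poorly on both functions, which is the reward-performance gap the abstract describes.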

Related papers

Mitigating Reward Hacking in RLHF via Bayesian Non-negative Reward Modeling [49.41422138354821]
We propose a principled reward modeling framework that integrates non-negative factor analysis into the Bradley-Terry preference model. BNRM represents rewards through a sparse, non-negative latent factor generative process. We show that BNRM substantially mitigates reward over-optimization, improves robustness under distribution shifts, and yields more interpretable reward decompositions than strong baselines.
arXiv Detail & Related papers (2026-02-11T08:14:11Z)
Continual Quantum Architecture Search with Tensor-Train Encoding: Theory and Applications to Signal Processing [68.35481158940401]
CL-QAS is a continual quantum architecture search framework. It mitigates challenges of costly encoding amplitude and forgetting in variational quantum circuits. It achieves controllable robustness expressivity, sample-efficient generalization, and smooth convergence without barren plateaus.
arXiv Detail & Related papers (2026-01-10T02:36:03Z)
Trade in Minutes! Rationality-Driven Agentic System for Quantitative Financial Trading [57.28635022507172]
TiMi is a rationality-driven multi-agent system that architecturally decouples strategy development from minute-level deployment. We propose a two-tier analytical paradigm from macro patterns to micro customization, layered programming design for trading bot implementation, and closed-loop optimization driven by mathematical reflection.
arXiv Detail & Related papers (2025-10-06T13:08:55Z)
Signature-Informed Transformer for Asset Allocation [9.290367832033063]
Signature-Informed Transformer is a framework that learns end-to-end allocation policies by directly optimizing a risk-aware financial objective. Evaluated on daily S&P 100 equity data, SIT decisively outperforms traditional and deep-learning baselines. Results indicate that portfolio-aware objectives and geometry-aware inductive biases are essential for risk-aware capital allocation in machine-learning systems.
arXiv Detail & Related papers (2025-10-03T15:58:21Z)
Hybrid Quantum-Classical Neural Networks for Few-Shot Credit Risk Assessment [52.05742536403784]
This work tackles the challenge of few-shot credit risk assessment. We design and implement a novel hybrid quantum-classical workflow. A Quantum Neural Network (QNN) was trained via the parameter-shift rule. On a real-world credit dataset of 279 samples, our QNN achieved a robust average AUC of 0.852 +/- 0.027 in simulations and yielded an impressive AUC of 0.88 in the hardware experiment.
arXiv Detail & Related papers (2025-09-17T08:36:05Z)
TensoMeta-VQC: A Tensor-Train-Guided Meta-Learning Framework for Robust and Scalable Variational Quantum Computing [60.996803677584424]
TensoMeta-VQC is a novel tensor-train (TT)-guided meta-learning framework designed to improve the robustness and scalability of VQC significantly. Our framework fully delegates the generation of quantum circuit parameters to a classical TT network, effectively decoupling optimization from quantum hardware.
arXiv Detail & Related papers (2025-08-01T23:37:55Z)
Intra-Trajectory Consistency for Reward Modeling [67.84522106537274]
We develop an intra-trajectory consistency regularization to enforce that adjacent processes with higher next-token generation probability maintain more consistent rewards. We show that the reward model trained with the proposed regularization induces better DPO-aligned policies and achieves better best-of-N (BON) inference-time verification results.
arXiv Detail & Related papers (2025-06-10T12:59:14Z)
Discriminative Policy Optimization for Token-Level Reward Models [55.98642069903191]
Process reward models (PRMs) provide more nuanced supervision compared to outcome reward models (ORMs). Q-RM explicitly learns token-level Q-functions from preference data without relying on fine-grained annotations. Reinforcement learning with Q-RM significantly enhances training efficiency, achieving convergence 12 times faster than ORM on GSM8K and 11 times faster than step-level PRM on MATH.
arXiv Detail & Related papers (2025-05-29T11:40:34Z)
End-to-End Portfolio Optimization with Quantum Annealing [0.48516757555267037]
Using hybrid quantum-classical models, the study shows combined approaches effectively handle complex optimization better than classical methods. Empirical results demonstrate a portfolio increase of 200,000 Indian Rupees over the benchmark.
arXiv Detail & Related papers (2025-04-10T21:31:30Z)
HQNN-FSP: A Hybrid Classical-Quantum Neural Network for Regression-Based Financial Stock Market Prediction [3.5418331252013897]
This study explores the potential of hybrid quantum-classical approaches to assist in financial trend prediction. A custom Quantum Neural Network (QNN) regressor is introduced, designed with a novel ansatz tailored for financial applications.
arXiv Detail & Related papers (2025-03-19T16:44:21Z)
Contextual Quantum Neural Networks for Stock Price Prediction [0.0]
We apply quantum machine learning (QML) to predict the stock prices of multiple assets using a contextual quantum neural network. This architecture represents the first of its kind in quantum finance, offering superior predictive power and computational efficiency. Our findings highlight the transformative potential of QML in financial applications, paving the way for more advanced, resource-efficient quantum algorithms.
arXiv Detail & Related papers (2025-02-26T22:39:23Z)
Benchmarking Post-Training Quantization in LLMs: Comprehensive Taxonomy, Unified Evaluation, and Comparative Analysis [89.60263788590893]
Post-training quantization (PTQ) has been extensively adopted for compressing large language models (LLMs). Existing algorithms focus primarily on performance, overlooking the trade-off among model size, performance, and quantization bitwidth. We provide a novel benchmark for LLM PTQ in this paper.
arXiv Detail & Related papers (2025-02-18T07:35:35Z)
LEP-QNN: Loan Eligibility Prediction Using Quantum Neural Networks [4.2435928520499635]
We propose a novel approach that employs Quantum Machine Learning (QML) for Loan Eligibility Prediction using Quantum Neural Networks (LEP-QNN). Our innovative approach achieves an accuracy of 98% in predicting loan eligibility from a single, comprehensive dataset. This research showcases the potential of QML in financial predictions and establishes a foundational guide for advancing QML technologies.
arXiv Detail & Related papers (2024-12-04T09:35:03Z)
BreakGPT: Leveraging Large Language Models for Predicting Asset Price Surges [55.2480439325792]
This paper introduces BreakGPT, a novel large language model (LLM) architecture adapted specifically for time series forecasting and the prediction of sharp upward movements in asset prices. We showcase BreakGPT as a promising solution for financial forecasting with minimal training and as a strong competitor for capturing both local and global temporal dependencies.
arXiv Detail & Related papers (2024-11-09T05:40:32Z)
VinePPO: Refining Credit Assignment in RL Training of LLMs [66.80143024475635]
We propose VinePPO, a straightforward approach that leverages the flexibility of language environments to compute unbiased Monte Carlo-based estimates. Our method consistently outperforms PPO and other baselines across MATH and GSM8K datasets in less wall-clock time.
arXiv Detail & Related papers (2024-10-02T15:49:30Z)
QADQN: Quantum Attention Deep Q-Network for Financial Market Prediction [6.3671741591443105]
This paper introduces a Quantum Attention Deep Q-Network (QADQN) approach to address these challenges through quantum-enhanced reinforcement learning. We gauge the QADQN agent's performance on historical data from major market indices, including the S&P 500. Our empirical results demonstrate the QADQN's superior performance, achieving better risk-adjusted returns.
arXiv Detail & Related papers (2024-08-06T10:41:46Z)
Dynamic Asset Allocation with Expected Shortfall via Quantum Annealing [0.0]
We propose a hybrid quantum-classical algorithm to solve a dynamic asset allocation problem. We compare the results from D-Wave's 2000Q and Advantage quantum annealers using real-world financial data. Experiments on assets with higher correlations tend to perform better, which may help to design practical quantum applications in the near term.
arXiv Detail & Related papers (2021-12-06T17:39:43Z)
Model-Augmented Q-learning [112.86795579978802]
We propose an MFRL framework that is augmented with the components of model-based RL. Specifically, we propose to estimate not only the $Q$-values but also both the transition and the reward with a shared network. We show that the proposed scheme, called Model-augmented $Q$-learning (MQL), obtains a policy-invariant solution which is identical to the solution obtained by learning with true reward.
arXiv Detail & Related papers (2021-02-07T17:56:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.