FlowOE: Imitation Learning with Flow Policy from Ensemble RL Experts for Optimal Execution under Heston Volatility and Concave Market Impacts
- URL: http://arxiv.org/abs/2506.05755v1
- Date: Fri, 06 Jun 2025 05:28:22 GMT
- Title: FlowOE: Imitation Learning with Flow Policy from Ensemble RL Experts for Optimal Execution under Heston Volatility and Concave Market Impacts
- Authors: Yang Li, Zhi Chen
- Abstract summary: FlowOE is a novel imitation learning framework based on flow matching models. FlowOE learns from a diverse set of traditional expert strategies and adaptively selects the most suitable expert behavior for prevailing market conditions.
- Score: 11.523583937607622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Optimal execution in financial markets refers to the process of strategically transacting a large volume of assets over a period to achieve the best possible outcome by balancing the trade-off between market impact costs and timing or volatility risks. Traditional optimal execution strategies, such as static Almgren-Chriss models, often prove suboptimal in dynamic financial markets. This paper proposes FlowOE, a novel imitation learning framework based on flow matching models, to address these limitations. FlowOE learns from a diverse set of traditional expert strategies and adaptively selects the most suitable expert behavior for prevailing market conditions. A key innovation is the incorporation of a refining loss function during the imitation process, enabling FlowOE not only to mimic but also to improve upon the learned expert actions. To the best of our knowledge, this work is the first to apply flow matching models to a stochastic optimal execution problem. Empirical evaluations across various market conditions demonstrate that FlowOE significantly outperforms both the specifically calibrated expert models and other traditional benchmarks, achieving higher profits with reduced risk. These results underscore the practical applicability and potential of FlowOE to enhance adaptive optimal execution.
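For intuition, the conditional flow-matching objective that this family of imitation policies trains on can be sketched in a few lines. The sketch below is generic, not the authors' code: the network architecture, state encoding, and all names are hypothetical stand-ins, and the paper's refining loss would be an additional term on top of this imitation loss.

```python
import torch
import torch.nn as nn

class FlowPolicy(nn.Module):
    """Velocity field v_theta(a_t, t, s): predicts how a noisy action should
    move toward the expert action, conditioned on the market state s."""
    def __init__(self, state_dim: int, action_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, a_t, t, s):
        return self.net(torch.cat([a_t, t, s], dim=-1))

def flow_matching_loss(policy, state, expert_action):
    """Standard conditional flow-matching loss with linear interpolation:
    a_t = (1 - t) * noise + t * expert_action; target velocity is
    expert_action - noise."""
    noise = torch.randn_like(expert_action)
    t = torch.rand(expert_action.shape[0], 1)
    a_t = (1 - t) * noise + t * expert_action
    target_v = expert_action - noise
    pred_v = policy(a_t, t, state)
    return ((pred_v - target_v) ** 2).mean()
```

At inference time, an action is sampled by integrating the learned velocity field from Gaussian noise to t = 1, conditioned on the current market state.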
Related papers
- Your Offline Policy is Not Trustworthy: Bilevel Reinforcement Learning for Sequential Portfolio Optimization [82.03139922490796]
Reinforcement learning (RL) has shown significant promise for sequential portfolio optimization tasks, such as stock trading, where the objective is to maximize cumulative returns while minimizing risks using historical data. Traditional RL approaches often produce policies that merely memorize the optimal yet impractical buying and selling behaviors within the fixed dataset. Our approach frames portfolio optimization as a new type of partial-offline RL problem and makes two technical contributions.
arXiv Detail & Related papers (2025-05-19T06:37:25Z)
- FlowHFT: Imitation Learning via Flow Matching Policy for Optimal High-Frequency Trading under Diverse Market Conditions [10.253213044505431]
High-frequency trading (HFT) is a trading strategy that continuously monitors market states and places bid and ask orders at millisecond speeds. Traditional HFT approaches fit models with historical data and assume that future market states follow similar patterns. We propose FlowHFT, a novel imitation learning framework based on flow matching policy.
arXiv Detail & Related papers (2025-05-09T04:58:14Z)
- OptiGrad: A Fair and more Efficient Price Elasticity Optimization via a Gradient Based Learning [7.145413681946911]
This paper presents a novel approach to optimizing profit margins in non-life insurance markets through a gradient descent-based method.
It targets three key objectives: 1) maximizing profit margins, 2) ensuring conversion rates, and 3) enforcing fairness criteria such as demographic parity (DP).
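As a rough illustration of how such a gradient-based trade-off can be set up (all tensors, penalty weights, and names below are hypothetical; the paper's actual formulation may differ):

```python
import torch

def optigrad_style_loss(margin, conv_prob, group,
                        lam_conv=1.0, lam_dp=1.0, min_conversion=0.1):
    """Penalized scalar objective: maximize expected profit margin while
    keeping average conversion above a floor and equalizing conversion
    across two groups (a demographic-parity-style penalty)."""
    profit = (margin * conv_prob).mean()                      # expected profit
    conv_gap = torch.relu(min_conversion - conv_prob.mean())  # conversion floor
    dp_gap = (conv_prob[group == 0].mean()
              - conv_prob[group == 1].mean()).abs()           # parity gap
    return -profit + lam_conv * conv_gap + lam_dp * dp_gap
```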
arXiv Detail & Related papers (2024-04-16T04:21:59Z)
- Deep Hedging with Market Impact [0.20482269513546458]
We propose a novel general market impact dynamic hedging model based on Deep Reinforcement Learning (DRL).
The optimal policy obtained from the DRL model is analysed using several option hedging simulations and compared to commonly used procedures such as delta hedging.
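The delta-hedging comparator is standard and easy to state; a minimal Black-Scholes version (which ignores the market impact this paper models) looks like:

```python
from math import log, sqrt
from statistics import NormalDist

def bs_call_delta(S, K, r, sigma, tau):
    """Black-Scholes delta of a European call: N(d1)."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    return NormalDist().cdf(d1)

# Rebalance to hold `delta` shares of the underlying at each step.
shares_to_hold = bs_call_delta(S=100.0, K=100.0, r=0.01, sigma=0.2, tau=0.5)
```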
arXiv Detail & Related papers (2024-02-20T19:08:24Z)
- Mimicking Better by Matching the Approximate Action Distribution [48.95048003354255]
We introduce MAAD, a novel, sample-efficient on-policy algorithm for Imitation Learning from Observations.
We show that it requires considerably fewer interactions to achieve expert performance, outperforming current state-of-the-art on-policy methods.
arXiv Detail & Related papers (2023-06-16T12:43:47Z)
- HireVAE: An Online and Adaptive Factor Model Based on Hierarchical and Regime-Switch VAE [113.47287249524008]
Building a factor model that can perform stock prediction in an online and adaptive setting remains an open question.
We propose the first deep learning-based online and adaptive factor model, HireVAE, at the core of which is a hierarchical latent space that embeds the relationship between the market situation and stock-wise latent factors.
Across four commonly used real stock market benchmarks, the proposed HireVAE demonstrates superior performance in terms of active returns over previous methods.
arXiv Detail & Related papers (2023-06-05T12:58:13Z)
- Stochastic Methods for AUC Optimization subject to AUC-based Fairness Constraints [51.12047280149546]
A direct approach for obtaining a fair predictive model is to train the model by optimizing its prediction performance subject to fairness constraints.
We formulate the training problem of a fairness-aware machine learning model as an AUC optimization problem subject to a class of AUC-based fairness constraints.
We demonstrate the effectiveness of our approach on real-world data under different fairness metrics.
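One common way to make such a problem trainable is a pairwise surrogate for AUC plus a penalty on the gap between group-conditional AUC surrogates; the sketch below is a generic illustration, not the paper's exact stochastic algorithm, and all names are hypothetical.

```python
import torch

def pairwise_auc_surrogate(scores_pos, scores_neg):
    """Pairwise squared-hinge surrogate for 1 - AUC: penalizes (pos, neg)
    pairs whose score margin falls below 1."""
    diff = scores_pos.unsqueeze(1) - scores_neg.unsqueeze(0)
    return torch.relu(1.0 - diff).pow(2).mean()

def fair_auc_loss(s_pos_a, s_neg_a, s_pos_b, s_neg_b, lam=1.0):
    """Overall AUC surrogate plus a penalty on the gap between the two
    group-conditional surrogates (an AUC-based fairness constraint)."""
    loss_a = pairwise_auc_surrogate(s_pos_a, s_neg_a)
    loss_b = pairwise_auc_surrogate(s_pos_b, s_neg_b)
    overall = pairwise_auc_surrogate(torch.cat([s_pos_a, s_pos_b]),
                                     torch.cat([s_neg_a, s_neg_b]))
    return overall + lam * (loss_a - loss_b).abs()
```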
arXiv Detail & Related papers (2022-12-23T22:29:08Z)
- Bayesian Bilinear Neural Network for Predicting the Mid-price Dynamics in Limit-Order Book Markets [84.90242084523565]
Traditional time-series econometric methods often appear incapable of capturing the true complexity of the multi-level interactions driving the price dynamics.
By adopting a state-of-the-art second-order optimization algorithm, we train a Bayesian bilinear neural network with temporal attention.
Using predictive distributions to analyze errors and uncertainties in the estimated parameters and model forecasts, we thoroughly compare our Bayesian model with traditional ML alternatives.
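The bilinear building block itself is simple; a minimal, non-Bayesian sketch of a bilinear map over limit-order-book snapshots with softmax temporal attention might look like the following (shapes and names are hypothetical, and the paper's Bayesian treatment and second-order optimizer are omitted):

```python
import torch
import torch.nn as nn

class BilinearLOB(nn.Module):
    """Maps a window of LOB snapshots (batch, T, F) to a mid-price move score
    via attention-weighted temporal pooling and a bilinear form."""
    def __init__(self, num_features: int, hidden: int = 32):
        super().__init__()
        self.attn = nn.Linear(num_features, 1)                    # temporal attention
        self.bilinear = nn.Bilinear(num_features, num_features,
                                    hidden)                       # 2nd-order terms
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (batch, T, F)
        w = torch.softmax(self.attn(x), dim=1)  # (batch, T, 1)
        pooled = (w * x).sum(dim=1)             # (batch, F)
        h = torch.relu(self.bilinear(pooled, pooled))
        return self.head(h)
```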
arXiv Detail & Related papers (2022-03-07T18:59:54Z)
- Deep Reinforcement Learning and Convex Mean-Variance Optimisation for Portfolio Management [0.0]
Reinforcement learning (RL) methods do not rely on explicit forecasts and are better suited for multi-stage decision processes.
Experiments were conducted on three markets in different economies with different overall trends.
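For reference, the mean-variance benchmark has a closed form in the unconstrained case, which makes the comparison concrete; the minimal version below is simplified relative to any constrained setup the paper may use, and the inputs are made-up numbers.

```python
import numpy as np

def mean_variance_weights(mu, cov, risk_aversion=1.0):
    """Unconstrained Markowitz solution w* = (1/gamma) * Sigma^{-1} mu,
    normalized here to be fully invested."""
    w = np.linalg.solve(cov, mu) / risk_aversion
    return w / w.sum()

mu = np.array([0.08, 0.05, 0.03])   # hypothetical expected returns
cov = np.diag([0.04, 0.02, 0.01])   # hypothetical covariance matrix
print(mean_variance_weights(mu, cov))
```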
arXiv Detail & Related papers (2022-02-13T10:12:09Z)
- Adaptive learning for financial markets mixing model-based and model-free RL for volatility targeting [0.0]
Model-free reinforcement learning has achieved meaningful results in stable environments but, to this day, remains problematic in regime-changing environments such as financial markets.
We propose to combine the best of the two techniques by selecting various model-based approaches thanks to Model-Free Deep Reinforcement Learning.
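Volatility targeting itself is a simple sizing rule, which is what makes it a natural model-based candidate to select among; a minimal sketch (parameter values are illustrative):

```python
import numpy as np

def vol_target_weight(returns, target_vol=0.10, window=20, cap=2.0):
    """Scale exposure inversely to realized volatility so the position
    targets a roughly constant annualized volatility."""
    realized = np.std(returns[-window:]) * np.sqrt(252)  # annualized vol
    return min(target_vol / max(realized, 1e-8), cap)    # capped leverage
```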
arXiv Detail & Related papers (2021-04-19T19:20:22Z)
- Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between noisy, imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
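Schematically, the oracle-distillation idea adds an imitation term that pulls the student, which sees only realistic observations, toward a teacher trained with perfect information; the generic sketch below uses hypothetical names and is not the paper's exact objective.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, rl_loss, beta=0.5):
    """Combine the student's own RL objective with a KL term toward the
    oracle teacher's action distribution."""
    kl = F.kl_div(F.log_softmax(student_logits, dim=-1),
                  F.softmax(teacher_logits, dim=-1),
                  reduction="batchmean")
    return rl_loss + beta * kl
```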
arXiv Detail & Related papers (2021-01-28T05:52:18Z)