Optimal Execution Using Reinforcement Learning
- URL: http://arxiv.org/abs/2306.17178v1
- Date: Mon, 19 Jun 2023 07:09:59 GMT
- Title: Optimal Execution Using Reinforcement Learning
- Authors: Cong Zheng and Jiafa He and Can Yang
- Abstract summary: This work is about optimal order execution, where a large order is split into several small orders to minimize the implementation shortfall.
Based on the diversity of cryptocurrency exchanges, we attempt to extract cross-exchange signals by aligning data from multiple exchanges for the first time.
- Score: 6.905391624417593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work is about optimal order execution, where a large order is split into
several small orders to minimize the implementation shortfall. Based on the
diversity of cryptocurrency exchanges, we attempt to extract cross-exchange
signals by aligning data from multiple exchanges for the first time. Unlike
most previous studies that focused on using single-exchange information, we
discuss the impact of cross-exchange signals on the agent's decision-making in
the optimal execution problem. Experimental results show that cross-exchange
signals provide additional information that facilitates the optimal execution
of cryptocurrency orders.
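The quantity the abstract optimizes, implementation shortfall, is the gap between the value of filling the whole order at the arrival price and the value actually realized by the child orders. A minimal sketch for a buy order, with illustrative function and variable names that are not taken from the paper:

```python
def implementation_shortfall(arrival_price, fills):
    """Implementation shortfall per share for a buy order split into child orders.

    arrival_price: mid-price observed when the parent order was submitted.
    fills: list of (quantity, price) pairs for the executed child orders.
    Returns the average overpay per share versus the arrival-price benchmark
    (lower is better; an optimal-execution agent tries to minimize this).
    """
    total_qty = sum(q for q, _ in fills)
    paid = sum(q * p for q, p in fills)
    benchmark = total_qty * arrival_price
    return (paid - benchmark) / total_qty

# Example: a 300-share parent buy order split into three child orders.
fills = [(100, 10.00), (100, 10.02), (100, 10.05)]
cost = implementation_shortfall(10.00, fills)
```

In an RL formulation, this cost (possibly combined with a risk term) typically serves as the negative reward for the execution agent.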
Related papers
- Training Greedy Policy for Proposal Batch Selection in Expensive Multi-Objective Combinatorial Optimization [52.80408805368928]
We introduce a novel greedy-style subset selection algorithm for batch acquisition.
Our experiments on red fluorescent proteins show that the proposed method matches the baseline performance with 1.69x fewer queries.
arXiv Detail & Related papers (2024-06-21T05:57:08Z) - Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
arXiv Detail & Related papers (2024-06-12T16:58:41Z) - Binary Classifier Optimization for Large Language Model Alignment [4.61411484523337]
We present theoretical foundations to explain the successful alignment achieved through binary signals.
We identify two techniques for effective alignment: reward shift and underlying distribution matching.
Our model consistently demonstrates effective and robust alignment across two base LLMs and three different binary signal datasets.
arXiv Detail & Related papers (2024-04-06T15:20:59Z) - Data-Efficient Interactive Multi-Objective Optimization Using ParEGO [6.042269506496206]
Multi-objective optimization seeks to identify a set of non-dominated solutions that provide optimal trade-offs among competing objectives.
In practical applications, decision-makers (DMs) will select a single solution that aligns with their preferences to be implemented.
We propose two novel algorithms that efficiently locate the most preferred region of the Pareto front in expensive-to-evaluate problems.
arXiv Detail & Related papers (2024-01-12T15:55:51Z) - Delegating Data Collection in Decentralized Machine Learning [67.0537668772372]
Motivated by the emergence of decentralized machine learning (ML) ecosystems, we study the delegation of data collection.
We design optimal and near-optimal contracts that deal with two fundamental information asymmetries.
We show that a principal can cope with such asymmetry via simple linear contracts that achieve 1-1/e fraction of the optimal utility.
arXiv Detail & Related papers (2023-09-04T22:16:35Z) - Learning Multi-Agent Intention-Aware Communication for Optimal Multi-Order Execution in Finance [96.73189436721465]
We first present a multi-agent RL (MARL) method for multi-order execution considering practical constraints.
We propose a learnable multi-round communication protocol for agents to communicate their intended actions with each other.
Experiments on data from two real-world markets demonstrate superior performance and significantly better collaboration effectiveness.
arXiv Detail & Related papers (2023-07-06T16:45:40Z) - Learning Proximal Operators to Discover Multiple Optima [66.98045013486794]
We present an end-to-end method to learn the proximal operator across a family of non-convex problems.
We show that for weakly-convex objectives and under mild conditions, the method converges globally.
arXiv Detail & Related papers (2022-01-28T05:53:28Z) - Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z) - Extrapolation-based Prediction-Correction Methods for Time-varying Convex Optimization [5.768816587293478]
We discuss algorithms for online optimization based on the prediction-correction paradigm.
We propose a novel and tailored prediction strategy, which we call extrapolation-based.
We discuss the empirical performance of the algorithm when applied to signal processing, machine learning, and robotics problems.
arXiv Detail & Related papers (2020-04-24T12:48:13Z)
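The DiscoPOP entry above describes a loss that "adaptively blends logistic and exponential losses." One plausible way to realize such a blend is to gate between the two losses with a sigmoid of the preference margin; this is our own illustrative gating, not necessarily the paper's exact formula, and `tau` is a hypothetical gate scale:

```python
import math

def blended_preference_loss(log_ratio_diff, tau=0.05):
    """Illustrative adaptive blend of the logistic (DPO-style) and
    exponential preference losses.

    log_ratio_diff: margin between the chosen and rejected responses'
    policy/reference log-ratios. A sigmoid gate (sketch only, not the
    paper's formula) shifts weight between the two losses as the
    margin grows.
    """
    logistic = math.log(1 + math.exp(-log_ratio_diff))  # -log sigmoid(x)
    exponential = math.exp(-log_ratio_diff)
    gate = 1 / (1 + math.exp(-log_ratio_diff / tau))    # adaptive mixing weight
    return gate * logistic + (1 - gate) * exponential
```

Both component losses decrease as the margin grows, so any convex gate of them still rewards separating chosen from rejected responses; the blend only changes how sharply low-margin pairs are penalized.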
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.