Learning Multi-Agent Intention-Aware Communication for Optimal
Multi-Order Execution in Finance
- URL: http://arxiv.org/abs/2307.03119v1
- Date: Thu, 6 Jul 2023 16:45:40 GMT
- Title: Learning Multi-Agent Intention-Aware Communication for Optimal
Multi-Order Execution in Finance
- Authors: Yuchen Fang, Zhenggang Tang, Kan Ren, Weiqing Liu, Li Zhao, Jiang
Bian, Dongsheng Li, Weinan Zhang, Yong Yu, Tie-Yan Liu
- Abstract summary: We first present a multi-agent RL (MARL) method for multi-order execution considering practical constraints.
We propose a learnable multi-round communication protocol in which the agents communicate their intended actions to each other.
Experiments on data from two real-world markets show superior performance and significantly better collaboration effectiveness.
- Score: 96.73189436721465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Order execution is a fundamental task in quantitative finance, aiming at
finishing acquisition or liquidation for a number of trading orders of the
specific assets. Recent advances in model-free reinforcement learning (RL)
provide a data-driven solution to the order execution problem. However, the
existing works always optimize execution for an individual order, overlooking
the practice that multiple orders are specified to execute simultaneously,
resulting in suboptimality and bias. In this paper, we first present a
multi-agent RL (MARL) method for multi-order execution considering practical
constraints. Specifically, we treat every agent as an individual operator that
trades one specific order while communicating with the other agents and
collaborating to maximize the overall profits. Nevertheless, the existing
MARL algorithms often incorporate communication among agents by exchanging only
the information of their partial observations, which is inefficient in
complicated financial markets. To improve collaboration, we then propose a
learnable multi-round communication protocol in which the agents communicate
their intended actions to each other and refine them accordingly. It is optimized
through a novel action value attribution method which is provably consistent
with the original learning objective yet more efficient. Experiments on
data from two real-world markets show that our method achieves superior
performance with significantly better collaboration effectiveness.
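The multi-round exchange of intended actions described in the abstract can be illustrated with a toy sketch. This is not the paper's architecture or training procedure: the function name `refine_intentions`, the linear-plus-sigmoid policy, the mean-of-others message, and the weights `w_obs` and `w_msg` are all assumptions made for illustration. Each agent first forms an intended action from its own observation, then repeatedly receives the other agents' intentions and refines its own.

```python
import numpy as np

def refine_intentions(obs, w_obs, w_msg, rounds=3):
    """Toy multi-round intention exchange (hypothetical, not the paper's model).

    obs:   (n_agents, obs_dim) private observations, one row per agent/order.
    w_obs: (obs_dim,) weights mapping an observation to an action logit.
    w_msg: scalar weight on the message received from the other agents.
    Returns a list of arrays of shape (n_agents,), one per round, with values
    in (0, 1) read here as "fraction of the order to trade now".
    """
    n_agents = obs.shape[0]
    if n_agents < 2:
        raise ValueError("communication needs at least two agents")

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    base = obs @ w_obs               # per-agent logit from its own observation
    intentions = sigmoid(base)       # round 0: no messages received yet
    history = [intentions]
    for _ in range(rounds):
        # Message to each agent: mean intended action of the *other* agents.
        others_mean = (intentions.sum() - intentions) / (n_agents - 1)
        # Refinement: combine own observation with the others' intentions.
        intentions = sigmoid(base + w_msg * others_mean)
        history.append(intentions)
    return history

# Example with 4 agents and made-up observations and weights.
rng = np.random.default_rng(0)
obs = rng.normal(size=(4, 3))
history = refine_intentions(obs, w_obs=np.array([0.5, -0.2, 0.1]), w_msg=-1.0)
for round_idx, actions in enumerate(history):
    print(round_idx, np.round(actions, 3))
```

In this sketch a negative `w_msg` makes an agent trade less when the others intend to trade more, a crude stand-in for competing over a shared resource. In the paper the protocol is learned rather than fixed by hand.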
Related papers
- Learning to Use Tools via Cooperative and Interactive Agents [58.77710337157665]
Tool learning empowers large language models (LLMs) as agents to use external tools and extend their utility.
We propose ConAgents, a Cooperative and interactive Agents framework, which coordinates three specialized agents for tool selection, tool execution, and action calibration separately.
Our experiments on three datasets show that the LLMs, when equipped with ConAgents, outperform baselines with substantial improvement.
arXiv Detail & Related papers (2024-03-05T15:08:16Z)
- Optimal Execution Using Reinforcement Learning [6.905391624417593]
This work is about optimal order execution, where a large order is split into several small orders to maximize the implementation shortfall.
Based on the diversity of cryptocurrency exchanges, we attempt to extract cross-exchange signals by aligning data from multiple exchanges for the first time.
arXiv Detail & Related papers (2023-06-19T07:09:59Z)
- Many learning agents interacting with an agent-based market model [0.0]
We consider the dynamics of learning optimal execution trading agents interacting with a reactive Agent-Based Model.
The model represents a market ecology with three trophic levels: optimal execution learning agents, minimally intelligent liquidity takers, and fast electronic liquidity providers.
We examine whether the inclusion of optimal execution agents that can learn is able to produce dynamics with the same complexity as empirical data.
arXiv Detail & Related papers (2023-03-13T18:15:52Z)
- Shapley Counterfactual Credits for Multi-Agent Reinforcement Learning [34.856522993714535]
We propose Shapley Counterfactual Credit Assignment, a novel method for explicit credit assignment which accounts for the coalition of agents.
Our method outperforms existing cooperative MARL algorithms significantly and achieves the state-of-the-art, with especially large margins on tasks with more severe difficulties.
arXiv Detail & Related papers (2021-06-01T07:38:34Z)
- Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution.
We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z)
- Multi-agent Policy Optimization with Approximatively Synchronous Advantage Estimation [55.96893934962757]
In multi-agent systems, policies of different agents need to be evaluated jointly.
In current methods, value functions or advantage functions use counter-factual joint actions which are evaluated asynchronously.
In this work, we propose the approximatively synchronous advantage estimation.
arXiv Detail & Related papers (2020-12-07T07:29:19Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
- Dynamic Multi-Robot Task Allocation under Uncertainty and Temporal Constraints [52.58352707495122]
We present a multi-robot allocation algorithm that decouples the key computational challenges of sequential decision-making under uncertainty and multi-agent coordination.
We validate our results over a wide range of simulations on two distinct domains: multi-arm conveyor belt pick-and-place and multi-drone delivery dispatch in a city.
arXiv Detail & Related papers (2020-05-27T01:10:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.