Related papers: Optimal Market Making by Reinforcement Learning

Optimal Market Making by Reinforcement Learning

URL: http://arxiv.org/abs/2104.04036v1
Date: Thu, 8 Apr 2021 20:13:21 GMT
Title: Optimal Market Making by Reinforcement Learning
Authors: Matias Selser, Javier Kreiner, Manuel Maurette
Abstract summary: We apply Reinforcement Learning algorithms to the classic quantitative finance Market Making problem. We find that the Deep Q-Learning algorithm manages to recover the optimal agent.
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: We apply Reinforcement Learning algorithms to solve the classic quantitative finance Market Making problem, in which an agent provides liquidity to the market by placing buy and sell orders while maximizing a utility function. The optimal agent has to find a delicate balance between the price risk of her inventory and the profits obtained by capturing the bid-ask spread. We design an environment with a reward function that determines an order relation between policies equivalent to the original utility function. When comparing our agents with the optimal solution and a benchmark symmetric agent, we find that the Deep Q-Learning algorithm manages to recover the optimal agent.

Related papers

Collab: Controlled Decoding using Mixture of Agents for LLM Alignment [90.6117569025754]
Reinforcement learning from human feedback has emerged as an effective technique to align Large Language models. Controlled Decoding provides a mechanism for aligning a model at inference time without retraining. We propose a mixture of agent-based decoding strategies leveraging the existing off-the-shelf aligned LLM policies.
arXiv Detail & Related papers (2025-03-27T17:34:25Z)
Joint Pricing and Resource Allocation: An Optimal Online-Learning Approach [20.70943884841438]
We study an online learning horizon where we make joint pricing and inventory decisions to maximize the overall net profit. We develop an efficient algorithm that utilizes a "Confidence Bound" strategy over multiple OCO.
arXiv Detail & Related papers (2025-01-29T23:23:54Z)
Sequential Resource Trading Using Comparison-Based Gradient Estimation [21.23354615468778]
We explore sequential trading for resource allocation in a setting where two greedily rational agents sequentially trade resources from a finite set of categories. We present an algorithm for the offering agent to estimate the responding agent's gradient (preferences) and make offers based on previous acceptance or rejection responses. We show that, after a finite number of consecutively rejected offers, the responding agent is at a near-optimal state, or the agents' gradients are closely aligned.
arXiv Detail & Related papers (2024-08-20T20:42:41Z)
Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning [30.605881670761853]
We propose a Reinforcement Learning approach to achieve fairness in finite-horizon episodic MDPs. We show that such an approach achieves sub-linear regret in terms of the number of episodes.
arXiv Detail & Related papers (2023-06-01T03:43:53Z)
Approaching Collateral Optimization for NISQ and Quantum-Inspired Computing [0.0]
Collateral optimization refers to the systematic allocation of financial assets to satisfy obligations or secure transactions. One of the common objectives is to minimise the cost of collateral required to mitigate the risk associated with a particular transaction or a portfolio of transactions.
arXiv Detail & Related papers (2023-05-25T18:01:04Z)
Local Optimization Achieves Global Optimality in Multi-Agent Reinforcement Learning [139.53668999720605]
We present a multi-agent PPO algorithm in which the local policy of each agent is updated similarly to vanilla PPO. We prove that with standard regularity conditions on the Markov game and problem-dependent quantities, our algorithm converges to the globally optimal policy at a sublinear rate.
arXiv Detail & Related papers (2023-05-08T16:20:03Z)
Learning to Incentivize Information Acquisition: Proper Scoring Rules Meet Principal-Agent Model [64.94131130042275]
We study the incentivized information acquisition problem, where a principal hires an agent to gather information on her behalf. We design a provably sample efficient algorithm that tailors the UCB algorithm to our model. Our algorithm features a delicate estimation procedure for the optimal profit of the principal, and a conservative correction scheme that ensures the desired agent's actions are incentivized.
arXiv Detail & Related papers (2023-03-15T13:40:16Z)
Diversifying Investments and Maximizing Sharpe Ratio: a novel QUBO formulation [0.0]
We propose a new QUBO formulation for the task described and provide the mathematical details and required assumptions. We derive results via the available QUBO solvers, as well as discussing the behaviour of Hybrid approaches to tackle large scale problems in the term.
arXiv Detail & Related papers (2023-02-23T19:15:44Z)
Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations [16.48389671789281]
We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market. By playing against each other, our deep-reinforcement-learning-driven agents learn emergent behaviors. We show convergence rates for our multi-agent policy gradient algorithm under a transitivity assumption.
arXiv Detail & Related papers (2022-10-13T17:06:08Z)
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets [151.03738099494765]
We study a Markov matching market involving a planner and a set of strategic agents on the two sides of the market. We propose a reinforcement learning framework that integrates optimistic value iteration with maximum weight matching. We prove that the algorithm achieves sublinear regret.
arXiv Detail & Related papers (2022-03-07T19:51:25Z)
Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes. Agent sequentially selects actions and, from the resulting system trajectory, aims at the best as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z)
Universal Trading for Order Execution with Oracle Policy Distillation [99.57416828489568]
We propose a novel universal trading policy optimization framework to bridge the gap between the noisy yet imperfect market states and the optimal action sequences for order execution. We show that our framework can better guide the learning of the common policy towards practically optimal execution by an oracle teacher with perfect information.
arXiv Detail & Related papers (2021-01-28T05:52:18Z)
Learning Strategies in Decentralized Matching Markets under Uncertain Preferences [91.3755431537592]
We study the problem of decision-making in the setting of a scarcity of shared resources when the preferences of agents are unknown a priori. Our approach is based on the representation of preferences in a reproducing kernel Hilbert space. We derive optimal strategies that maximize agents' expected payoffs.
arXiv Detail & Related papers (2020-10-29T03:08:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.