JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large
scale reinforcement learning for trading
- URL: http://arxiv.org/abs/2308.13289v1
- Date: Fri, 25 Aug 2023 10:26:43 GMT
- Title: JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large
scale reinforcement learning for trading
- Authors: Sascha Frey, Kang Li, Peer Nagy, Silvia Sapora, Chris Lu, Stefan
Zohren, Jakob Foerster and Anisoara Calinescu
- Abstract summary: Financial exchanges use limit order books (LOBs) to process orders and match trades.
For many applications, there is a requirement for processing multiple books, either for the calibration of ABMs or for the training of RL agents.
We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, with a notably reduced per-message processing time.
- Score: 8.884142720013081
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Financial exchanges across the world use limit order books (LOBs) to process
orders and match trades. For research purposes it is important to have large
scale efficient simulators of LOB dynamics. LOB simulators have previously been
implemented in the context of agent-based models (ABMs), reinforcement learning
(RL) environments, and generative models, processing order flows from
historical data sets and hand-crafted agents alike. For many applications,
there is a requirement for processing multiple books, either for the
calibration of ABMs or for the training of RL agents. We showcase the first
GPU-enabled LOB simulator designed to process thousands of books in parallel,
with a notably reduced per-message processing time. The implementation of our
simulator - JAX-LOB - is based on design choices that aim to best exploit the
powers of JAX without compromising on the realism of LOB-related mechanisms. We
integrate JAX-LOB with other JAX packages, to provide an example of how one may
address an optimal execution problem with reinforcement learning, and to share
some preliminary results from end-to-end RL training on GPUs.
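The "LOB-related mechanisms" the abstract refers to center on price-time-priority matching: incoming orders fill against the best-priced resting orders first, breaking ties by arrival time. The sketch below illustrates that core matching step in plain Python; all names are illustrative, not JAX-LOB's actual API. JAX-LOB itself expresses equivalent logic over fixed-size JAX arrays so it can be jit-compiled and vmapped across thousands of books at once.

```python
# Illustrative price-time-priority matching for a market buy order.
# Plain Python for clarity; function and variable names are hypothetical.

def match_market_buy(asks, qty):
    """Match a market buy of size `qty` against resting asks.

    `asks` is a list of (price, size) tuples sorted by price ascending,
    then by arrival time. Returns (trades, remaining_asks).
    """
    trades = []
    remaining = []
    for i, (price, size) in enumerate(asks):
        if qty == 0:
            # Order fully filled: keep the rest of the book untouched.
            remaining.extend(asks[i:])
            break
        fill = min(qty, size)
        trades.append((price, fill))
        qty -= fill
        if size > fill:
            # Partially consumed level stays in the book with reduced size.
            remaining.append((price, size - fill))
    return trades, remaining

asks = [(100.0, 5), (100.0, 3), (101.0, 10)]  # best price, earliest first
trades, book = match_market_buy(asks, 7)
# trades -> [(100.0, 5), (100.0, 2)]; book -> [(100.0, 1), (101.0, 10)]
```

Processing thousands of books in parallel then amounts to applying one such pure step function to a batch of book states, which is exactly the access pattern JAX's batching transforms are built for.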
Related papers
- JORA: JAX Tensor-Parallel LoRA Library for Retrieval Augmented Fine-Tuning [16.86356520836045]
We introduce a novel framework for PEFT-compatible fine-tuning of Llama-2 models, leveraging distributed training.
Our framework uniquely utilizes JAX's just-in-time (JIT) compilation and tensor-sharding for efficient resource management.
Our experiments show more than a 12x improvement in runtime compared to the Hugging Face/DeepSpeed implementation with four GPUs, while consuming less than half the VRAM per GPU.
arXiv Detail & Related papers (2024-03-17T23:02:04Z)
- JaxMARL: Multi-Agent RL Environments in JAX [107.7560737385902]
We present JaxMARL, the first open-source codebase that combines ease of use with GPU-enabled efficiency.
Our experiments show that per-run our JAX-based training pipeline is up to 12500x faster than existing approaches.
We also introduce and benchmark SMAX, a vectorised, simplified version of the popular StarCraft Multi-Agent Challenge.
arXiv Detail & Related papers (2023-11-16T18:58:43Z)
- Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning [68.16998247593209]
The offline reinforcement learning (RL) paradigm provides a recipe for converting static behavior datasets into policies that can perform better than the policy that collected the data.
In this paper, we propose an adaptive scheme for action quantization.
We show that several state-of-the-art offline RL methods such as IQL, CQL, and BRAC improve in performance on benchmarks when combined with our proposed discretization scheme.
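To make the idea of adaptive action quantization concrete, the sketch below contrasts uniform discretization with a simple data-adaptive alternative that places bin centers at empirical quantiles of the dataset's actions. This is only an illustration of the general principle, with hypothetical helper names; the paper's actual adaptive scheme is more sophisticated.

```python
# Uniform vs. quantile-based (data-adaptive) action discretization.
# Hypothetical sketch; not the paper's method.

def uniform_bins(lo, hi, k):
    """k bin centers evenly spaced over [lo, hi]."""
    step = (hi - lo) / (k - 1)
    return [lo + i * step for i in range(k)]

def quantile_bins(actions, k):
    """k bin centers at empirical quantiles of the dataset's actions,
    so resolution concentrates where the behavior policy actually acted."""
    s = sorted(actions)
    return [s[int(q * (len(s) - 1) / (k - 1))] for q in range(k)]

def quantize(a, centers):
    """Snap a continuous action to its nearest bin center."""
    return min(centers, key=lambda c: abs(c - a))

# Most actions in this dataset cluster near 0.1.
data = [0.08, 0.09, 0.10, 0.11, 0.12, 0.9]
u = uniform_bins(0.0, 1.0, 4)   # [0.0, 0.333..., 0.666..., 1.0]
q = quantile_bins(data, 4)      # [0.08, 0.09, 0.11, 0.9]
quantize(0.11, u)  # -> 0.0   (coarse where the data is dense)
quantize(0.11, q)  # -> 0.11  (fine where the data is dense)
```

The point is that with the same bin budget, bins adapted to the behavior data lose far less precision on the actions the offline dataset actually contains.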
arXiv Detail & Related papers (2023-10-18T06:07:10Z)
- Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research [76.93956925360638]
Waymax is a new data-driven simulator for autonomous driving in multi-agent scenes.
It runs entirely on hardware accelerators such as TPUs/GPUs and supports in-graph simulation for training.
We benchmark a suite of popular imitation and reinforcement learning algorithms with ablation studies on different design decisions.
arXiv Detail & Related papers (2023-10-12T20:49:15Z)
- In Situ Framework for Coupling Simulation and Machine Learning with Application to CFD [51.04126395480625]
Recent years have seen many successful applications of machine learning (ML) to facilitate fluid dynamic computations.
As simulations grow, generating new training datasets for traditional offline learning creates I/O and storage bottlenecks.
This work offers a solution by simplifying this coupling and enabling in situ training and inference on heterogeneous clusters.
arXiv Detail & Related papers (2023-06-22T14:07:54Z)
- SWARM Parallelism: Training Large Models Can Be Surprisingly Communication-Efficient [69.61083127540776]
Deep learning applications benefit from using large models with billions of parameters.
Training these models is notoriously expensive due to the need for specialized HPC clusters.
We consider alternative setups for training large models: using cheap "preemptible" instances or pooling existing resources from multiple regions.
arXiv Detail & Related papers (2023-01-27T18:55:19Z)
- SOLIS -- The MLOps journey from data acquisition to actionable insights [62.997667081978825]
Typical approaches, however, do not supply the procedures and pipelines needed for the actual deployment of machine learning capabilities in real production-grade systems.
In this paper we present a unified deployment pipeline and freedom-to-operate approach that supports all requirements while using basic cross-platform tensor frameworks and script language engines.
arXiv Detail & Related papers (2021-12-22T14:45:37Z)
- Brax -- A Differentiable Physics Engine for Large Scale Rigid Body Simulation [33.36244621210259]
We present Brax, an open source library for rigid body simulation written in JAX.
We present results on a suite of tasks inspired by the existing reinforcement learning literature, but remade in our engine.
arXiv Detail & Related papers (2021-06-24T19:09:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.