LOB-Bench: Benchmarking Generative AI for Finance - an Application to Limit Order Book Data
- URL: http://arxiv.org/abs/2502.09172v1
- Date: Thu, 13 Feb 2025 10:56:58 GMT
- Authors: Peer Nagy, Sascha Frey, Kang Li, Bidipta Sarkar, Svitlana Vyetrenko, Stefan Zohren, Ani Calinescu, Jakob Foerster
- Abstract summary: We present a benchmark designed to evaluate the quality and realism of generative message-by-order data for limit order books (LOB).
Our framework measures distributional differences in conditional and unconditional statistics between generated and real LOB data.
The benchmark also includes commonly used LOB statistics such as spread, order book volumes, order imbalance, and message inter-arrival times.
- Abstract: While financial data presents one of the most challenging and interesting sequence modelling tasks due to high noise, heavy tails, and strategic interactions, progress in this area has been hindered by the lack of consensus on quantitative evaluation paradigms. To address this, we present LOB-Bench, a benchmark, implemented in Python, designed to evaluate the quality and realism of generative message-by-order data for limit order books (LOB) in the LOBSTER format. Our framework measures distributional differences in conditional and unconditional statistics between generated and real LOB data, supporting flexible multivariate statistical evaluation. The benchmark also includes commonly used LOB statistics such as spread, order book volumes, order imbalance, and message inter-arrival times, along with scores from a trained discriminator network. Lastly, LOB-Bench contains "market impact metrics", i.e., the cross-correlations and price response functions for specific events in the data. We benchmark generative autoregressive state-space models, a (C)GAN, and a parametric LOB model, and find that the autoregressive GenAI approach beats traditional model classes.
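The core recipe described in the abstract, scoring a generator by the distributional distance between statistics computed on generated and real LOB data, can be illustrated with a small self-contained sketch. The snippet below is a minimal illustration, not the LOB-Bench API: the column names (bid_price_1, ask_price_1), the synthetic data, and the choice of the Wasserstein-1 distance on the spread are assumptions made for the example.

```python
# Minimal sketch of an unconditional distributional check in the spirit of
# LOB-Bench: compare the spread distribution of real vs. generated LOB data.
# Column names, synthetic data, and the metric are illustrative assumptions,
# not the benchmark's actual API.
import numpy as np
import pandas as pd
from scipy.stats import wasserstein_distance


def spread(book: pd.DataFrame) -> np.ndarray:
    """Best ask minus best bid for every book snapshot."""
    return (book["ask_price_1"] - book["bid_price_1"]).to_numpy()


def spread_score(real_book: pd.DataFrame, generated_book: pd.DataFrame) -> float:
    """Wasserstein-1 distance between real and generated spread distributions.

    Lower is better; 0 means the two empirical distributions coincide.
    """
    return wasserstein_distance(spread(real_book), spread(generated_book))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 10_000
    # Toy stand-ins for real and model-generated book snapshots.
    real = pd.DataFrame({"bid_price_1": 100 + rng.normal(0.0, 0.05, n)})
    real["ask_price_1"] = real["bid_price_1"] + rng.exponential(0.02, n)
    fake = pd.DataFrame({"bid_price_1": 100 + rng.normal(0.0, 0.05, n)})
    fake["ask_price_1"] = fake["bid_price_1"] + rng.exponential(0.03, n)
    print(f"spread W1 distance: {spread_score(real, fake):.5f}")
```

The same pattern extends to the other statistics named in the abstract (order book volumes, order imbalance, inter-arrival times) and to conditional variants, by first bucketing the data on a conditioning variable and then comparing distributions within each bucket.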
Related papers
- A Simple Baseline for Predicting Events with Auto-Regressive Tabular Transformers [70.20477771578824]
Existing approaches to event prediction include time-aware positional embeddings, learned row and field encodings, and oversampling methods for addressing class imbalance.
We propose a simple but flexible baseline using standard autoregressive LLM-style transformers with elementary positional embeddings and a causal language modeling objective.
Our baseline outperforms existing approaches across popular datasets and can be employed for various use cases.
arXiv Detail & Related papers (2024-10-14T15:59:16Z)
- EBES: Easy Benchmarking for Event Sequences [17.277513178760348]
Event sequences are common data structures in various real-world domains such as healthcare, finance, and user interaction logs.
Despite advances in temporal data modeling techniques, there are no standardized benchmarks for evaluating their performance on event sequences.
We introduce EBES, a comprehensive benchmarking tool with standardized evaluation scenarios and protocols.
arXiv Detail & Related papers (2024-10-04T13:03:43Z)
- Evaluating Generative Language Models in Information Extraction as Subjective Question Correction [49.729908337372436]
Inspired by the principles in subjective question correction, we propose a new evaluation method, SQC-Score.
Results on three information extraction tasks show that SQC-Score is more preferred by human annotators than the baseline metrics.
arXiv Detail & Related papers (2024-04-04T15:36:53Z)
- Generative AI for End-to-End Limit Order Book Modelling: A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network [7.54290390842336]
We propose an end-to-end autoregressive generative model that generates tokenized limit order book (LOB) messages.
Using NASDAQ equity LOBs, we develop a custom tokenizer for message data, converting groups of successive digits to tokens (a sketch of this style of tokenization appears after this list).
Results show promising performance in approximating the data distribution, as evidenced by low model perplexity.
arXiv Detail & Related papers (2023-08-23T09:37:22Z)
- Bring Your Own Data! Self-Supervised Evaluation for Large Language Models [52.15056231665816]
We propose a framework for self-supervised evaluation of Large Language Models (LLMs).
We demonstrate self-supervised evaluation strategies for measuring closed-book knowledge, toxicity, and long-range context dependence.
We find strong correlations between self-supervised and human-supervised evaluations.
arXiv Detail & Related papers (2023-06-23T17:59:09Z)
- GREAT Score: Global Robustness Evaluation of Adversarial Perturbation using Generative Models [60.48306899271866]
We present a new framework, called GREAT Score, for global robustness evaluation of adversarial perturbation using generative models.
We show high correlation and significantly reduced cost of GREAT Score when compared to the attack-based model ranking on RobustBench.
GREAT Score can be used for remote auditing of privacy-sensitive black-box models.
arXiv Detail & Related papers (2023-04-19T14:58:27Z)
- Neural Stochastic Agent-Based Limit Order Book Simulation: A Hybrid Methodology [6.09170287691728]
Modern financial exchanges use an electronic limit order book (LOB) to store bid and ask orders for a specific financial asset.
We propose a novel hybrid LOB simulation paradigm characterised by: (1) representing the aggregation of market events' logic by a neural background trader that is pre-trained on historical LOB data through a neural point model; and (2) embedding the background trader in a multi-agent simulation with other trading agents.
We show that the stylised facts of real markets are preserved, and we demonstrate order flow impact and financial herding behaviours that are in accordance with empirical observations.
arXiv Detail & Related papers (2023-02-28T20:53:39Z)
- DSLOB: A Synthetic Limit Order Book Dataset for Benchmarking Forecasting Algorithms under Distributional Shift [16.326002979578686]
In electronic trading markets, limit order books (LOBs) provide information about pending buy/sell orders at various price levels for a given security.
Recently, there has been growing interest in using LOB data for downstream machine learning tasks.
arXiv Detail & Related papers (2022-11-17T06:33:27Z)
- Leveraging Instance Features for Label Aggregation in Programmatic Weak Supervision [75.1860418333995]
Programmatic Weak Supervision (PWS) has emerged as a widespread paradigm to synthesize training labels efficiently.
The core component of PWS is the label model, which infers true labels by aggregating the outputs of multiple noisy supervision sources, abstracted as labeling functions (LFs).
Existing statistical label models typically rely only on the outputs of the LFs, ignoring instance features when modeling the underlying generative process.
arXiv Detail & Related papers (2022-10-06T07:28:53Z)
- The Limit Order Book Recreation Model (LOBRM): An Extended Analysis [2.0305676256390934]
The limit order book (LOB) depicts the fine-grained demand and supply relationship for financial assets.
LOBRM was recently proposed to synthesize the LOB from trades and quotes (TAQ) data.
We extend the research on LOBRM and further validate its use in real-world application scenarios.
arXiv Detail & Related papers (2021-07-01T15:25:21Z)
- Feature Quantization Improves GAN Training [126.02828112121874]
Feature Quantization (FQ) for the discriminator embeds both true and fake data samples into a shared discrete space.
Our method can be easily plugged into existing GAN models, with little computational overhead in training.
arXiv Detail & Related papers (2020-04-05T04:06:50Z)
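As flagged in the entry on the token-level autoregressive LOB model above, the idea of converting groups of successive digits into tokens can be made concrete with a short sketch. The field layout, zero-padding widths, and three-digit grouping below are illustrative assumptions, not the paper's actual tokenizer.

```python
# Minimal sketch of digit-group tokenization for LOB messages, in the spirit
# of the token-level autoregressive model above. Field widths, special tokens,
# and the 3-digit grouping are illustrative assumptions.
from typing import List

GROUP = 3  # digits per token


def tokenize_field(value: int, width: int) -> List[str]:
    """Zero-pad an integer field to a fixed width and split it into
    fixed-size digit groups, each of which becomes one token."""
    digits = f"{value:0{width}d}"
    return [digits[i:i + GROUP] for i in range(0, width, GROUP)]


def tokenize_message(price: int, size: int, direction: int) -> List[str]:
    """Turn one (price, size, direction) LOB message into a token sequence."""
    tokens = ["<msg>"]
    tokens += tokenize_field(price, width=9)  # e.g. price in ticks
    tokens += tokenize_field(size, width=6)   # order size in shares
    tokens += ["BUY" if direction > 0 else "SELL", "</msg>"]
    return tokens


if __name__ == "__main__":
    print(tokenize_message(price=1234500, size=300, direction=1))
    # ['<msg>', '001', '234', '500', '000', '300', 'BUY', '</msg>']
```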