A Parametric Contextual Online Learning Theory of Brokerage
- URL: http://arxiv.org/abs/2407.01566v2
- Date: Thu, 14 Aug 2025 17:53:29 GMT
- Title: A Parametric Contextual Online Learning Theory of Brokerage
- Authors: François Bachoc, Tommaso Cesari, Roberto Colomboni,
- Abstract summary: We study the role of contextual information in the online learning problem of brokerage between traders.<n>In this sequential problem, at each time step, two traders arrive with secret valuations about an asset they wish to trade.<n>The learner (a broker) suggests a trading (or brokerage) price based on contextual data about the asset and the market conditions.<n>Then, the traders reveal their willingness to buy or sell based on whether their valuations are higher or lower than the brokerage price.
- Score: 8.049531918823758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the role of contextual information in the online learning problem of brokerage between traders. In this sequential problem, at each time step, two traders arrive with secret valuations about an asset they wish to trade. The learner (a broker) suggests a trading (or brokerage) price based on contextual data about the asset and the market conditions. Then, the traders reveal their willingness to buy or sell based on whether their valuations are higher or lower than the brokerage price. A trade occurs if one of the two traders decides to buy and the other to sell, i.e., if the broker's proposed price falls between the smallest and the largest of their two valuations. We design algorithms for this problem and prove optimal theoretical regret guarantees under various standard assumptions.
Related papers
- Nonparametric Contextual Online Bilateral Trade [15.586783656868706]
We study the problem of contextual online bilateral trade.<n>The goal of the learner is to post prices to facilitate trades between the two parties.<n>We design an algorithm that leverages contextual information through a hierarchical tree construction.
arXiv Detail & Related papers (2026-02-13T13:03:30Z) - When Agents Trade: Live Multi-Market Trading Benchmark for LLM Agents [74.55061622246824]
Agent Market Arena (AMA) is the first lifelong, real-time benchmark for evaluating Large Language Model (LLM)-based trading agents.<n>AMA integrates verified trading data, expert-checked news, and diverse agent architectures within a unified trading framework.<n>It evaluates agents across GPT-4o, GPT-4.1, Claude-3.5-haiku, Claude-sonnet-4, and Gemini-2.0-flash.
arXiv Detail & Related papers (2025-10-13T17:54:09Z) - Log-Sum-Exponential Estimator for Off-Policy Evaluation and Learning [50.93804891554481]
We introduce a novel estimator based on the log-sum-exponential (LSE) operator, which outperforms traditional inverse propensity score estimators.<n>Our LSE estimator demonstrates variance reduction and robustness under heavy-tailed conditions.<n>In the off-policy learning scenario, we establish bounds on the regret -- the performance gap between our LSE estimator and the optimal policy.
arXiv Detail & Related papers (2025-06-07T17:37:10Z) - A Tight Regret Analysis of Non-Parametric Repeated Contextual Brokerage [8.049531918823758]
We study a contextual version of the repeated brokerage problem.
In each interaction, two traders with private valuations for an item seek to buy or sell based on the learner's-a broker-proposed price, which is informed by some contextual information.
The broker's goal is to maximize the traders' net utility-also known as the gain from trade-by minimizing regret compared to an oracle with perfect knowledge of traders' valuation distributions.
arXiv Detail & Related papers (2025-03-03T08:42:55Z) - Agent Trading Arena: A Study on Numerical Understanding in LLM-Based Agents [69.58565132975504]
Large language models (LLMs) have demonstrated remarkable capabilities in natural language tasks.<n>We present the Agent Trading Arena, a virtual zero-sum stock market in which LLM-based agents engage in competitive multi-agent trading.
arXiv Detail & Related papers (2025-02-25T08:41:01Z) - Market Making without Regret [15.588799679661637]
We consider a sequential decision-making setting where, at every round $t$, a market maker posts a bid price $B_t$ and an ask price $A_t$ to an incoming trader.
If the trader's valuation is lower than the bid price, or higher than the ask price, then a trade (sell or buy) occurs.
We characterize the maker's regret with respect to the best fixed choice of bid and ask pairs.
arXiv Detail & Related papers (2024-11-21T10:13:55Z) - When AI Meets Finance (StockAgent): Large Language Model-based Stock Trading in Simulated Real-world Environments [55.19252983108372]
We have developed a multi-agent AI system called StockAgent, driven by LLMs.
The StockAgent allows users to evaluate the impact of different external factors on investor trading.
It avoids the test set leakage issue present in existing trading simulation systems based on AI Agents.
arXiv Detail & Related papers (2024-07-15T06:49:30Z) - Fair Online Bilateral Trade [20.243000364933472]
We present a complete characterization of the regret for fair gain from trade when, after each interaction, the platform only learns whether each trader accepted the current price.
We conclude by providing tight regret bounds when, after each interaction, the platform is allowed to observe the true traders' valuations.
arXiv Detail & Related papers (2024-05-22T18:49:11Z) - Trading Volume Maximization with Online Learning [3.8059763597999012]
We investigate how the broker should behave to maximize the trading volume.
We model the traders' valuations as an i.i.d. process with an unknown distribution.
If only their willingness to sell or buy at the proposed price is revealed after each interaction, we provide an algorithm achieving poly-logarithmic regret.
arXiv Detail & Related papers (2024-05-21T17:26:44Z) - Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback [58.66941279460248]
Learning from human feedback plays an important role in aligning generative models, such as large language models (LLM)
We study a model within this problem domain--contextual dueling bandits with adversarial feedback, where the true preference label can be flipped by an adversary.
We propose an algorithm namely robust contextual dueling bandit (algo), which is based on uncertainty-weighted maximum likelihood estimation.
arXiv Detail & Related papers (2024-04-16T17:59:55Z) - No-Regret Learning in Bilateral Trade via Global Budget Balance [29.514323697659613]
We provide the first no-regret algorithms for adversarial bilateral trade under various feedback models.
We show that in the full-feedback model, the learner can guarantee $tilde O(sqrtT)$ regret against the best fixed prices in hindsight.
We also provide a learning algorithm guaranteeing a $tilde O(T3/4)$ regret upper bound with one-bit feedback.
arXiv Detail & Related papers (2023-10-18T22:34:32Z) - An Online Learning Theory of Brokerage [3.8059763597999012]
We investigate brokerage between traders from an online learning perspective.
Unlike other bilateral trade problems already studied, we focus on the case where there are no designated buyer and seller roles.
We show that the optimal rate degrades to $sqrtT$ in the first case, and the problem becomes unlearnable in the second.
arXiv Detail & Related papers (2023-10-18T17:01:32Z) - Online Learning in Contextual Second-Price Pay-Per-Click Auctions [47.06746975822902]
We study online learning in contextual pay-per-click auctions where at each of the $T$ rounds, the learner receives some context along with a set of ads.
The learner's goal is to minimize her regret, defined as the gap between her total revenue and that of an oracle strategy.
arXiv Detail & Related papers (2023-10-08T07:04:22Z) - Uniswap Liquidity Provision: An Online Learning Approach [49.145538162253594]
Decentralized Exchanges (DEXs) are new types of marketplaces leveraging technology.
One such DEX, Uniswap v3, allows liquidity providers to allocate funds more efficiently by specifying an active price interval for their funds.
This introduces the problem of finding an optimal strategy for choosing price intervals.
We formalize this problem as an online learning problem with non-stochastic rewards.
arXiv Detail & Related papers (2023-02-01T17:21:40Z) - A Reinforcement Learning Approach in Multi-Phase Second-Price Auction
Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions.
From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders.
Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z) - Deep Q-Learning Market Makers in a Multi-Agent Simulated Stock Market [58.720142291102135]
This paper focuses precisely on the study of these markets makers strategies from an agent-based perspective.
We propose the application of Reinforcement Learning (RL) for the creation of intelligent market markers in simulated stock markets.
arXiv Detail & Related papers (2021-12-08T14:55:21Z) - Taking Over the Stock Market: Adversarial Perturbations Against
Algorithmic Traders [47.32228513808444]
We present a realistic scenario in which an attacker influences algorithmic trading systems by using adversarial learning techniques.
We show that when added to the input stream, our perturbation can fool the trading algorithms at future unseen data points.
arXiv Detail & Related papers (2020-10-19T06:28:05Z) - Stochastic Bandits with Linear Constraints [69.757694218456]
We study a constrained contextual linear bandit setting, where the goal of the agent is to produce a sequence of policies.
We propose an upper-confidence bound algorithm for this problem, called optimistic pessimistic linear bandit (OPLB)
arXiv Detail & Related papers (2020-06-17T22:32:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.