Related papers: Market Making without Regret

Market Making without Regret

URL: http://arxiv.org/abs/2411.13993v1
Date: Thu, 21 Nov 2024 10:13:55 GMT
Title: Market Making without Regret
Authors: Nicolò Cesa-Bianchi, Tommaso Cesari, Roberto Colomboni, Luigi Foscari, Vinayak Pathak,
Abstract summary: We consider a sequential decision-making setting where, at every round $t$, a market maker posts a bid price $B_t$ and an ask price $A_t$ to an incoming trader. If the trader's valuation is lower than the bid price, or higher than the ask price, then a trade (sell or buy) occurs. We characterize the maker's regret with respect to the best fixed choice of bid and ask pairs.
Score: 15.588799679661637
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider a sequential decision-making setting where, at every round $t$, a market maker posts a bid price $B_t$ and an ask price $A_t$ to an incoming trader (the taker) with a private valuation for one unit of some asset. If the trader's valuation is lower than the bid price, or higher than the ask price, then a trade (sell or buy) occurs. If a trade happens at round $t$, then letting $M_t$ be the market price (observed only at the end of round $t$), the maker's utility is $M_t - B_t$ if the maker bought the asset, and $A_t - M_t$ if they sold it. We characterize the maker's regret with respect to the best fixed choice of bid and ask pairs under a variety of assumptions (adversarial, i.i.d., and their variants) on the sequence of market prices and valuations. Our upper bound analysis unveils an intriguing connection relating market making to first-price auctions and dynamic pricing. Our main technical contribution is a lower bound for the i.i.d. case with Lipschitz distributions and independence between prices and valuations. The difficulty in the analysis stems from the unique structure of the reward and feedback functions, allowing an algorithm to acquire information by graduating the "cost of exploration" in an arbitrary way.

Related papers

Near-optimal Regret Using Policy Optimization in Online MDPs with Aggregate Bandit Feedback [49.84060509296641]
We study online finite-horizon Markov Decision Processes with adversarially changing loss and aggregate bandit feedback (a.k.a full-bandit)<n>Under this type of feedback, the agent observes only the total loss incurred over the entire trajectory, rather than the individual losses at each intermediate step within the trajectory.<n>We introduce the first Policy Optimization algorithms for this setting.
arXiv Detail & Related papers (2025-02-06T12:03:24Z)
Mind the Gap: A Causal Perspective on Bias Amplification in Prediction & Decision-Making [58.06306331390586]
We introduce the notion of a margin complement, which measures how much a prediction score $S$ changes due to a thresholding operation. We show that under suitable causal assumptions, the influences of $X$ on the prediction score $S$ are equal to the influences of $X$ on the true outcome $Y$.
arXiv Detail & Related papers (2024-05-24T11:22:19Z)
A Contextual Online Learning Theory of Brokerage [8.049531918823758]
We study the role of contextual information in the online learning problem of brokerage between traders. We show that if the bounded density assumption is lifted, then the problem becomes unlearnable.
arXiv Detail & Related papers (2024-05-22T18:38:05Z)
Trading Volume Maximization with Online Learning [3.8059763597999012]
We investigate how the broker should behave to maximize the trading volume. We model the traders' valuations as an i.i.d. process with an unknown distribution. If only their willingness to sell or buy at the proposed price is revealed after each interaction, we provide an algorithm achieving poly-logarithmic regret.
arXiv Detail & Related papers (2024-05-21T17:26:44Z)
An Online Learning Theory of Brokerage [3.8059763597999012]
We investigate brokerage between traders from an online learning perspective. Unlike other bilateral trade problems already studied, we focus on the case where there are no designated buyer and seller roles. We show that the optimal rate degrades to $sqrtT$ in the first case, and the problem becomes unlearnable in the second.
arXiv Detail & Related papers (2023-10-18T17:01:32Z)
Contextual Dynamic Pricing with Strategic Buyers [93.97401997137564]
We study the contextual dynamic pricing problem with strategic buyers. Seller does not observe the buyer's true feature, but a manipulated feature according to buyers' strategic behavior. We propose a strategic dynamic pricing policy that incorporates the buyers' strategic behavior into the online learning to maximize the seller's cumulative revenue.
arXiv Detail & Related papers (2023-07-08T23:06:42Z)
A Reinforcement Learning Approach in Multi-Phase Second-Price Auction Design [158.0041488194202]
We study reserve price optimization in multi-phase second price auctions. From the seller's perspective, we need to efficiently explore the environment in the presence of potentially nontruthful bidders. Third, the seller's per-step revenue is unknown, nonlinear, and cannot even be directly observed from the environment.
arXiv Detail & Related papers (2022-10-19T03:49:05Z)
An $α$-regret analysis of Adversarial Bilateral Trade [10.275531964940425]
We study sequential bilateral trade where sellers and buyers valuations are completely arbitrary. We consider gain from trade which is harder to approximate than social welfare.
arXiv Detail & Related papers (2022-10-13T08:57:30Z)
Quantum computational finance: martingale asset pricing for incomplete markets [69.73491758935712]
We show that a variety of quantum techniques can be applied to the pricing problem in finance. We discuss three different methods that are distinct from previous works.
arXiv Detail & Related papers (2022-09-19T09:22:01Z)
Double Auctions with Two-sided Bandit Feedback [11.334374665364214]
Double Auction enables decentralized transfer of goods between multiple buyers and sellers. We show with confidence bound based bidding, and Average Pricing' there is an efficient price discovery among the participants. Our paper is the first to provide decentralized learning algorithms in a two-sided market where emphboth sides have uncertain preference that need to be learned.
arXiv Detail & Related papers (2022-08-13T01:03:34Z)
Price DOES Matter! Modeling Price and Interest Preferences in Session-based Recommendation [55.0391061198924]
Session-based recommendation aims to predict items that an anonymous user would like to purchase based on her short behavior sequence. It is nontrivial to incorporate price preferences for session-based recommendation. We propose a novel method Co-guided Heterogeneous Hypergraph Network (CoHHN) for session-based recommendation.
arXiv Detail & Related papers (2022-05-09T10:47:15Z)
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms [81.01917016753644]
We investigate the discounting mismatch in actor-critic algorithm implementations from a representation learning perspective. Theoretically, actor-critic algorithms usually have discounting for both actor and critic, i.e., there is a $gammat$ term in the actor update for the transition observed at time $t$ in a trajectory. Practitioners, however, usually ignore the discounting ($gammat$) for the actor while using a discounted critic.
arXiv Detail & Related papers (2020-10-02T15:51:48Z)

This list is automatically generated from the titles and abstracts of the papers in this site.