Optimal Online Bookmaking for Binary Games
- URL: http://arxiv.org/abs/2501.06923v1
- Date: Sun, 12 Jan 2025 20:23:46 GMT
- Title: Optimal Online Bookmaking for Binary Games
- Authors: Alankrita Bhatt, Or Ordentlich, Oron Sabag
- Abstract summary: We study the problem of bookmaking with the goal of maximizing the return in the worst case. We formalize this problem as the \emph{Optimal Online Bookmaking} game, and provide the exact solution for the binary case.
- Score: 19.70394327203169
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In online betting, the bookmaker can update the payoffs it offers on a particular event many times before the event takes place, and the updated payoffs may depend on the bets accumulated thus far. We study the problem of bookmaking with the goal of maximizing the return in the worst case, with respect to the gamblers' behavior and the event's outcome. We formalize this problem as the \emph{Optimal Online Bookmaking game}, and provide the exact solution for the binary case. To this end, we develop the optimal bookmaking strategy, which relies on a new technique called bi-balancing trees, that assures that the house loss is the same for all \emph{decisive} betting sequences, where the gambler bets all its money on a single outcome in each round.
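The worst-case objective above can be made concrete with a small sketch. This is an illustrative single-round calculation, not the paper's bi-balancing-trees strategy: the house collects stakes at decimal odds and its return is evaluated under the adversarial outcome.

```python
# Illustrative single-round binary bookmaking (not the paper's algorithm):
# the house collects stakes b = (b1, b2) at decimal odds o = (o1, o2) and
# pays out o[k] * b[k] if outcome k occurs.
def house_return(bets, odds, outcome):
    """Net house profit: total stakes minus the payout on the winning outcome."""
    total = sum(bets)
    return total - odds[outcome] * bets[outcome]

def worst_case_return(bets, odds):
    """House return under the adversarial (worst) outcome."""
    return min(house_return(bets, odds, k) for k in range(len(bets)))

# A gambler betting everything on one outcome (a "decisive" bet) exposes the
# house to a loss at fair odds:
assert house_return((1.0, 0.0), (2.0, 2.0), 0) == -1.0

# With an overround (1/o1 + 1/o2 > 1) and stakes split in proportion to the
# implied probabilities, the house profits for every outcome:
odds = (1.8, 2.2)
bets = (1 / 1.8, 1 / 2.2)
assert worst_case_return(bets, odds) > 0
```

The interesting case studied in the paper is the multi-round version, where the gambler controls the stakes and the house can only adjust the odds between rounds.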
Related papers
- Optimal Online Bookmaking for Any Number of Outcomes [6.0158981171030685]
We study the Online Bookmaking problem, where a bookmaker dynamically updates betting odds on the possible outcomes of an event. We show that for any event and any number of betting rounds, the bookmaker's optimal loss is the largest root of a simple polynomial. Our solution shows that bookmakers can be as fair as desired while avoiding financial risk, and the explicit characterization reveals an intriguing relation between the bookmaker's regret and Hermite polynomials.
arXiv Detail & Related papers (2025-06-19T12:11:58Z) - No-Regret Learning Under Adversarial Resource Constraints: A Spending Plan Is All You Need! [56.80767500991973]
We focus on two canonical settings: $(i)$ online resource allocation where rewards and costs are observed before action selection, and $(ii)$ online learning with resource constraints where they are observed after action selection, under full feedback or bandit feedback. It is well known that achieving sublinear regret in these settings is impossible when reward and cost distributions may change arbitrarily over time. We design general (primal-)dual methods that achieve sublinear regret with respect to baselines that follow the spending plan. Crucially, the performance of our algorithms improves when the spending plan ensures a well-balanced distribution of the budget.
arXiv Detail & Related papers (2025-06-16T08:42:31Z) - Machine learning for sports betting: should model selection be based on accuracy or calibration? [0.0]
We train models on NBA data over several seasons and run betting experiments on a single season.
We show that using calibration, rather than accuracy, as the basis for model selection leads to greater returns.
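The distinction between accuracy and calibration can be sketched with toy numbers. This is an illustrative comparison using the Brier score as a calibration-oriented metric; the data and model names are hypothetical, not from the paper's NBA experiments.

```python
# Two toy probability models for binary outcomes. Both rank every game on the
# correct side of 0.5, so their accuracy is identical, but one reports
# probabilities far from the true frequencies.
def accuracy(probs, outcomes):
    """Fraction of outcomes where the 0.5-thresholded prediction is correct."""
    return sum((p >= 0.5) == bool(y) for p, y in zip(probs, outcomes)) / len(outcomes)

def brier(probs, outcomes):
    """Brier score (mean squared error of probabilities); lower is better."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)

outcomes      = [1, 1, 0, 0]
overconfident = [0.99, 0.99, 0.49, 0.49]  # right side of 0.5, poorly calibrated
calibrated    = [0.70, 0.70, 0.30, 0.30]

# Accuracy cannot distinguish the two models...
assert accuracy(overconfident, outcomes) == accuracy(calibrated, outcomes)
# ...but the Brier score favors the calibrated one.
assert brier(calibrated, outcomes) < brier(overconfident, outcomes)
```

For betting, calibration matters because stake sizing depends on the probabilities themselves, not just on which side of 0.5 they fall.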
arXiv Detail & Related papers (2023-03-10T16:22:38Z) - Autobidders with Budget and ROI Constraints: Efficiency, Regret, and Pacing Dynamics [53.62091043347035]
We study a game between autobidding algorithms that compete in an online advertising platform. We propose a gradient-based learning algorithm that is guaranteed to satisfy all constraints and achieves vanishing individual regret.
arXiv Detail & Related papers (2023-01-30T21:59:30Z) - Online Optimization with Untrusted Predictions [7.895232155155041]
We study the problem of online optimization, where a decision maker must choose points in a general metric space to minimize the sum of per-round, non-convex hitting costs and the costs of switching between rounds.
We propose a novel algorithm, Adaptive Online Switching (AOS), and prove that, for any desired $\delta > 0$, it is $(1+\delta)$-competitive if predictions are perfect, and remains $2^{\tilde{\mathcal{O}}(1/(\alpha\delta))}$-competitive even when predictions are inaccurate.
arXiv Detail & Related papers (2022-02-07T21:08:02Z) - No-Regret Learning in Time-Varying Zero-Sum Games [99.86860277006318]
Learning from repeated play in a fixed zero-sum game is a classic problem in game theory and online learning.
We develop a single parameter-free algorithm that simultaneously enjoys favorable guarantees under three performance measures.
Our algorithm is based on a two-layer structure with a meta-algorithm learning over a group of black-box base-learners satisfying a certain property.
arXiv Detail & Related papers (2022-01-30T06:10:04Z) - Bandit Linear Optimization for Sequential Decision Making and Extensive-Form Games [102.23975166536326]
Tree-form sequential decision making (TFSDM) extends classical one-shot decision making by modeling tree-form interactions between an agent and a potentially adversarial environment.
It captures the online decision-making problems that each player faces in an extensive-form game, as well as Markov decision processes and partially-observable Markov decision processes where the agent conditions on observed history.
In this paper, we give the first algorithm for the bandit linear optimization problem for TFSDM that offers both (i) linear-time iterations and (ii) $O(\sqrt{T})$ cumulative regret in expectation.
arXiv Detail & Related papers (2021-03-08T05:00:13Z) - Online Markov Decision Processes with Aggregate Bandit Feedback [74.85532145498742]
We study a novel variant of online finite-horizon Markov Decision Processes with adversarially changing loss functions and initially unknown dynamics.
In each episode, the learner suffers the loss accumulated along the trajectory realized by the policy chosen for the episode, and observes aggregate bandit feedback.
Our main result is a computationally efficient algorithm with $O(\sqrt{K})$ regret for this setting, where $K$ is the number of episodes.
arXiv Detail & Related papers (2021-01-31T16:49:07Z) - Online Prediction With History-Dependent Experts: The General Case [1.52292571922932]
We study the problem of prediction of binary sequences with expert advice in the online setting, which is a classic example of online machine learning.
We interpret the binary sequence as the price history of a stock, and view the predictor as the investor, which converts the problem into a stock prediction problem.
arXiv Detail & Related papers (2020-07-31T19:40:20Z) - Accumulator Bet Selection Through Stochastic Diffusion Search [0.0]
An accumulator is a bet that combines multiple bets into a wager that can generate a total payout given by the multiplication of the individual odds of its parts.
The complexity of selecting a set of matches to place an accumulator bet has dramatically increased with the easier access to online and offline bookmakers.
We propose a binary optimization model for the problem of selecting the most promising combinations of matches, in terms of their potential payout and probability of a win.
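The multiplicative payout rule described above is simple to state in code. This is a minimal sketch of an accumulator payout, with hypothetical stake and odds values; it does not implement the paper's binary optimization or stochastic diffusion search.

```python
from math import prod

# An accumulator pays the stake times the product of the decimal odds of all
# its legs, and pays nothing unless every leg wins.
def accumulator_payout(stake, legs):
    """legs: list of (decimal_odds, won) pairs; returns the total payout."""
    if all(won for _, won in legs):
        return stake * prod(odds for odds, _ in legs)
    return 0.0

# A 10-unit stake on two winning legs at odds 2.0 and 1.5 pays 10 * 2.0 * 1.5:
assert accumulator_payout(10.0, [(2.0, True), (1.5, True)]) == 30.0
# A single losing leg voids the whole bet:
assert accumulator_payout(10.0, [(2.0, True), (1.5, False)]) == 0.0
```

The all-or-nothing structure is what makes leg selection a combinatorial trade-off between the multiplied payout and the multiplied risk.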
arXiv Detail & Related papers (2020-04-18T12:42:23Z) - Optimal strategies in the Fighting Fantasy gaming system: influencing stochastic dynamics by gambling with limited resource [0.0]
Fighting Fantasy is a popular recreational fantasy gaming system worldwide.
Each round, a limited resource ('luck') may be spent on a gamble to amplify the benefit from a win or mitigate the deficit from a loss.
We use a backward induction approach to solve the Bellman equation for the system and identify the optimal strategy for any given state during the game.
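The backward-induction idea can be sketched on a much simpler model. This toy recursion is not the Fighting Fantasy dynamics: it assumes each of T rounds is won with a base probability, boosted by a fixed amount when a unit of a limited luck resource is spent, and solves the Bellman equation for the expected number of wins.

```python
from functools import lru_cache

# Toy Bellman recursion (assumed dynamics, not the paper's game): each round
# is won with probability P_BASE, or P_BASE + BOOST if one luck point is spent.
T, P_BASE, BOOST = 5, 0.5, 0.2

@lru_cache(maxsize=None)
def value(rounds_left, luck):
    """Maximal expected number of wins from this state, by backward induction."""
    if rounds_left == 0:
        return 0.0
    keep = P_BASE + value(rounds_left - 1, luck)          # save the luck
    if luck == 0:
        return keep
    spend = P_BASE + BOOST + value(rounds_left - 1, luck - 1)  # spend one point
    return max(keep, spend)

# With no luck: 5 * 0.5 expected wins; with 2 points: 5 * 0.5 + 2 * 0.2.
assert abs(value(T, 0) - 2.5) < 1e-9
assert abs(value(T, 2) - 2.9) < 1e-9
```

In this toy model the boost is state-independent, so when to spend luck does not matter; the interesting structure in the actual game comes from the gamble's benefit depending on the current state.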
arXiv Detail & Related papers (2020-02-24T11:31:25Z) - From Poincar\'e Recurrence to Convergence in Imperfect Information
Games: Finding Equilibrium via Regularization [49.368421783733815]
We show how adapting the reward can give strong convergence guarantees in monotone games.
We also show how this reward adaptation technique can be leveraged to build algorithms that converge exactly to the Nash equilibrium.
arXiv Detail & Related papers (2020-02-19T21:36:58Z) - Learning Zero-Sum Simultaneous-Move Markov Games Using Function
Approximation and Correlated Equilibrium [116.56359444619441]
We develop provably efficient reinforcement learning algorithms for two-player zero-sum finite-horizon Markov games.
In the offline setting, we control both players and aim to find the Nash Equilibrium by minimizing the duality gap.
In the online setting, we control a single player playing against an arbitrary opponent and aim to minimize the regret.
arXiv Detail & Related papers (2020-02-17T17:04:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.