HALO: Hindsight-Augmented Learning for Online Auto-Bidding
- URL: http://arxiv.org/abs/2508.03267v2
- Date: Wed, 06 Aug 2025 12:13:21 GMT
- Title: HALO: Hindsight-Augmented Learning for Online Auto-Bidding
- Authors: Pusen Dong, Chenglong Cao, Xinyu Zhou, Jirong You, Linhe Xu, Feifan Xu, Shuo Yuan,
- Abstract summary: Digital advertising platforms operate millisecond-level auctions through Real-Time Bidding (RTB) systems. This dynamic mechanism enables precise audience targeting but introduces profound operational complexity. We propose HALO: Hindsight-Augmented Learning for Online Auto-Bidding.
- Score: 2.9058410231275014
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Digital advertising platforms operate millisecond-level auctions through Real-Time Bidding (RTB) systems, where advertisers compete for ad impressions through algorithmic bids. This dynamic mechanism enables precise audience targeting but introduces profound operational complexity due to advertiser heterogeneity: budgets and ROI targets span orders of magnitude across advertisers, from individual merchants to multinational brands. This diversity creates a demanding adaptation landscape for Multi-Constraint Bidding (MCB). Traditional auto-bidding solutions fail in this environment due to two critical flaws: 1) severe sample inefficiency, where failed explorations under specific constraints yield no transferable knowledge for new budget-ROI combinations, and 2) limited generalization under constraint shifts, as they ignore physical relationships between constraints and bidding coefficients. To address this, we propose HALO: Hindsight-Augmented Learning for Online Auto-Bidding. HALO introduces a theoretically grounded hindsight mechanism that repurposes all explorations into training data for arbitrary constraint configuration via trajectory reorientation. Further, it employs B-spline functional representation, enabling continuous, derivative-aware bid mapping across constraint spaces. HALO ensures robust adaptation even when budget/ROI requirements differ drastically from training scenarios. Industrial dataset evaluations demonstrate the superiority of HALO in handling multi-scale constraints, reducing constraint violations while improving GMV.
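The hindsight idea in the abstract, reusing an exploration that failed its own constraints as training data for a different constraint configuration, can be illustrated with a small sketch. This is not HALO's trajectory reorientation procedure, which the abstract does not spell out. It is a generic hindsight-relabeling example under stated assumptions: ROI is taken to be total value divided by total cost, and the names `Step`, `Trajectory`, `is_feasible`, and `hindsight_relabel` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    bid_coeff: float  # hypothetical bidding coefficient applied at this step
    cost: float       # spend incurred at this step
    value: float      # value (e.g. GMV) won at this step


@dataclass
class Trajectory:
    budget: float      # budget constraint this trajectory is labeled with
    roi_target: float  # minimum ROI constraint this trajectory is labeled with
    steps: List[Step]


def totals(traj: Trajectory) -> Tuple[float, float]:
    """Return (total cost, total value) accumulated over the trajectory."""
    cost = sum(s.cost for s in traj.steps)
    value = sum(s.value for s in traj.steps)
    return cost, value


def is_feasible(traj: Trajectory) -> bool:
    """Check the labeled constraints, assuming ROI = total value / total cost."""
    cost, value = totals(traj)
    if cost == 0:
        return True  # nothing was spent, so neither constraint can be violated
    return cost <= traj.budget and value / cost >= traj.roi_target


def hindsight_relabel(traj: Trajectory) -> Trajectory:
    """Relabel a trajectory with the constraints it actually satisfied.

    Assumption: a rollout that spent C and earned V is a valid example for
    the configuration (budget = C, ROI target = V / C), whatever constraints
    it was originally collected under.
    """
    cost, value = totals(traj)
    if cost <= 0:
        return traj  # no spend, so no achieved ROI to relabel with
    return Trajectory(budget=cost, roi_target=value / cost, steps=traj.steps)


# A rollout collected under (budget=100, ROI>=2) that overspent and under-delivered.
failed = Trajectory(
    budget=100.0,
    roi_target=2.0,
    steps=[Step(1.2, 60.0, 90.0), Step(1.1, 60.0, 90.0)],
)
relabeled = hindsight_relabel(failed)
```

In this toy case the rollout violates both of its original constraints, yet after relabeling it is a feasible example for a budget of 120 and an ROI target of 1.5, so the exploration is not wasted.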
Related papers
- Generative Large-Scale Pre-trained Models for Automated Ad Bidding Optimization [5.460538555236247]
We propose GRAD (Generative Reward-driven Ad-bidding with Mixture-of-Experts), a scalable foundation model for auto-bidding. We show that GRAD significantly enhances platform revenue, highlighting its effectiveness in addressing the evolving and diverse requirements of modern advertisers.
arXiv Detail & Related papers (2025-08-04T02:46:18Z)
- Multi-task Offline Reinforcement Learning for Online Advertising in Recommender Systems [54.709976343045824]
Current offline reinforcement learning (RL) methods face substantial challenges when applied to sparse advertising scenarios. We propose MTORL, a novel multi-task offline RL model that targets two key objectives. We employ multi-task learning to decode actions and rewards, simultaneously addressing channel recommendation and budget allocation.
arXiv Detail & Related papers (2025-06-29T05:05:13Z)
- BAT: Benchmark for Auto-bidding Task [67.56067222427946]
We present an auction benchmark encompassing the two most prevalent auction formats. We implement a series of robust baselines on a novel dataset. This benchmark provides a user-friendly and intuitive framework for researchers and practitioners to develop and refine innovative autobidding algorithms.
arXiv Detail & Related papers (2025-05-13T12:12:34Z)
- Nash Equilibrium Constrained Auto-bidding With Bi-level Reinforcement Learning [64.2367385090879]
We propose a new formulation of the auto-bidding problem from the platform's perspective. It aims to maximize the social welfare of all advertisers under the $\epsilon$-NE constraint. The NCB problem presents significant challenges due to its constrained bi-level structure and the typically large number of advertisers involved.
arXiv Detail & Related papers (2025-03-13T12:25:36Z)
- A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraints violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z)
- Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z)
- ROI Constrained Bidding via Curriculum-Guided Bayesian Reinforcement Learning [34.82004227655201]
We specialize in ROI-Constrained Bidding in non-stationary markets.
Based on a Partially Observable Constrained Markov Decision Process, we propose the first hard barrier solution to accommodate non-monotonic constraints.
Our method exploits a parameter-free indicator-augmented reward function and develops a Curriculum-Guided Bayesian Reinforcement Learning framework.
arXiv Detail & Related papers (2022-06-10T17:30:12Z)
- VFed-SSD: Towards Practical Vertical Federated Advertising [53.08038962443853]
We propose a semi-supervised split distillation framework VFed-SSD to alleviate the two limitations.
Specifically, we develop a self-supervised task MatchedPair Detection (MPD) to exploit the vertically partitioned unlabeled data.
Our framework provides an efficient federation-enhanced solution for real-time display advertising with minimal deploying cost and significant performance lift.
arXiv Detail & Related papers (2022-05-31T17:45:30Z)
- Optimal Bidding Strategy without Exploration in Real-time Bidding [14.035270361462576]
Maximizing utility with a budget constraint is the primary goal for advertisers in real-time bidding (RTB) systems.
Previous works ignore the losing auctions to alleviate the difficulty with censored states.
We propose a novel practical framework using the maximum entropy principle to imitate the behavior of the true distribution observed in real-time traffic.
arXiv Detail & Related papers (2020-03-31T20:43:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.