SEGB: Self-Evolved Generative Bidding with Local Autoregressive Diffusion
- URL: http://arxiv.org/abs/2602.22226v1
- Date: Wed, 31 Dec 2025 09:05:59 GMT
- Title: SEGB: Self-Evolved Generative Bidding with Local Autoregressive Diffusion
- Authors: Yulong Gao, Wan Jiang, Mingzhe Cao, Xuepu Wang, Zeyu Pan, Haonan Yang, Ye Liu, Xin Yang,
- Abstract summary: Self-Evolved Generative Bidding (SEGB) is a framework that plans proactively and refines itself entirely offline. SEGB first synthesizes plausible short-horizon future states to guide each bid, providing the agent with crucial, dynamic foresight. It then performs value-guided policy refinement to iteratively discover superior strategies without any external intervention.
- Score: 9.051746879211764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the realm of online advertising, automated bidding has become a pivotal tool, enabling advertisers to efficiently capture impression opportunities in real-time. Recently, generative auto-bidding has shown significant promise, offering innovative solutions for effective ad optimization. However, existing offline-trained generative policies lack the near-term foresight required for dynamic markets and usually depend on simulators or external experts for post-training improvement. To overcome these critical limitations, we propose Self-Evolved Generative Bidding (SEGB), a framework that plans proactively and refines itself entirely offline. SEGB first synthesizes plausible short-horizon future states to guide each bid, providing the agent with crucial, dynamic foresight. Crucially, it then performs value-guided policy refinement to iteratively discover superior strategies without any external intervention. This self-contained approach uniquely enables robust policy improvement from static data alone. Experiments on the AuctionNet benchmark and a large-scale A/B test validate our approach, demonstrating that SEGB significantly outperforms state-of-the-art baselines. In a large-scale online deployment, it delivered substantial business value, achieving a +10.19% increase in target cost, proving the effectiveness of our advanced planning and evolution paradigm.
Related papers
- Self-Correcting VLA: Online Action Refinement via Sparse World Imagination [55.982504915794514]
We propose Self-Correcting VLA (SC-VLA), which achieves self-improvement by intrinsically guiding action refinement through sparse imagination. SC-VLA achieves state-of-the-art performance, yielding the highest task throughput with 16% fewer steps and a 9% higher success rate than the best-performing baselines.
arXiv Detail & Related papers (2026-02-25T06:58:06Z) - A Unified Multi-Task Learning Framework for Generative Auto-Bidding with Validation-Aligned Optimization [51.27959658504722]
Multi-task learning offers a principled framework to train these tasks jointly through shared representations. Existing multi-task optimization strategies are primarily guided by training dynamics and often generalize poorly in volatile bidding environments. We present Validation-Aligned Multi-task Optimization (VAMO), which adaptively assigns task weights based on the alignment between per-task training gradients and a held-out validation gradient.
arXiv Detail & Related papers (2025-10-09T03:59:51Z) - Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search [24.02739832976663]
Auto-bidding serves as a critical tool for advertisers to improve their performance. Recent progress has demonstrated that AI-Generated Bidding (AIGB) achieves superior performance compared to typical offline reinforcement learning (RL)-based auto-bidding methods. We propose AIGB-Pearl, a novel method that integrates generative planning and policy optimization.
arXiv Detail & Related papers (2025-09-19T12:30:26Z) - Expert-Guided Diffusion Planner for Auto-Bidding [8.810433582977446]
We introduce a conditional diffusion modeling approach that integrates expert trajectory guidance with a skip-step sampling strategy to improve generation efficiency. The efficacy of this method has been demonstrated through comprehensive offline experiments and statistically significant outcomes in online A/B testing, yielding an 11.29% increase in conversions and a 12.36% growth in revenue relative to the baseline.
arXiv Detail & Related papers (2025-08-12T07:23:51Z) - BAT: Benchmark for Auto-bidding Task [67.56067222427946]
We present an auction benchmark encompassing the two most prevalent auction formats. We implement a series of robust baselines on a novel dataset. This benchmark provides a user-friendly and intuitive framework for researchers and practitioners to develop and refine innovative autobidding algorithms.
arXiv Detail & Related papers (2025-05-13T12:12:34Z) - Generative Auto-Bidding with Value-Guided Explorations [47.71346722705783]
This paper introduces a novel offline Generative Auto-bidding framework with Value-Guided Explorations (GAVE). Experimental results on two offline datasets and real-world deployments demonstrate that GAVE outperforms state-of-the-art baselines in both offline evaluations and online A/B tests.
arXiv Detail & Related papers (2025-04-20T12:28:49Z) - GAS: Generative Auto-bidding with Post-training Search [26.229396732360787]
We propose a flexible and practical Generative Auto-bidding scheme using post-training Search, termed GAS, to refine a base policy model's output. Experiments conducted on the real-world dataset and online A/B test on the Kuaishou advertising platform demonstrate the effectiveness of GAS.
arXiv Detail & Related papers (2024-12-22T13:47:46Z) - Offline Reinforcement Learning for Optimizing Production Bidding Policies [1.8689461238197953]
We propose a generalizable approach to optimizing bidding policies in production environments.
We use a hybrid agent architecture that combines arbitrary base policies with deep neural networks.
We demonstrate that such an architecture achieves statistically significant performance gains in both simulated and at-scale production bidding environments.
arXiv Detail & Related papers (2023-10-13T22:14:51Z) - Structured Dynamic Pricing: Optimal Regret in a Global Shrinkage Model [50.06663781566795]
We consider a dynamic model with the consumers' preferences as well as price sensitivity varying over time.
We measure the performance of a dynamic pricing policy via regret, which is the expected revenue loss compared to a clairvoyant that knows the sequence of model parameters in advance.
Our regret analysis results not only demonstrate optimality of the proposed policy but also show that for policy planning it is essential to incorporate available structural information.
arXiv Detail & Related papers (2023-03-28T00:23:23Z) - Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising [52.3825928886714]
We formulate the sequential advertising strategy optimization as a dynamic knapsack problem.
We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space.
To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach.
arXiv Detail & Related papers (2020-06-29T18:50:35Z) - Optimal Bidding Strategy without Exploration in Real-time Bidding [14.035270361462576]
Maximizing utility with a budget constraint is the primary goal for advertisers in real-time bidding (RTB) systems.
Previous works ignore the losing auctions to alleviate the difficulty with censored states.
We propose a novel practical framework using the maximum entropy principle to imitate the behavior of the true distribution observed in real-time traffic.
arXiv Detail & Related papers (2020-03-31T20:43:28Z)