Related papers: Auto-bidding in real-time auctions via Oracle Imitation Learning (OIL)

Auto-bidding in real-time auctions via Oracle Imitation Learning (OIL)

URL: http://arxiv.org/abs/2412.11434v2
Date: Tue, 17 Dec 2024 08:56:42 GMT
Title: Auto-bidding in real-time auctions via Oracle Imitation Learning (OIL)
Authors: Alberto Silvio Chiappa, Briti Gangopadhyay, Zhao Wang, Shingo Takamatsu,
Abstract summary: We propose a framework for training auto-bidding agents in multi-slot second-price auctions.<n>We exploit the insight that, after an advertisement campaign concludes, determining the optimal bids for each impression opportunity can be framed as a multiple-choice knapsack problem.<n>We propose an "oracle" algorithm that identifies a near-optimal combination of impression opportunities and advertisement slots.
Score: 9.19703820485146
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Online advertising has become one of the most successful business models of the internet era. Impression opportunities are typically allocated through real-time auctions, where advertisers bid to secure advertisement slots. Deciding the best bid for an impression opportunity is challenging, due to the stochastic nature of user behavior and the variability of advertisement traffic over time. In this work, we propose a framework for training auto-bidding agents in multi-slot second-price auctions to maximize acquisitions (e.g., clicks, conversions) while adhering to budget and cost-per-acquisition (CPA) constraints. We exploit the insight that, after an advertisement campaign concludes, determining the optimal bids for each impression opportunity can be framed as a multiple-choice knapsack problem (MCKP) with a nonlinear objective. We propose an "oracle" algorithm that identifies a near-optimal combination of impression opportunities and advertisement slots, considering both past and future advertisement traffic data. This oracle solution serves as a training target for a student network which bids having access only to real-time information, a method we term Oracle Imitation Learning (OIL). Through numerical experiments, we demonstrate that OIL achieves superior performance compared to both online and offline reinforcement learning algorithms, offering improved sample efficiency. Notably, OIL shifts the complexity of training auto-bidding agents from crafting sophisticated learning algorithms to solving a nonlinear constrained optimization problem efficiently.

Related papers

Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective [65.12150411762273]
We show that pruning random demonstrations into seemingly incoherent "gibberish" can remarkably improve performance across diverse tasks.<n>We propose a self-discover prompt optimization framework, PromptQuine, that automatically searches for the pruning strategy by itself using only low-data regimes.
arXiv Detail & Related papers (2025-06-22T07:53:07Z)
BAT: Benchmark for Auto-bidding Task [67.56067222427946]
We present an auction benchmark encompassing the two most prevalent auction formats.<n>We implement a series of robust baselines on a novel dataset.<n>This benchmark provides a user-friendly and intuitive framework for researchers and practitioners to develop and refine innovative autobidding algorithms.
arXiv Detail & Related papers (2025-05-13T12:12:34Z)
Nash Equilibrium Constrained Auto-bidding With Bi-level Reinforcement Learning [64.2367385090879]
We propose a new formulation of the auto-bidding problem from the platform's perspective. It aims to maximize the social welfare of all advertisers under the $epsilon$-NE constraint. The NCB problem presents significant challenges due to its constrained bi-level structure and the typically large number of advertisers involved.
arXiv Detail & Related papers (2025-03-13T12:25:36Z)
Procurement Auctions via Approximately Optimal Submodular Optimization [53.93943270902349]
We study procurement auctions, where an auctioneer seeks to acquire services from strategic sellers with private costs. Our goal is to design computationally efficient auctions that maximize the difference between the quality of the acquired services and the total cost of the sellers.
arXiv Detail & Related papers (2024-11-20T18:06:55Z)
Online Ad Procurement in Non-stationary Autobidding Worlds [10.871587311621974]
We introduce a primal-dual algorithm for online decision making with multi-dimension decision variables, bandit feedback and long-term uncertain constraints. We show that our algorithm achieves low regret in many worlds when procurement outcomes are generated through procedures that are adversarial, adversarially corrupted, periodic, and ergodic.
arXiv Detail & Related papers (2023-07-10T00:41:08Z)
Adversarial Constrained Bidding via Minimax Regret Optimization with Causality-Aware Reinforcement Learning [18.408964908248855]
Existing approaches on constrained bidding typically rely on i.i.d. train and test conditions. We propose a practical Minimax Regret Optimization (MiRO) approach that interleaves between a teacher finding adversarial environments for tutoring and a learner meta-learning its policy over the given distribution of environments. Our method, MiRO with Causality-aware reinforcement Learning (MiROCL), outperforms prior methods by over 30%.
arXiv Detail & Related papers (2023-06-12T13:31:58Z)
OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs. Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z)
Improving Real-Time Bidding in Online Advertising Using Markov Decision Processes and Machine Learning Techniques [0.0]
This paper proposes a novel method for real-time bidding that combines deep learning and reinforcement learning techniques. The proposed method employs a deep neural network to predict auction details and market prices and a reinforcement learning algorithm to determine the optimal bid price. The outcomes demonstrate that the proposed method is preferable regarding cost-effectiveness and precision.
arXiv Detail & Related papers (2023-05-05T14:34:20Z)
Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions. We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers. We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z)
VFed-SSD: Towards Practical Vertical Federated Advertising [53.08038962443853]
We propose a semi-supervised split distillation framework VFed-SSD to alleviate the two limitations. Specifically, we develop a self-supervised task MatchedPair Detection (MPD) to exploit the vertically partitioned unlabeled data. Our framework provides an efficient federation-enhanced solution for real-time display advertising with minimal deploying cost and significant performance lift.
arXiv Detail & Related papers (2022-05-31T17:45:30Z)
Learning to Bid Optimally and Efficiently in Adversarial First-price Auctions [40.30925727499806]
We develop the first minimax optimal online bidding algorithm that achieves an $widetildeO(sqrtT)$ regret. We test our algorithm on three real-world first-price auction datasets obtained from Verizon Media.
arXiv Detail & Related papers (2020-07-09T05:40:39Z)
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising [52.3825928886714]
We formulate the sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach.
arXiv Detail & Related papers (2020-06-29T18:50:35Z)
Online Joint Bid/Daily Budget Optimization of Internet Advertising Campaigns [115.96295568115251]
We study the problem of automating the online joint bid/daily budget optimization of pay-per-click advertising campaigns over multiple channels. For every campaign, we capture the dependency of the number of clicks on the bid and daily budget by Gaussian Processes. We design four algorithms and show that they suffer from a regret that is upper bounded with high probability as O(sqrtT) We present the results of the adoption of our algorithms in a real-world application with a daily average spent of 1,000 Euros for more than one year.
arXiv Detail & Related papers (2020-03-03T11:07:38Z)
MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding [47.555870679348416]
We propose a Multi-ecTive Actor-Critics algorithm named MoTiAC for the problem of bidding optimization with various goals. Unlike previous RL models, the proposed MoTiAC can simultaneously fulfill multi-objective tasks in complicated bidding environments.
arXiv Detail & Related papers (2020-02-18T07:16:39Z)
Online Causal Inference for Advertising in Real-Time Bidding Auctions [1.9336815376402723]
This paper proposes a new approach to perform causal inference on advertising bought through real-time bidding systems. We first show that the effects of advertising are identified by the optimal bids. We introduce an adapted Thompson sampling (TS) algorithm to solve a multi-armed bandit problem.
arXiv Detail & Related papers (2019-08-22T21:13:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.