Auto-bidding in real-time auctions via Oracle Imitation Learning (OIL)
- URL: http://arxiv.org/abs/2412.11434v2
- Date: Tue, 17 Dec 2024 08:56:42 GMT
- Title: Auto-bidding in real-time auctions via Oracle Imitation Learning (OIL)
- Authors: Alberto Silvio Chiappa, Briti Gangopadhyay, Zhao Wang, Shingo Takamatsu,
- Abstract summary: We propose a framework for training auto-bidding agents in multi-slot second-price auctions.
We exploit the insight that, after an advertisement campaign concludes, determining the optimal bids for each impression opportunity can be framed as a multiple-choice knapsack problem.
We propose an "oracle" algorithm that identifies a near-optimal combination of impression opportunities and advertisement slots.
- Score: 9.19703820485146
- License:
- Abstract: Online advertising has become one of the most successful business models of the internet era. Impression opportunities are typically allocated through real-time auctions, where advertisers bid to secure advertisement slots. Deciding the best bid for an impression opportunity is challenging, due to the stochastic nature of user behavior and the variability of advertisement traffic over time. In this work, we propose a framework for training auto-bidding agents in multi-slot second-price auctions to maximize acquisitions (e.g., clicks, conversions) while adhering to budget and cost-per-acquisition (CPA) constraints. We exploit the insight that, after an advertisement campaign concludes, determining the optimal bids for each impression opportunity can be framed as a multiple-choice knapsack problem (MCKP) with a nonlinear objective. We propose an "oracle" algorithm that identifies a near-optimal combination of impression opportunities and advertisement slots, considering both past and future advertisement traffic data. This oracle solution serves as a training target for a student network which bids having access only to real-time information, a method we term Oracle Imitation Learning (OIL). Through numerical experiments, we demonstrate that OIL achieves superior performance compared to both online and offline reinforcement learning algorithms, offering improved sample efficiency. Notably, OIL shifts the complexity of training auto-bidding agents from crafting sophisticated learning algorithms to solving a nonlinear constrained optimization problem efficiently.
Related papers
- Procurement Auctions via Approximately Optimal Submodular Optimization [53.93943270902349]
We study procurement auctions, where an auctioneer seeks to acquire services from strategic sellers with private costs.
Our goal is to design computationally efficient auctions that maximize the difference between the quality of the acquired services and the total cost of the sellers.
arXiv Detail & Related papers (2024-11-20T18:06:55Z) - A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints [54.46126953873298]
We address the problem of dynamically pricing complementary items that are sequentially displayed to customers.
Coherent pricing policies for complementary items are essential because optimizing the pricing of each item individually is ineffective.
We empirically evaluate our approach using synthetic settings randomly generated from real-world data, and compare its performance in terms of constraints violation and regret.
arXiv Detail & Related papers (2024-07-08T09:55:31Z) - Online Ad Procurement in Non-stationary Autobidding Worlds [10.871587311621974]
We introduce a primal-dual algorithm for online decision making with multi-dimension decision variables, bandit feedback and long-term uncertain constraints.
We show that our algorithm achieves low regret in many worlds when procurement outcomes are generated through procedures that are adversarial, adversarially corrupted, periodic, and ergodic.
arXiv Detail & Related papers (2023-07-10T00:41:08Z) - Adversarial Constrained Bidding via Minimax Regret Optimization with
Causality-Aware Reinforcement Learning [18.408964908248855]
Existing approaches on constrained bidding typically rely on i.i.d. train and test conditions.
We propose a practical Minimax Regret Optimization (MiRO) approach that interleaves between a teacher finding adversarial environments for tutoring and a learner meta-learning its policy over the given distribution of environments.
Our method, MiRO with Causality-aware reinforcement Learning (MiROCL), outperforms prior methods by over 30%.
arXiv Detail & Related papers (2023-06-12T13:31:58Z) - OverPrompt: Enhancing ChatGPT through Efficient In-Context Learning [49.38867353135258]
We propose OverPrompt, leveraging the in-context learning capability of LLMs to handle multiple task inputs.
Our experiments show that OverPrompt can achieve cost-efficient zero-shot classification without causing significant detriment to task performance.
arXiv Detail & Related papers (2023-05-24T10:08:04Z) - Improving Real-Time Bidding in Online Advertising Using Markov Decision
Processes and Machine Learning Techniques [0.0]
This paper proposes a novel method for real-time bidding that combines deep learning and reinforcement learning techniques.
The proposed method employs a deep neural network to predict auction details and market prices and a reinforcement learning algorithm to determine the optimal bid price.
The outcomes demonstrate that the proposed method is preferable regarding cost-effectiveness and precision.
arXiv Detail & Related papers (2023-05-05T14:34:20Z) - Learning to Bid Optimally and Efficiently in Adversarial First-price
Auctions [40.30925727499806]
We develop the first minimax optimal online bidding algorithm that achieves an $widetildeO(sqrtT)$ regret.
We test our algorithm on three real-world first-price auction datasets obtained from Verizon Media.
arXiv Detail & Related papers (2020-07-09T05:40:39Z) - Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential
Advertising [52.3825928886714]
We formulate the sequential advertising strategy optimization as a dynamic knapsack problem.
We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space.
To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach.
arXiv Detail & Related papers (2020-06-29T18:50:35Z) - Online Joint Bid/Daily Budget Optimization of Internet Advertising
Campaigns [115.96295568115251]
We study the problem of automating the online joint bid/daily budget optimization of pay-per-click advertising campaigns over multiple channels.
For every campaign, we capture the dependency of the number of clicks on the bid and daily budget by Gaussian Processes.
We design four algorithms and show that they suffer from a regret that is upper bounded with high probability as O(sqrtT)
We present the results of the adoption of our algorithms in a real-world application with a daily average spent of 1,000 Euros for more than one year.
arXiv Detail & Related papers (2020-03-03T11:07:38Z) - MoTiAC: Multi-Objective Actor-Critics for Real-Time Bidding [47.555870679348416]
We propose a Multi-ecTive Actor-Critics algorithm named MoTiAC for the problem of bidding optimization with various goals.
Unlike previous RL models, the proposed MoTiAC can simultaneously fulfill multi-objective tasks in complicated bidding environments.
arXiv Detail & Related papers (2020-02-18T07:16:39Z) - Online Causal Inference for Advertising in Real-Time Bidding Auctions [1.9336815376402723]
This paper proposes a new approach to perform causal inference on advertising bought through real-time bidding systems.
We first show that the effects of advertising are identified by the optimal bids.
We introduce an adapted Thompson sampling (TS) algorithm to solve a multi-armed bandit problem.
arXiv Detail & Related papers (2019-08-22T21:13:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.