Bid Optimization using Maximum Entropy Reinforcement Learning
- URL: http://arxiv.org/abs/2110.05032v1
- Date: Mon, 11 Oct 2021 06:53:53 GMT
- Title: Bid Optimization using Maximum Entropy Reinforcement Learning
- Authors: Mengjuan Liu, Jinyu Liu, Zhengning Hu, Yuchen Ge, Xuyun Nie
- Abstract summary: This paper focuses on optimizing a single advertiser's bidding strategy using reinforcement learning (RL) in real-time bidding (RTB).
We first utilize a widely accepted linear bidding function to compute every impression's base price and optimize it by a mutable adjustment factor derived from the RTB auction environment.
Finally, the empirical study on a public dataset demonstrates that the proposed bidding strategy has superior performance compared with the baselines.
- Score: 0.3149883354098941
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-time bidding (RTB) has become a critical channel for online advertising. In
RTB, an advertiser can bid on ad impressions to display its
advertisements. The advertiser determines every impression's bidding price
according to its bidding strategy. Therefore, a good bidding strategy can help
advertisers improve cost efficiency. This paper focuses on optimizing a single
advertiser's bidding strategy using reinforcement learning (RL) in RTB.
Unfortunately, it is challenging to optimize the bidding strategy through RL at
the granularity of impression due to the highly dynamic nature of the RTB
environment. In this paper, we first utilize a widely accepted linear bidding
function to compute every impression's base price and optimize it by a mutable
adjustment factor derived from the RTB auction environment, to avoid optimizing
every impression's bidding price directly. Specifically, we use the maximum
entropy RL algorithm (Soft Actor-Critic) to optimize the adjustment factor
generation policy at the impression-grained level. Finally, the empirical study
on a public dataset demonstrates that the proposed bidding strategy has
superior performance compared with the baselines.
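The two-stage pricing described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything below is an assumption: the base price uses the common linear form `base_bid * pCTR / avg_pCTR`, the adjustment is applied multiplicatively as `1 + adjustment`, and the result is clipped to a hypothetical `max_bid`. In the paper the adjustment factor is generated by a Soft Actor-Critic policy; here it is simply passed in as a number, and the class and parameter names are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class LinearBidder:
    """Illustrative linear bidder with an adjustment factor (not the paper's code)."""

    base_bid: float         # hypothetical tuned base bid
    avg_pctr: float         # hypothetical average predicted CTR, must be > 0
    max_bid: float = 300.0  # hypothetical upper bound on any bid

    def base_price(self, pctr: float) -> float:
        # Assumed linear bidding function: scale the base bid by how the
        # impression's predicted CTR compares with the average predicted CTR.
        return self.base_bid * pctr / self.avg_pctr

    def bid(self, pctr: float, adjustment: float) -> float:
        # Assumed multiplicative adjustment: a factor of 0.0 leaves the base
        # price unchanged, positive values raise it, negative values lower it.
        price = self.base_price(pctr) * (1.0 + adjustment)
        # Clip to the valid bid range [0, max_bid].
        return min(max(price, 0.0), self.max_bid)


if __name__ == "__main__":
    bidder = LinearBidder(base_bid=100.0, avg_pctr=0.001)
    # An impression with twice the average pCTR and a -10% adjustment.
    print(bidder.bid(pctr=0.002, adjustment=-0.1))
```

Keeping the base price and the adjustment separate means a learned policy only has to output one bounded scalar per impression rather than a raw price, which is the motivation the abstract gives for not optimizing every bid directly.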
Related papers
- Rate-Optimal Policy Optimization for Linear Markov Decision Processes [65.5958446762678]
We obtain rate-optimal $\widetilde{O}(\sqrt{K})$ regret, where $K$ denotes the number of episodes.
Our work is the first to establish the optimal (w.r.t. $K$) rate of convergence in the setting with bandit feedback.
No algorithm with an optimal rate guarantee is currently known.
arXiv Detail & Related papers (2023-08-28T15:16:09Z)
- Demystifying Advertising Campaign Bid Recommendation: A Constraint target CPA Goal Optimization [19.857681941728597]
This paper presents a bid optimization scenario to achieve the desired cost-per-acquisition (tCPA) goals for advertisers.
We build the optimization engine to make a decision by solving the rigorously formalized constrained optimization problem.
The proposed model can naturally recommend the bid that meets the advertisers' expectations by making inference over advertisers' historical auction behaviors.
arXiv Detail & Related papers (2022-12-26T07:43:26Z)
- Adaptive Risk-Aware Bidding with Budget Constraint in Display Advertising [47.14651340748015]
We propose a novel adaptive risk-aware bidding algorithm with budget constraint via reinforcement learning.
We theoretically unveil the intrinsic relation between the uncertainty and the risk tendency based on value at risk (VaR).
arXiv Detail & Related papers (2022-12-06T18:50:09Z)
- A Profit-Maximizing Strategy for Advertising on the e-Commerce Platforms [1.565361244756411]
The proposed model aims to find the optimal set of features to maximize the probability of converting targeted audiences into actual buyers.
We conduct an empirical study featuring real-world data from Tmall to show that our proposed method can effectively optimize the advertising strategy with budgetary constraints.
arXiv Detail & Related papers (2022-10-31T01:45:42Z)
- Functional Optimization Reinforcement Learning for Real-Time Bidding [14.5826735379053]
Real-time bidding is the new paradigm of programmatic advertising.
Existing approaches are struggling to provide a satisfactory solution for bidding optimization.
This paper proposes a multi-agent reinforcement learning architecture for RTB with functional optimization.
arXiv Detail & Related papers (2022-06-25T06:12:17Z)
- A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising [53.636153252400945]
We propose a general Multi-Agent reinforcement learning framework for Auto-Bidding, namely MAAB, to learn the auto-bidding strategies.
Our approach outperforms several baseline methods in terms of social welfare and guarantees the ad platform's revenue.
arXiv Detail & Related papers (2021-06-11T08:07:14Z)
- We Know What You Want: An Advertising Strategy Recommender System for Online Advertising [26.261736843187045]
We propose a recommender system for dynamic bidding strategy recommendation on display advertising platform.
We use a neural network as the agent to predict the advertisers' demands based on their profile and historical adoption behaviors.
Online evaluations show that the system can optimize the advertisers' advertising performance.
arXiv Detail & Related papers (2021-05-25T17:06:59Z)
- A novel auction system for selecting advertisements in Real-Time bidding [68.8204255655161]
Real-Time Bidding is a new Internet advertising system that has become very popular in recent years.
We propose an alternative betting system with a new approach that not only considers the economic aspect but also other relevant factors for the functioning of the advertising system.
arXiv Detail & Related papers (2020-10-22T18:36:41Z)
- Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising [52.3825928886714]
We formulate the sequential advertising strategy optimization as a dynamic knapsack problem.
We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space.
To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach.
arXiv Detail & Related papers (2020-06-29T18:50:35Z)
- Provably Efficient Exploration in Policy Optimization [117.09887790160406]
This paper proposes an Optimistic variant of the Proximal Policy Optimization algorithm (OPPO).
OPPO achieves $\tilde{O}(\sqrt{d^2 H^3 T})$ regret.
To the best of our knowledge, OPPO is the first provably efficient policy optimization algorithm that explores.
arXiv Detail & Related papers (2019-12-12T08:40:02Z)
- Online Causal Inference for Advertising in Real-Time Bidding Auctions [1.9336815376402723]
This paper proposes a new approach to perform causal inference on advertising bought through real-time bidding systems.
We first show that the effects of advertising are identified by the optimal bids.
We introduce an adapted Thompson sampling (TS) algorithm to solve a multi-armed bandit problem.
arXiv Detail & Related papers (2019-08-22T21:13:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.