Adaptive Risk-Aware Bidding with Budget Constraint in Display
Advertising
- URL: http://arxiv.org/abs/2212.12533v1
- Date: Tue, 6 Dec 2022 18:50:09 GMT
- Title: Adaptive Risk-Aware Bidding with Budget Constraint in Display
Advertising
- Authors: Zhimeng Jiang, Kaixiong Zhou, Mi Zhang, Rui Chen, Xia Hu, Soo-Hyun
Choi
- Abstract summary: We propose a novel adaptive risk-aware bidding algorithm with budget constraint via reinforcement learning.
We theoretically unveil the intrinsic relation between the uncertainty and the risk tendency based on value at risk (VaR).
- Score: 47.14651340748015
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-time bidding (RTB) has become a major paradigm of display advertising.
Each ad impression generated from a user visit is auctioned in real time, where
a demand-side platform (DSP) automatically provides a bid price, usually relying
on ad impression value estimation and optimal bid price determination.
However, current bid strategies overlook the large randomness of user
behaviors (e.g., clicks) and the cost uncertainty caused by auction
competition. In this work, we explicitly factor in the uncertainty of estimated
ad impression values and model the risk preference of a DSP under a specific
state and market environment via a sequential decision process. Specifically,
we propose a novel adaptive risk-aware bidding algorithm with budget constraint
via reinforcement learning, which is the first to simultaneously consider
estimation uncertainty and the dynamic risk tendency of a DSP. We theoretically
unveil the intrinsic relation between the uncertainty and the risk tendency
based on value at risk (VaR). Consequently, we propose two instantiations to
model risk tendency, including an expert knowledge-based formulation embracing
three essential properties and an adaptive learning method based on
self-supervised reinforcement learning. We conduct extensive experiments on
public datasets and show that the proposed framework outperforms
state-of-the-art methods in practical settings.
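As a rough illustration of how a value-at-risk view can tie estimation uncertainty to risk tendency, the sketch below treats the estimated impression value as Gaussian and reads the bid off a quantile of that distribution. This is not the paper's algorithm: the Gaussian assumption, the names `risk_adjusted_value` and `bid_price`, and the parameters `risk_level`, `budget_left`, and `scale` are all hypothetical choices made for illustration.

```python
from statistics import NormalDist


def risk_adjusted_value(mean: float, std: float, risk_level: float) -> float:
    """Return the `risk_level` quantile of a Gaussian value estimate.

    Assumption (not from the paper): the impression value is N(mean, std^2).
    risk_level < 0.5 gives a conservative (risk-averse) value,
    risk_level > 0.5 gives an optimistic (risk-seeking) value.
    """
    if not 0.0 < risk_level < 1.0:
        raise ValueError("risk_level must lie strictly between 0 and 1")
    if std < 0.0:
        raise ValueError("std must be non-negative")
    if std == 0.0:
        # No uncertainty: every quantile equals the point estimate.
        return mean
    return NormalDist(mu=mean, sigma=std).inv_cdf(risk_level)


def bid_price(mean: float, std: float, risk_level: float,
              budget_left: float, scale: float = 1.0) -> float:
    """Hypothetical bid: scaled risk-adjusted value, capped by remaining budget."""
    value = max(0.0, risk_adjusted_value(mean, std, risk_level))
    return min(budget_left, scale * value)


if __name__ == "__main__":
    # Same estimate, three risk tendencies (illustrative numbers only).
    for level in (0.1, 0.5, 0.9):
        print(level, bid_price(mean=1.0, std=0.5, risk_level=level, budget_left=10.0))
```

In this sketch a larger `std` widens the gap between risk-averse and risk-seeking bids, while a `risk_level` of 0.5 reduces to bidding on the point estimate. An adaptive method would choose `risk_level` from the current state (for example, remaining budget and time) rather than fixing it.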
Related papers
- Data-Adaptive Tradeoffs among Multiple Risks in Distribution-Free Prediction [55.77015419028725]
We develop methods that permit valid control of risk when threshold and tradeoff parameters are chosen adaptively.
Our methodology supports monotone and nearly-monotone risks, but otherwise makes no distributional assumptions.
arXiv Detail & Related papers (2024-03-28T17:28:06Z) - RAGIC: Risk-Aware Generative Adversarial Model for Stock Interval
Construction [4.059196561157555]
Many existing prediction approaches focus on single-point predictions, lacking the depth needed for effective decision-making.
We propose RAGIC, which introduces sequence generation for stock interval prediction to quantify uncertainty more effectively.
RAGIC's generator includes a risk module, capturing the risk perception of informed investors, and a temporal module, accounting for historical price trends and seasonality.
arXiv Detail & Related papers (2024-02-16T15:34:07Z) - Model-Based Epistemic Variance of Values for Risk-Aware Policy Optimization [59.758009422067]
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning.
We propose a new uncertainty Bellman equation (UBE) whose solution converges to the true posterior variance over values.
We introduce a general-purpose policy optimization algorithm, Q-Uncertainty Soft Actor-Critic (QU-SAC) that can be applied for either risk-seeking or risk-averse policy optimization.
arXiv Detail & Related papers (2023-12-07T15:55:58Z) - When Demonstrations Meet Generative World Models: A Maximum Likelihood
Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z) - Online Learning under Budget and ROI Constraints via Weak Adaptivity [57.097119428915796]
Existing primal-dual algorithms for constrained online learning problems rely on two fundamental assumptions.
We show how such assumptions can be circumvented by endowing standard primal-dual templates with weakly adaptive regret minimizers.
We prove the first best-of-both-worlds no-regret guarantees which hold in absence of the two aforementioned assumptions.
arXiv Detail & Related papers (2023-02-02T16:30:33Z) - Risk-Aware Bid Optimization for Online Display Advertisement [9.255311854574915]
This research focuses on the bid optimization problem in the real-time bidding setting for online display advertisements.
We propose a risk-aware data-driven bid optimization model that maximizes the expected profit for the advertiser.
arXiv Detail & Related papers (2022-10-28T02:14:33Z) - A Risk-Sensitive Approach to Policy Optimization [21.684251937825234]
Standard deep reinforcement learning (DRL) aims to maximize expected reward, considering collected experiences equally in formulating a policy.
We propose a more direct approach whereby risk-sensitive objectives, specified in terms of the cumulative distribution function (CDF) of the distribution of full-episode rewards, are optimized.
We demonstrate that the use of moderately "pessimistic" risk profiles, which emphasize scenarios where the agent performs poorly, leads to enhanced exploration and a continual focus on addressing deficiencies.
arXiv Detail & Related papers (2022-08-19T00:55:05Z) - Deep Reinforcement Learning for Equal Risk Pricing and Hedging under
Dynamic Expectile Risk Measures [1.2891210250935146]
We show that a new off-policy deterministic actor-critic deep reinforcement learning algorithm can identify high quality time consistent hedging policies for options.
Our numerical experiments, which involve both a simple vanilla option and a more exotic basket option, confirm that in simple environments the new algorithm can produce nearly optimal hedging policies and highly accurate prices simultaneously for a range of maturities.
Overall, the resulting hedging strategies outperform those produced using static risk measures when the risk is evaluated at later points in time.
arXiv Detail & Related papers (2021-09-09T02:52:06Z) - Learning Bounds for Risk-sensitive Learning [86.50262971918276]
In risk-sensitive learning, one aims to find a hypothesis that minimizes a risk-averse (or risk-seeking) measure of loss.
We study the generalization properties of risk-sensitive learning schemes whose optimand is described via optimized certainty equivalents.
arXiv Detail & Related papers (2020-06-15T05:25:02Z) - Optimal Bidding Strategy without Exploration in Real-time Bidding [14.035270361462576]
Maximizing utility with a budget constraint is the primary goal for advertisers in real-time bidding (RTB) systems.
Previous works ignore the losing auctions to alleviate the difficulty with censored states.
We propose a novel practical framework using the maximum entropy principle to imitate the behavior of the true distribution observed in real-time traffic.
arXiv Detail & Related papers (2020-03-31T20:43:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.