Related papers: Functional Optimization Reinforcement Learning for Real-Time Bidding

URL: http://arxiv.org/abs/2206.13939v1
Date: Sat, 25 Jun 2022 06:12:17 GMT
Title: Functional Optimization Reinforcement Learning for Real-Time Bidding
Authors: Yining Lu, Changjie Lu, Naina Bandyopadhyay, Manoj Kumar, Gaurav Gupta
Abstract summary: Real-time bidding is the new paradigm of programmatic advertising. Existing approaches are struggling to provide a satisfactory solution for bidding optimization. This paper proposes a multi-agent reinforcement learning architecture for RTB with functional optimization.
Score: 14.5826735379053
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Real-time bidding is the new paradigm of programmatic advertising. An advertiser wants to make the intelligent choice of utilizing a Demand-Side Platform to improve the performance of their ad campaigns. Existing approaches struggle to provide a satisfactory solution for bidding optimization due to stochastic bidding behavior. In this paper, we propose a multi-agent reinforcement learning architecture for RTB with functional optimization. We design a four-agent bidding environment: three Lagrange-multiplier-based functional optimization agents and one baseline agent (without any attribute of functional optimization). Each agent is assigned numerous attributes, including biased or unbiased win probability, Lagrange multiplier, and click-through rate. To evaluate the proposed RTB strategy's performance, we report results on ten sequential simulated auction campaigns. The results show that agents with functional actions and rewards had the most significant average winning rate and winning surplus, given biased and unbiased winning information respectively. The experimental evaluations show that our approach significantly improves the campaign's efficacy and profitability.
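The abstract does not specify how the Lagrange-multiplier agents compute bids, so the following is only an illustrative sketch of one common multiplier-based scheme (budget pacing in a second-price auction), not the authors' method. The function name `simulate_pacing`, the bid rule `value / (1 + lam)`, the step size, and the second-price payment rule are all assumptions made for illustration.

```python
import random


def simulate_pacing(values, rival_bids, budget, step=0.05):
    """Simulate second-price auctions for one multiplier-paced bidder.

    Hypothetical sketch: the bid is value / (1 + lam). The multiplier lam
    rises when spend exceeds the per-auction target and falls otherwise,
    but never drops below zero.
    """
    target = budget / len(values)  # per-auction spend target
    lam, spent, wins, surplus = 0.0, 0.0, 0, 0.0
    for value, rival in zip(values, rival_bids):
        # Shade the bid by the multiplier and never bid past the remaining budget.
        bid = min(value / (1.0 + lam), budget - spent)
        cost = 0.0
        if bid > rival:  # win and pay the rival's bid (second price)
            cost = rival
            wins += 1
            surplus += value - rival
        spent += cost
        # Subgradient step on the budget constraint, projected onto lam >= 0.
        lam = max(0.0, lam + step * (cost - target))
    return {"wins": wins, "spent": spent, "surplus": surplus, "lam": lam}


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    values = [rng.random() for _ in range(1000)]
    rivals = [rng.random() for _ in range(1000)]
    print(simulate_pacing(values, rivals, budget=50.0))
```

In this sketch a baseline agent corresponds to keeping `lam` fixed at zero, which amounts to bidding the full value until the budget runs out.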

Related papers

Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection [71.92083784393418]
Inference-time methods such as Best-of-N (BON) sampling offer a simple yet effective alternative to improve performance. We propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
arXiv Detail & Related papers (2025-04-02T17:40:47Z)
Procurement Auctions via Approximately Optimal Submodular Optimization [53.93943270902349]
We study procurement auctions, where an auctioneer seeks to acquire services from strategic sellers with private costs. Our goal is to design computationally efficient auctions that maximize the difference between the quality of the acquired services and the total cost of the sellers.
arXiv Detail & Related papers (2024-11-20T18:06:55Z)
Fair Allocation in Dynamic Mechanism Design [57.66441610380448]
We consider a problem where an auctioneer sells an indivisible good to groups of buyers in every round, for a total of $T$ rounds. The auctioneer aims to maximize their discounted overall revenue while adhering to a fairness constraint that guarantees a minimum average allocation for each group.
arXiv Detail & Related papers (2024-05-31T19:26:05Z)
Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process. We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals. The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
arXiv Detail & Related papers (2024-05-01T11:10:24Z)
Maximizing the Success Probability of Policy Allocations in Online Systems [5.485872703839928]
In this paper we consider the problem at the level of user timelines instead of individual bid requests. In order to optimally allocate policies to users, typical multiple treatments allocation methods solve knapsack-like problems. We introduce the SuccessProMax algorithm that aims at finding the policy allocation which is the most likely to outperform a fixed policy.
arXiv Detail & Related papers (2023-12-26T10:55:33Z)
DeepHive: A multi-agent reinforcement learning approach for automated discovery of swarm-based optimization policies [0.0]
The state of each agent within the swarm is defined as its current position and function value within a design space. The proposed approach is tested on various benchmark optimization functions and compared to the performance of other global optimization strategies.
arXiv Detail & Related papers (2023-03-29T18:08:08Z)
Non-Myopic Multifidelity Bayesian Optimization [0.0]
This paper proposes a non-myopic multifidelity Bayesian framework to grasp the long-term reward from future steps of the optimization. We demonstrate that the proposed algorithm outperforms a standard multifidelity Bayesian framework on popular benchmark optimization problems.
arXiv Detail & Related papers (2022-07-13T16:25:35Z)
A Unified Framework for Campaign Performance Forecasting in Online Display Advertising [9.005665883444902]
Interpretable and accurate results could enable advertisers to manage and optimize their campaign criteria. The new framework reproduces campaign performance on historical logs under various bidding types with a unified replay algorithm. The method captures mixture calibration patterns among related forecast indicators to map the estimated results to the true ones.
arXiv Detail & Related papers (2022-02-24T03:04:29Z)
Bid Optimization using Maximum Entropy Reinforcement Learning [0.3149883354098941]
This paper focuses on optimizing a single advertiser's bidding strategy using reinforcement learning (RL) in real-time bidding (RTB). We first utilize a widely accepted linear bidding function to compute every impression's base price and optimize it by a mutable adjustment factor derived from the RTB auction environment. Finally, the empirical study on a public dataset demonstrates that the proposed bidding strategy has superior performance compared with the baselines.
arXiv Detail & Related papers (2021-10-11T06:53:53Z)
A Cooperative-Competitive Multi-Agent Framework for Auto-bidding in Online Advertising [53.636153252400945]
We propose a general Multi-Agent reinforcement learning framework for Auto-Bidding, namely MAAB, to learn the auto-bidding strategies. Our approach outperforms several baseline methods in terms of social welfare and guarantees the ad platform's revenue.
arXiv Detail & Related papers (2021-06-11T08:07:14Z)
Are we Forgetting about Compositional Optimisers in Bayesian Optimisation? [66.39551991177542]
This paper presents a sample-efficient methodology for global optimisation. Within this, a crucial performance-determining subroutine is maximising the acquisition function. We highlight the empirical advantages of the compositional approach to acquisition function optimisation across 3958 individual experiments.
arXiv Detail & Related papers (2020-12-15T12:18:38Z)
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising [52.3825928886714]
We formulate the sequential advertising strategy optimization as a dynamic knapsack problem. We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization space. To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach.
arXiv Detail & Related papers (2020-06-29T18:50:35Z)

This list is automatically generated from the titles and abstracts of the papers on this site.