Advertising Media and Target Audience Optimization via High-dimensional
Bandits
- URL: http://arxiv.org/abs/2209.08403v1
- Date: Sat, 17 Sep 2022 21:00:53 GMT
- Title: Advertising Media and Target Audience Optimization via High-dimensional
Bandits
- Authors: Wenjia Ba, J. Michael Harrison, Harikesh S. Nair
- Abstract summary: We present a data-driven algorithm that advertisers can use to automate their digital ad-campaigns at online publishers.
The algorithm enables the advertiser to search across available target audiences and ad-media to find the best possible combination for its campaign via online experimentation.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a data-driven algorithm that advertisers can use to automate their
digital ad-campaigns at online publishers. The algorithm enables the advertiser
to search across available target audiences and ad-media to find the best
possible combination for its campaign via online experimentation. The problem
of finding the best audience-ad combination is complicated by a number of
distinctive challenges, including (a) a need for active exploration to resolve
prior uncertainty and to speed the search for profitable combinations, (b) many
combinations to choose from, giving rise to high-dimensional search
formulations, and (c) very low success probabilities, typically just a fraction
of one percent. Our algorithm (designated LRDL, an acronym for Logistic
Regression with Debiased Lasso) addresses these challenges by combining four
elements: a multiarmed bandit framework for active exploration; a Lasso penalty
function to handle high dimensionality; an inbuilt debiasing kernel that
handles the regularization bias induced by the Lasso; and a semi-parametric
regression model for outcomes that promotes cross-learning across arms. The
algorithm is implemented as a Thompson Sampler, and to the best of our
knowledge, it is the first that can practically address all of the challenges
above. Simulations with real and synthetic data show the method is effective
and document its superior performance against several benchmarks from the
recent high-dimensional bandit literature.
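The four ingredients above can be illustrated with a deliberately simplified sketch. This is not the authors' LRDL: the debiasing kernel and the semi-parametric outcome model are omitted, and the posterior draw is replaced by a crude Gaussian perturbation of the Lasso fit. The arm features, sparsity pattern, and all constants below are invented for illustration.

```python
import numpy as np

def l1_logistic(X, y, lam=0.05, lr=0.1, iters=300):
    """L1-penalised logistic regression fit by proximal gradient (ISTA)."""
    n, d = X.shape
    beta = np.zeros(d)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        beta -= lr * (X.T @ (p - y) / n)  # gradient step on the logistic loss
        # soft-thresholding: the proximal step for the Lasso penalty
        beta = np.sign(beta) * np.maximum(np.abs(beta) - lr * lam, 0.0)
    return beta

rng = np.random.default_rng(0)
d, n_arms = 15, 40
arms = rng.normal(size=(n_arms, d))     # audience-ad combination features
true_beta = np.zeros(d)
true_beta[:3] = [1.5, -2.0, 1.0]        # sparse ground truth (illustrative)

X_hist, y_hist = [], []
for t in range(200):
    if t < 20:                          # cold start: uniform exploration
        a = rng.integers(n_arms)
    else:
        beta_hat = l1_logistic(np.array(X_hist), np.array(y_hist))
        beta_draw = beta_hat + 0.3 * rng.normal(size=d)  # crude "posterior" draw
        a = int(np.argmax(arms @ beta_draw))             # Thompson-style choice
    p_true = 1.0 / (1.0 + np.exp(-arms[a] @ true_beta))
    X_hist.append(arms[a])
    y_hist.append(rng.binomial(1, p_true))
```

The perturbation scale 0.3 is a placeholder for a proper posterior; the paper's debiased-Lasso construction is what makes such draws statistically justified.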
Related papers
- Learning to Rank for Multiple Retrieval-Augmented Models through Iterative Utility Maximization [21.115495457454365]
This paper investigates the design of a unified search engine to serve multiple retrieval-augmented generation (RAG) agents.
We introduce an iterative approach where the search engine generates retrieval results for these RAG agents and gathers feedback on the quality of the retrieved documents during an offline phase.
We adapt this approach to an online setting, allowing the search engine to refine its behavior based on real-time feedback from individual agents.
arXiv Detail & Related papers (2024-10-13T17:53:50Z) - LOLA: LLM-Assisted Online Learning Algorithm for Content Experiments [2.2021543101231167]
This paper introduces the LLM-Assisted Online Learning Algorithm (LOLA)
LOLA integrates Large Language Models (LLMs) with adaptive experimentation to optimize content delivery.
Our numerical experiments on Upworthy data show LOLA outperforms the standard A/B test method.
arXiv Detail & Related papers (2024-06-03T07:56:58Z) - Best Arm Identification with Fixed Budget: A Large Deviation Perspective [54.305323903582845]
We present sred, a truly adaptive algorithm that can reject arms in any round based on the observed empirical gaps between the rewards of the various arms.
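A generic fixed-budget rejection rule of this flavour can be sketched as follows. This is not the paper's sred algorithm; the gap threshold, the minimum-pull warm-up, and the Bernoulli reward model are illustrative assumptions.

```python
import numpy as np

def reject_by_gaps(means_true, budget=3000, thresh=0.3, min_pulls=30, seed=0):
    """Fixed-budget best-arm search: pull every surviving arm round-robin,
    and reject any arm whose empirical gap to the leader exceeds `thresh`."""
    rng = np.random.default_rng(seed)
    k = len(means_true)
    alive = list(range(k))
    sums, counts = np.zeros(k), np.zeros(k)
    pulls = 0
    while pulls < budget and len(alive) > 1:
        for a in list(alive):
            sums[a] += rng.binomial(1, means_true[a])  # simulated reward
            counts[a] += 1
            pulls += 1
        emp = {a: sums[a] / counts[a] for a in alive}
        leader = max(emp.values())
        if min(counts[a] for a in alive) >= min_pulls:  # warm-up guard
            alive = [a for a in alive if leader - emp[a] <= thresh]
    return max(alive, key=lambda a: sums[a] / counts[a])
```

The leader always has gap zero, so the surviving set can never become empty; rejection can happen in any round once the warm-up is over, which is the "truly adaptive" property the summary refers to.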
arXiv Detail & Related papers (2023-12-19T13:17:43Z) - Large-scale Fully-Unsupervised Re-Identification [78.47108158030213]
We propose two strategies to learn from large-scale unlabeled data.
The first strategy performs local neighborhood sampling to reduce the dataset size in each iteration without violating neighborhood relationships.
A second strategy leverages a novel Re-Ranking technique with a lower upper bound on time complexity, reducing the memory complexity from O(n^2) to O(kn) with k ≪ n.
arXiv Detail & Related papers (2023-07-26T16:19:19Z) - Factorization of Multi-Agent Sampling-Based Motion Planning [72.42734061131569]
Modern robotics often involves multiple embodied agents operating within a shared environment.
Standard sampling-based algorithms can be used to search for solutions in the robots' joint space.
We integrate the concept of factorization into sampling-based algorithms, which requires only minimal modifications to existing methods.
We present a general implementation of a factorized SBA, derive an analytical gain in terms of sample complexity for PRM*, and showcase empirical results for RRG.
arXiv Detail & Related papers (2023-04-01T15:50:18Z) - Incentivizing Combinatorial Bandit Exploration [87.08827496301839]
Consider a bandit algorithm that recommends actions to self-interested users in a recommendation system.
Users are free to choose other actions and need to be incentivized to follow the algorithm's recommendations.
While the users prefer to exploit, the algorithm can incentivize them to explore by leveraging the information collected from the previous users.
arXiv Detail & Related papers (2022-06-01T13:46:25Z) - Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity regularized models and apply it on shared-memory and distributed-memory architecture respectively.
We prove that the proposed method achieves the linear convergence rate with lower overall complexity and can eliminate almost all the inactive features in a finite number of iterations almost surely.
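As a point of reference, a minimal screening heuristic in the spirit of strong rules can be written in a few lines. This is not the paper's DDSS method, which is distributed, dynamic, and comes with safe-elimination guarantees; the rule below is a plain single-machine, non-safe filter at beta = 0.

```python
import numpy as np

def screen_features(X, y, lam):
    """Keep only features whose absolute correlation with the response at
    beta = 0 reaches the Lasso threshold `lam`; the rest are presumed
    inactive and dropped before fitting (heuristic, not a safe rule)."""
    corr = np.abs(X.T @ y) / X.shape[0]
    return np.flatnonzero(corr >= lam)
```

Safe rules (and DDSS) certify that discarded features have zero coefficient at the optimum, which is why they can be applied dynamically during the optimization without changing the solution.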
arXiv Detail & Related papers (2022-04-23T02:45:55Z) - Approximate Nearest Neighbor Search under Neural Similarity Metric for
Large-Scale Recommendation [20.42993976179691]
We propose a novel method to extend ANN search to arbitrary matching functions.
Our main idea is to perform a greedy walk with a matching function in a similarity graph constructed from all items.
The proposed method has been fully deployed in the Taobao display advertising platform and brings a considerable advertising revenue increase.
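The greedy-walk idea can be sketched generically. The graph, scoring function, and stopping rule below are illustrative stand-ins, not the deployed Taobao system; in practice the score would be an arbitrary learned matching function evaluated on the fly.

```python
def greedy_walk(start, neighbors, score, max_steps=100):
    """Greedy best-first walk on a similarity graph: repeatedly hop to the
    neighbour with the highest matching score; stop at a local optimum."""
    cur = start
    for _ in range(max_steps):
        best = max(neighbors[cur], key=score, default=cur)
        if score(best) <= score(cur):
            return cur          # no neighbour improves: local optimum
        cur = best
    return cur

# Toy usage: items on a line, neighbours are adjacent indices, and the
# "matching function" prefers items close to a target item (here, 7).
items = list(range(10))
nbrs = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in items}
best = greedy_walk(0, nbrs, lambda i: -abs(i - 7))
```

Because only the visited nodes' scores are evaluated, the walk sidesteps scoring every item, which is what makes arbitrary (non-inner-product) matching functions tractable at retrieval time.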
arXiv Detail & Related papers (2022-02-14T07:55:57Z) - MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven
Reinforcement Learning [65.52675802289775]
We show that an uncertainty-aware classifier can solve challenging reinforcement learning problems.
We propose a novel method for computing the normalized maximum likelihood (NML) distribution.
We show that the resulting algorithm has a number of intriguing connections to both count-based exploration methods and prior algorithms for learning reward functions.
arXiv Detail & Related papers (2021-07-15T08:19:57Z) - Doubly-Adaptive Thompson Sampling for Multi-Armed and Contextual Bandits [28.504921333436833]
We propose a variant of a Thompson sampling based algorithm that adaptively re-weighs the terms of a doubly robust estimator on the true mean reward of each arm.
The proposed algorithm matches the optimal (minimax) regret rate, and we evaluate it empirically in a semi-synthetic experiment.
We extend this approach to contextual bandits, where there are more sources of bias present apart from the adaptive data collection.
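The doubly robust idea behind that summary can be illustrated in a few lines. This is a bare-bones sketch, not the paper's adaptively re-weighted estimator; `model_mean` stands in for a fitted outcome model, and the propensities are assumed known.

```python
import numpy as np

def dr_arm_mean(rewards, actions, probs, model_mean, arm):
    """Doubly robust estimate of an arm's mean reward from adaptively
    collected bandit data: a plug-in model prediction plus an
    inverse-propensity correction on rounds where the arm was played."""
    rewards = np.asarray(rewards, dtype=float)
    actions = np.asarray(actions)
    probs = np.asarray(probs, dtype=float)
    correction = (actions == arm) / probs * (rewards - model_mean)
    return model_mean + correction.mean()
```

The estimate is unbiased if either the outcome model or the propensities are correct, which is the robustness the Thompson-sampling variant exploits when it re-weights the two terms.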
arXiv Detail & Related papers (2021-02-25T22:29:25Z) - Optimizing Offer Sets in Sub-Linear Time [5.027714423258537]
We propose an algorithm for personalized offer set optimization that runs in time sub-linear in the number of items.
Our algorithm can be entirely data-driven, relying on samples of the user, where a 'sample' refers to the user interaction data typically collected by firms.
arXiv Detail & Related papers (2020-11-17T13:02:56Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.