Ad-load Balancing via Off-policy Learning in a Content Marketplace
- URL: http://arxiv.org/abs/2309.11518v2
- Date: Tue, 19 Dec 2023 07:40:45 GMT
- Title: Ad-load Balancing via Off-policy Learning in a Content Marketplace
- Authors: Hitesh Sagtani, Madan Jhawar, Rishabh Mehrotra, Olivier Jeunen
- Abstract summary: Ad-load balancing is a critical challenge in online advertising systems, particularly in the context of social media platforms.
Traditional approaches to ad-load balancing rely on static allocation policies, which fail to adapt to changing user preferences and contextual factors.
We present an approach that leverages off-policy learning and evaluation from logged bandit feedback.
- Score: 9.783697404304025
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Ad-load balancing is a critical challenge in online advertising systems,
particularly in the context of social media platforms, where the goal is to
maximize user engagement and revenue while maintaining a satisfactory user
experience. This requires the optimization of conflicting objectives, such as
user satisfaction and ads revenue. Traditional approaches to ad-load balancing
rely on static allocation policies, which fail to adapt to changing user
preferences and contextual factors. In this paper, we present an approach that
leverages off-policy learning and evaluation from logged bandit feedback. We
start by presenting a motivating analysis of the ad-load balancing problem,
highlighting the conflicting objectives between user satisfaction and ads
revenue. We emphasize the nuances that arise due to user heterogeneity and the
dependence on the user's position within a session. Based on this analysis, we
define the problem as determining the optimal ad-load for a particular feed
fetch. To tackle this problem, we propose an off-policy learning framework that
leverages unbiased estimators such as Inverse Propensity Scoring (IPS) and
Doubly Robust (DR) to learn and estimate the policy values using offline
collected stochastic data. We present insights from online A/B experiments
deployed at scale across over 80 million users generating over 200 million
sessions, where we find statistically significant improvements in both user
satisfaction metrics and ads revenue for the platform.
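As a rough illustration of the two estimators named in the abstract, the sketch below computes IPS and DR value estimates on simulated logged bandit feedback. Everything in it is an assumption made for illustration and is not taken from the paper: the toy environment (`TRUE_REWARD`), the two-bucket context, the two ad-load levels, the uniform logging policy, the target policy, and the deliberately crude `crude_reward_model`. It only shows the general shape of the estimators, not the authors' implementation.

```python
import random

# Hypothetical toy environment (not from the paper): context x is a coarse
# session-position bucket, action a is an ad-load level, and the reward is a
# Bernoulli draw with the mean given here.
TRUE_REWARD = {(0, 0): 0.2, (0, 1): 0.6, (1, 0): 0.5, (1, 1): 0.3}
N_CONTEXTS, N_ACTIONS = 2, 2


def logging_policy(x):
    """Stochastic logging policy: uniform over ad-load levels."""
    return [0.5, 0.5]


def target_policy(x):
    """Policy to evaluate: mostly picks the better action per context."""
    return [0.1, 0.9] if x == 0 else [0.9, 0.1]


def crude_reward_model(x, a):
    """Deliberately poor reward model, to show DR stays usable when the
    logging propensities are known exactly."""
    return 0.4


def simulate_logs(n, seed=0):
    """Collect (context, action, reward, logging propensity) tuples."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        x = rng.randrange(N_CONTEXTS)
        probs = logging_policy(x)
        a = rng.choices(range(N_ACTIONS), weights=probs)[0]
        r = 1.0 if rng.random() < TRUE_REWARD[(x, a)] else 0.0
        logs.append((x, a, r, probs[a]))
    return logs


def true_value(policy):
    """Exact policy value in the toy environment (contexts are uniform)."""
    return sum(
        policy(x)[a] * TRUE_REWARD[(x, a)]
        for x in range(N_CONTEXTS)
        for a in range(N_ACTIONS)
    ) / N_CONTEXTS


def ips_value(logs, policy):
    """IPS: mean of rewards reweighted by target/logging propensity."""
    return sum(policy(x)[a] / p0 * r for x, a, r, p0 in logs) / len(logs)


def dr_value(logs, policy, reward_model):
    """DR: model-based baseline plus an IPS-weighted residual correction."""
    total = 0.0
    for x, a, r, p0 in logs:
        pi = policy(x)
        baseline = sum(pi[b] * reward_model(x, b) for b in range(N_ACTIONS))
        total += baseline + pi[a] / p0 * (r - reward_model(x, a))
    return total / len(logs)


if __name__ == "__main__":
    logs = simulate_logs(20000, seed=1)
    print("true:", true_value(target_policy))
    print("IPS :", ips_value(logs, target_policy))
    print("DR  :", dr_value(logs, target_policy, crude_reward_model))
```

With known logging propensities, both estimates should land close to the true value in this toy setting. In practice DR tends to have lower variance than IPS when the reward model is reasonably accurate.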
Related papers
- Unveiling User Satisfaction and Creator Productivity Trade-Offs in Recommendation Platforms [68.51708490104687]
We show that a purely relevance-driven policy with low exploration strength boosts short-term user satisfaction but undermines the long-term richness of the content pool.
Our findings reveal a fundamental trade-off between immediate user satisfaction and overall content production on platforms.
arXiv Detail & Related papers (2024-10-31T07:19:22Z) - MisinfoEval: Generative AI in the Era of "Alternative Facts" [50.069577397751175]
We introduce a framework for generating and evaluating large language model (LLM) based misinformation interventions.
We present (1) an experiment with a simulated social media environment to measure effectiveness of misinformation interventions, and (2) a second experiment with personalized explanations tailored to the demographics and beliefs of users.
Our findings confirm that LLM-based interventions are highly effective at correcting user behavior.
arXiv Detail & Related papers (2024-10-13T18:16:50Z) - Modeling User Retention through Generative Flow Networks [34.74982897470852]
A flow-based modeling technique can back-propagate the retention reward towards each recommended item in the user session.
We show that the flow, combined with traditional learning-to-rank objectives, eventually optimizes a non-discounted cumulative reward for both immediate user feedback and user retention.
arXiv Detail & Related papers (2024-06-10T06:22:18Z) - User Welfare Optimization in Recommender Systems with Competing Content Creators [65.25721571688369]
In this study, we perform system-side user welfare optimization under a competitive game setting among content creators.
We propose an algorithmic solution for the platform, which dynamically computes a sequence of weights for each user based on their satisfaction with the recommended content.
These weights are then utilized to design mechanisms that adjust the recommendation policy or the post-recommendation rewards, thereby influencing creators' content production strategies.
arXiv Detail & Related papers (2024-04-28T21:09:52Z) - Collaborative-Enhanced Prediction of Spending on Newly Downloaded Mobile Games under Consumption Uncertainty [49.431361908465036]
We propose a robust model training and evaluation framework to mitigate label variance and extremes.
Within this framework, we introduce a collaborative-enhanced model designed to predict user game spending without relying on user IDs.
Our approach demonstrates notable improvements over production models, achieving a remarkable 17.11% enhancement on offline data.
arXiv Detail & Related papers (2024-04-12T07:47:02Z) - Maximizing the Success Probability of Policy Allocations in Online Systems [5.485872703839928]
In this paper we consider the problem at the level of user timelines instead of individual bid requests.
In order to optimally allocate policies to users, typical multiple treatments allocation methods solve knapsack-like problems.
We introduce the SuccessProMax algorithm that aims at finding the policy allocation which is the most likely to outperform a fixed policy.
arXiv Detail & Related papers (2023-12-26T10:55:33Z) - Online Ad Procurement in Non-stationary Autobidding Worlds [10.871587311621974]
We introduce a primal-dual algorithm for online decision making with multi-dimension decision variables, bandit feedback and long-term uncertain constraints.
We show that our algorithm achieves low regret in many worlds when procurement outcomes are generated through procedures that are adversarial, adversarially corrupted, periodic, and ergodic.
arXiv Detail & Related papers (2023-07-10T00:41:08Z) - Targeted Advertising on Social Networks Using Online Variational Tensor Regression [19.586412285513962]
We propose what we believe is the first contextual bandit framework for online targeted advertising.
The proposed framework is designed to accommodate any number of feature vectors in the form of multi-mode tensor.
We empirically confirm that the proposed UCB algorithm achieves a significant improvement in influence tasks over the benchmarks.
arXiv Detail & Related papers (2022-08-22T22:10:45Z) - Adversarial Learning for Incentive Optimization in Mobile Payment Marketing [17.645000197183045]
Payment platforms hold large-scale marketing campaigns, which allocate incentives to encourage users to pay through their applications.
To maximize the return on investment, incentive allocations are commonly solved in a two-stage procedure.
We propose a bias correction adversarial network to overcome this obstacle.
arXiv Detail & Related papers (2021-12-28T07:54:39Z) - Personalized multi-faceted trust modeling to determine trust links in social media and its potential for misinformation management [61.88858330222619]
We present an approach for predicting trust links between peers in social media.
We propose a data-driven multi-faceted trust modeling which incorporates many distinct features for a comprehensive analysis.
Illustrated in a trust-aware item recommendation task, we evaluate the proposed framework in the context of a large Yelp dataset.
arXiv Detail & Related papers (2021-11-11T19:40:51Z) - Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising [52.3825928886714]
We formulate the sequential advertising strategy optimization as a dynamic knapsack problem.
We propose a theoretically guaranteed bilevel optimization framework, which significantly reduces the solution space of the original optimization problem.
To improve the exploration efficiency of reinforcement learning, we also devise an effective action space reduction approach.
arXiv Detail & Related papers (2020-06-29T18:50:35Z)