On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models
- URL: http://arxiv.org/abs/2209.05310v1
- Date: Mon, 12 Sep 2022 15:15:23 GMT
- Title: On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models
- Authors: Rohan Anil, Sandra Gadanho, Da Huang, Nijith Jacob, Zhuoshu Li, Dong Lin, Todd Phillips, Cristina Pop, Kevin Regan, Gil I. Shamir, Rakesh Shivanna, Qiqi Yan
- Abstract summary: For industrial-scale advertising systems, prediction of ad click-through rate (CTR) is a central problem.
We present a case study of practical techniques deployed in Google's search ads CTR model.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For industrial-scale advertising systems, prediction of ad click-through rate
(CTR) is a central problem. Ad clicks constitute a significant class of user
engagements and are often used as the primary signal for the usefulness of ads
to users. Additionally, in cost-per-click advertising systems where advertisers
are charged per click, click rate expectations feed directly into value
estimation. Accordingly, CTR model development is a significant investment for
most Internet advertising companies. Engineering for such problems requires
many machine learning (ML) techniques suited to online learning that go well
beyond traditional accuracy improvements, especially concerning efficiency,
reproducibility, calibration, and credit attribution. We present a case study of
practical techniques deployed in Google's search ads CTR model. This paper
provides an industry case study highlighting important areas of current ML
research and illustrating how impactful new ML methods are evaluated and made
useful in a large-scale industrial setting.
Related papers
- Learning Fair Ranking Policies via Differentiable Optimization of Ordered Weighted Averages [55.04219793298687]
This paper shows how efficiently-solvable fair ranking models can be integrated into the training loop of Learning to Rank.
In particular, this paper is the first to show how to backpropagate through constrained optimizations of OWA objectives, enabling their use in integrated prediction and decision models.
(2024-02-07T20:53:53Z)
- Improving conversion rate prediction via self-supervised pre-training in online advertising [2.447795279790662]
A key challenge in training models that predict conversions-given-clicks comes from data sparsity.
We use the well-known idea of self-supervised pre-training, and use an auxiliary auto-encoder model trained on all conversion events.
We show improvements both offline, during training, and in an online A/B test.
(2024-01-25T08:44:22Z)
- Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective [61.4025671743675]
Off-policy learning to rank methods often make strong assumptions about how users generate the click data.
We show that offline reinforcement learning can adapt to various click models without complex debiasing techniques and prior knowledge of the model.
Results on various large-scale datasets demonstrate that CUOLR consistently outperforms the state-of-the-art off-policy learning to rank algorithms.
(2023-06-13T03:46:22Z)
- Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions.
Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and non-machine learning part.
We show in a case study for the industrial use case of price forecasting, that domain knowledge combined with AutoML can weaken the dependence on ML experts.
(2023-04-28T10:27:38Z)
- Practical Lessons on Optimizing Sponsored Products in eCommerce [6.245623148893172]
We study multiple problems in sponsored product optimization in ad systems, including position-based de-biasing, click-conversion multi-task learning, and calibration on predicted click-through rate (pCTR).
We propose a practical machine learning framework that provides the solutions without structural change to existing machine learning models.
(2023-04-05T21:46:20Z)
- Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries [53.222218035435006]
We use adversarial tools to optimize for queries that are discriminative and diverse.
Our improvements achieve significantly more accurate membership inference than existing methods.
(2022-10-19T17:46:50Z)
- Rethinking Position Bias Modeling with Knowledge Distillation for CTR Prediction [8.414183573280779]
This work proposes a knowledge distillation framework to alleviate the impact of position bias and leverage position information to improve CTR prediction.
The proposed method has been deployed in real-world online ad systems, serving main traffic on one of the world's largest e-commerce platforms.
(2022-04-01T07:58:38Z)
- Click-Through Rate Prediction in Online Advertising: A Literature Review [0.0]
We make a systematic literature review on state-of-the-art and latest CTR prediction research.
We give a classification of state-of-the-art CTR prediction models in the extant literature.
We identify current research trends, main challenges and potential future directions worthy of further explorations.
(2022-02-22T01:05:38Z)
- A First Look at Class Incremental Learning in Deep Learning Mobile Traffic Classification [68.11005070665364]
We explore Incremental Learning (IL) techniques to add new classes to models without a full retraining, hence speeding up the model update cycle.
We consider iCarl, a state of the art IL method, and MIRAGE-2019, a public dataset with traffic from 40 Android apps.
Although our analysis reveals their infancy, IL techniques are a promising research area on the roadmap towards automated DL-based traffic analysis systems.
(2021-07-09T14:28:16Z)
- Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time.
The results show that off-the-shelf AutoML tools can provide satisfactory results but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain the predictive accuracy over time.
(2021-06-14T11:42:46Z)
- Learning Graph Meta Embeddings for Cold-Start Ads in Click-Through Rate Prediction [14.709092114902159]
We propose Graph Meta Embedding (GME) models that can rapidly learn how to generate desirable initial embeddings for new ad IDs.
Experimental results on three real-world datasets show that GMEs can significantly improve the prediction performance in both cold-start and warm-up.
(2021-05-19T03:46:56Z)