Related papers: On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models

URL: http://arxiv.org/abs/2209.05310v1
Date: Mon, 12 Sep 2022 15:15:23 GMT
Title: On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models
Authors: Rohan Anil, Sandra Gadanho, Da Huang, Nijith Jacob, Zhuoshu Li, Dong Lin, Todd Phillips, Cristina Pop, Kevin Regan, Gil I. Shamir, Rakesh Shivanna, Qiqi Yan
Abstract summary: For industrial-scale advertising systems, prediction of ad click-through rate (CTR) is a central problem. We present a case study of practical techniques deployed in Google's search ads CTR model.
Score: 9.102290972714652
License: http://creativecommons.org/licenses/by/4.0/
Abstract: For industrial-scale advertising systems, prediction of ad click-through rate (CTR) is a central problem. Ad clicks constitute a significant class of user engagements and are often used as the primary signal for the usefulness of ads to users. Additionally, in cost-per-click advertising systems where advertisers are charged per click, click rate expectations feed directly into value estimation. Accordingly, CTR model development is a significant investment for most Internet advertising companies. Engineering for such problems requires many machine learning (ML) techniques suited to online learning that go well beyond traditional accuracy improvements, especially concerning efficiency, reproducibility, calibration, credit attribution. We present a case study of practical techniques deployed in Google's search ads CTR model. This paper provides an industry case study highlighting important areas of current ML research and illustrating how impactful new ML methods are evaluated and made useful in a large-scale industrial setting.

Related papers

CiteFix: Enhancing RAG Accuracy Through Post-Processing Citation Correction [0.2548904650574671]
Retrieval Augmented Generation (RAG) has emerged as a powerful application of Large Language Models (LLMs). This research contributes to enhancing the reliability and trustworthiness of AI-generated content in information retrieval and summarization tasks.
arXiv Detail & Related papers (2025-04-22T06:41:25Z)
CTR-Driven Advertising Image Generation with Multimodal Large Language Models [53.40005544344148]
We explore the use of Multimodal Large Language Models (MLLMs) for generating advertising images by optimizing for Click-Through Rate (CTR) as the primary objective. To further improve the CTR of generated images, we propose a novel reward model to fine-tune pre-trained MLLMs through Reinforcement Learning (RL). Our method achieves state-of-the-art performance in both online and offline metrics.
arXiv Detail & Related papers (2025-02-05T09:06:02Z)
Efficient Transfer Learning Framework for Cross-Domain Click-Through Rate Prediction [47.7066461216227]
The Efficient Transfer Learning Framework for Cross-Domain Click-Through Rate Prediction (E-CDCTR) has three key components: a Tiny Pre-training Model (TPM), a Complete Pre-training Model (CPM), and an Advertisement CTR model (A-CTR). TPM provides richer representations of user and item for both the CPM and A-CTR, effectively alleviating the problem inherent in the daily updates.
arXiv Detail & Related papers (2024-08-29T03:34:39Z)
Improving conversion rate prediction via self-supervised pre-training in online advertising [2.447795279790662]
A key challenge in training models that predict conversions-given-clicks comes from data sparsity. We use the well-known idea of self-supervised pre-training, and use an auxiliary auto-encoder model trained on all conversion events. We show improvements both offline, during training, and in an online A/B test.
arXiv Detail & Related papers (2024-01-25T08:44:22Z)
Unified Off-Policy Learning to Rank: a Reinforcement Learning Perspective [61.4025671743675]
Off-policy learning to rank methods often make strong assumptions about how users generate the click data. We show that offline reinforcement learning can adapt to various click models without complex debiasing techniques and prior knowledge of the model. Results on various large-scale datasets demonstrate that CUOLR consistently outperforms the state-of-the-art off-policy learning to rank algorithms.
arXiv Detail & Related papers (2023-06-13T03:46:22Z)
Benchmarking Automated Machine Learning Methods for Price Forecasting Applications [58.720142291102135]
We show the possibility of substituting manually created ML pipelines with automated machine learning (AutoML) solutions. Based on the CRISP-DM process, we split the manual ML pipeline into a machine learning and a non-machine learning part. We show in a case study for the industrial use case of price forecasting that domain knowledge combined with AutoML can weaken the dependence on ML experts.
arXiv Detail & Related papers (2023-04-28T10:27:38Z)
Practical Lessons on Optimizing Sponsored Products in eCommerce [6.245623148893172]
We study multiple problems from sponsored product optimization in ad systems, including position-based de-biasing, click-conversion multi-task learning, and calibration on predicted click-through-rate (pCTR). We propose a practical machine learning framework that provides the solutions without structural change to existing machine learning models.
arXiv Detail & Related papers (2023-04-05T21:46:20Z)
Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries [53.222218035435006]
We use adversarial tools to optimize for queries that are discriminative and diverse. Our improvements achieve significantly more accurate membership inference than existing methods.
arXiv Detail & Related papers (2022-10-19T17:46:50Z)
Rethinking Position Bias Modeling with Knowledge Distillation for CTR Prediction [8.414183573280779]
This work proposes a knowledge distillation framework to alleviate the impact of position bias and leverage position information to improve CTR prediction. The proposed method has been deployed in real-world online ads systems, serving main traffic on one of the world's largest e-commerce platforms.
arXiv Detail & Related papers (2022-04-01T07:58:38Z)
Click-Through Rate Prediction in Online Advertising: A Literature Review [0.0]
We make a systematic literature review on state-of-the-art and latest CTR prediction research. We give a classification of state-of-the-art CTR prediction models in the extant literature. We identify current research trends, main challenges and potential future directions worthy of further explorations.
arXiv Detail & Related papers (2022-02-22T01:05:38Z)
A First Look at Class Incremental Learning in Deep Learning Mobile Traffic Classification [68.11005070665364]
We explore Incremental Learning (IL) techniques to add new classes to models without a full retraining, hence speeding up the model's update cycle. We consider iCarl, a state-of-the-art IL method, and MIRAGE-2019, a public dataset with traffic from 40 Android apps. Although our analysis reveals their infancy, IL techniques are a promising research area on the roadmap towards automated DL-based traffic analysis systems.
arXiv Detail & Related papers (2021-07-09T14:28:16Z)
Automated Machine Learning Techniques for Data Streams [91.3755431537592]
This paper surveys the state-of-the-art open-source AutoML tools, applies them to data collected from streams, and measures how their performance changes over time. The results show that off-the-shelf AutoML tools can provide satisfactory results, but in the presence of concept drift, detection or adaptation techniques have to be applied to maintain the predictive accuracy over time.
arXiv Detail & Related papers (2021-06-14T11:42:46Z)
Learning Graph Meta Embeddings for Cold-Start Ads in Click-Through Rate Prediction [14.709092114902159]
We propose Graph Meta Embedding (GME) models that can rapidly learn how to generate desirable initial embeddings for new ad IDs. Experimental results on three real-world datasets show that GMEs can significantly improve the prediction performance in both cold-start and warm-up.
arXiv Detail & Related papers (2021-05-19T03:46:56Z)

This list is automatically generated from the titles and abstracts of the papers on this site.