Improving conversion rate prediction via self-supervised pre-training in online advertising
- URL: http://arxiv.org/abs/2401.16432v1
- Date: Thu, 25 Jan 2024 08:44:22 GMT
- Title: Improving conversion rate prediction via self-supervised pre-training in online advertising
- Authors: Alex Shtoff, Yohay Kaplan, Ariel Raviv
- Abstract summary: A key challenge in training models that predict conversions-given-clicks comes from data sparsity.
We apply the well-known idea of self-supervised pre-training, using an auxiliary auto-encoder model trained on all conversion events.
We show improvements both offline, during training, and in an online A/B test.
- Score: 2.447795279790662
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The task of predicting conversion rates (CVR) lies at the heart of online
advertising systems aiming to optimize bids to meet advertiser performance
requirements. Even with the recent rise of deep neural networks, these
predictions are often made by factorization machines (FM), especially in
commercial settings where inference latency is key. These models are trained
using the logistic regression framework on labeled tabular data formed from
past user activity that is relevant to the task at hand.
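For context, a second-order factorization machine scores a feature vector (this is the textbook FM formulation, not notation taken from this paper): given features x in R^n, the CVR estimate is

```latex
% Second-order factorization machine (FM) with a logistic link.
% w_0 and w_i are learned weights; v_i in R^k are latent factor vectors.
\hat{p}(\mathrm{conversion} \mid \mathrm{click}, x) =
  \sigma\Big( w_0 + \sum_{i=1}^{n} w_i x_i
  + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j \Big),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}},
```

with parameters fit by minimizing the logistic (cross-entropy) loss. The pairwise interaction term can be evaluated in O(kn) time via the standard reformulation, which is why FMs remain attractive when inference latency is key.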
Many advertisers only care about click-attributed conversions. A major
challenge in training models that predict conversions-given-clicks comes from
data sparsity: clicks are rare, and conversions attributed to clicks are even
rarer. However, mitigating sparsity by adding conversions that are not
click-attributed to the training set impairs model calibration. Since
calibration is critical to achieving advertiser goals, this approach is
infeasible.
In this work we apply the well-known idea of self-supervised pre-training,
using an auxiliary auto-encoder model trained on all conversion events, both
click-attributed and not, as a feature extractor to enrich the main CVR
prediction model. Since the main model does not train on non-click-attributed
conversions, calibration is not impaired. We adapt the basic
self-supervised pre-training idea to our online advertising setup by using a
loss function designed for tabular data, facilitating continual learning by
ensuring auto-encoder stability, and incorporating a neural network into a
large-scale real-time ad auction that ranks tens of thousands of ads, under
strict latency constraints, and without incurring a major engineering cost. We
show improvements both offline, during training, and in an online A/B test.
Following its success in A/B tests, our solution is now fully deployed to the
Yahoo native advertising system.
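Below is a minimal sketch of the two-stage setup described in the abstract, assuming a PyTorch-style implementation; all module names, dimensions, and the MLP head are hypothetical (the production model is an FM, and the paper's tabular-data reconstruction loss is replaced here by plain MSE):

```python
import torch
import torch.nn as nn

# Stage 1: self-supervised pre-training. The auto-encoder is trained on ALL
# conversion events, click-attributed or not; only its encoder is reused.
class ConversionAutoEncoder(nn.Module):
    def __init__(self, n_features: int, emb_dim: int = 16):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                     nn.Linear(64, emb_dim))
        self.decoder = nn.Sequential(nn.Linear(emb_dim, 64), nn.ReLU(),
                                     nn.Linear(64, n_features))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def pretrain(autoenc, all_conversions, steps=100, lr=1e-3):
    # Plain MSE stands in for the paper's loss designed for tabular data
    # (an assumption; the actual loss is not reproduced here).
    opt = torch.optim.Adam(autoenc.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = nn.functional.mse_loss(autoenc(all_conversions), all_conversions)
        loss.backward()
        opt.step()

# Stage 2: the main CVR model trains ONLY on click-attributed conversions,
# so calibration is preserved. The encoder is frozen: a stable feature
# extractor is what makes continual learning of the main model workable.
class CVRModel(nn.Module):
    def __init__(self, n_features: int, encoder: nn.Module, emb_dim: int = 16):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():
            p.requires_grad = False  # freeze pre-trained features
        self.head = nn.Sequential(nn.Linear(n_features + emb_dim, 32),
                                  nn.ReLU(), nn.Linear(32, 1))

    def forward(self, x):
        z = self.encoder(x)  # enrich raw features with the learned embedding
        return torch.sigmoid(self.head(torch.cat([x, z], dim=-1)))

# Toy usage with random tensors standing in for real conversion logs.
n = 40
autoenc = ConversionAutoEncoder(n)
pretrain(autoenc, torch.randn(1024, n))   # all conversion events
model = CVRModel(n, autoenc.encoder)
p = model(torch.randn(8, n))              # CVR estimates in (0, 1)
```

Freezing the encoder is the key design choice in this sketch: the main model can be retrained or updated continually against a fixed feature space, without the auxiliary data ever influencing its calibration.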
Related papers
- CALICO: Confident Active Learning with Integrated Calibration [11.978551396144532]
We propose an AL framework that self-calibrates the confidence used for sample selection during the training process.
We show improved classification performance compared to a softmax-based classifier with fewer labeled samples.
arXiv Detail & Related papers (2024-07-02T15:05:19Z)
- QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models -- Extended Version [34.280197473547226]
Machine learning models can be deployed on edge devices with limited storage and computational capabilities.
We propose QCore to enable continual calibration on the edge.
arXiv Detail & Related papers (2024-04-22T08:57:46Z)
- Unbiased Filtering Of Accidental Clicks in Verizon Media Native Advertising [1.6717433307723157]
We focus on the challenge of predicting click-through rates (CTR) when we are aware that some of the clicks have short dwell-time.
An accidental click implies little affinity between the user and the ad, so predicting that similar users will click on the ad is inaccurate.
We present a new approach where the positive weight of the accidental clicks is distributed among all of the negative events (skips), based on their likelihood of causing accidental clicks.
arXiv Detail & Related papers (2023-12-08T12:54:30Z)
- Unsupervised Domain Adaptation for Self-Driving from Past Traversal Features [69.47588461101925]
We propose a method to adapt 3D object detectors to new driving environments.
Our approach enhances LiDAR-based detection models using spatial quantized historical features.
Experiments on real-world datasets demonstrate significant improvements.
arXiv Detail & Related papers (2023-09-21T15:00:31Z)
- LegoNet: A Fast and Exact Unlearning Architecture [59.49058450583149]
Machine unlearning aims to erase the impact of specific training samples from a trained model upon deletion requests.
We present a novel network, namely LegoNet, which adopts the framework of "fixed encoder + multiple adapters".
We show that LegoNet accomplishes fast and exact unlearning while maintaining acceptable performance, outperforming unlearning baselines overall.
arXiv Detail & Related papers (2022-10-28T09:53:05Z)
- Canary in a Coalmine: Better Membership Inference with Ensembled Adversarial Queries [53.222218035435006]
We use adversarial tools to optimize for queries that are discriminative and diverse.
Our improvements achieve significantly more accurate membership inference than existing methods.
arXiv Detail & Related papers (2022-10-19T17:46:50Z)
- On the Factory Floor: ML Engineering for Industrial-Scale Ads Recommendation Models [9.102290972714652]
For industrial-scale advertising systems, prediction of ad click-through rate (CTR) is a central problem.
We present a case study of practical techniques deployed in Google's search ads CTR model.
arXiv Detail & Related papers (2022-09-12T15:15:23Z)
- Augmented Bilinear Network for Incremental Multi-Stock Time-Series Classification [83.23129279407271]
We propose a method to efficiently retain the knowledge available in a neural network pre-trained on a set of securities.
In our method, the prior knowledge encoded in a pre-trained neural network is maintained by keeping existing connections fixed.
This knowledge is adjusted for the new securities by a set of augmented connections, which are optimized using the new data.
arXiv Detail & Related papers (2022-07-23T18:54:10Z)
- Adversarial Unlearning: Reducing Confidence Along Adversarial Directions [88.46039795134993]
We propose a complementary regularization strategy that reduces confidence on self-generated examples.
The method, which we call RCAD, aims to reduce confidence on out-of-distribution examples lying along directions adversarially chosen to increase training loss.
Despite its simplicity, we find on many classification benchmarks that RCAD can be added to existing techniques to increase test accuracy by 1-3% in absolute value.
arXiv Detail & Related papers (2022-06-03T02:26:24Z)
- Churn prediction in online gambling [4.523089386111081]
This work contributes to the domain by formalizing the problem of churn prediction in the context of online gambling.
We propose an algorithmic answer to this problem based on a recurrent neural network.
This algorithm is tested with online gambling data that have the form of time series.
arXiv Detail & Related papers (2022-01-07T14:20:25Z)
- Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free [115.81899803240758]
Adversarial training and its many variants substantially improve deep network robustness, yet at the cost of compromising standard accuracy.
This paper asks how to quickly calibrate a trained model in-situ, to examine the achievable trade-offs between its standard and robust accuracies.
Our proposed framework, Once-for-all Adversarial Training (OAT), is built on an innovative model-conditional training framework.
arXiv Detail & Related papers (2020-10-22T16:06:34Z)