Always Strengthen Your Strengths: A Drift-Aware Incremental Learning
Framework for CTR Prediction
- URL: http://arxiv.org/abs/2304.09062v1
- Date: Mon, 17 Apr 2023 05:45:18 GMT
- Title: Always Strengthen Your Strengths: A Drift-Aware Incremental Learning
Framework for CTR Prediction
- Authors: Congcong Liu, Fei Teng, Xiwei Zhao, Zhangang Lin, Jinghe Hu, Jingping
Shao
- Abstract summary: Click-through rate (CTR) prediction is of great importance in recommendation systems and online advertising platforms.
Streaming data has the characteristic that the underlying distribution drifts over time and may recur.
We design a novel drift-aware incremental learning framework based on ensemble learning to address catastrophic forgetting in CTR prediction.
- Score: 4.909628097144909
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Click-through rate (CTR) prediction is of great importance in recommendation
systems and online advertising platforms. When served in industrial scenarios,
the user-generated data observed by the CTR model typically arrives as a
stream. Streaming data has the characteristic that the underlying distribution
drifts over time and may recur. This can lead to catastrophic forgetting if the
model simply adapts to the new data distribution all the time. It is also
inefficient to relearn a distribution that has already occurred. Due to memory
constraints and diversity of data distributions in large-scale industrial
applications, conventional strategies for mitigating catastrophic forgetting,
such as replay, parameter isolation, and knowledge distillation, are difficult
to deploy. In this work, we design a novel drift-aware incremental learning
framework based on ensemble learning to address catastrophic forgetting in CTR
prediction. With explicit error-based drift detection on streaming data, the
framework further strengthens well-adapted ensembles and freezes ensembles that
do not match the input distribution, thereby avoiding catastrophic
interference. Evaluations on both offline experiments and an online A/B test
show that our method outperforms all considered baselines.
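Below is a minimal sketch of the drift-aware ensemble idea described in the abstract, using scikit-learn's SGDClassifier as a stand-in for the CTR experts; the class name, the error-based drift test, and the freezing rule are illustrative assumptions, not the paper's exact method.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import log_loss


class DriftAwareEnsemble:
    """Pool of incrementally trained experts: on each chunk, detect drift from
    the best expert's error, keep updating ("strengthen") the expert that
    matches the current distribution, and leave the others frozen."""

    def __init__(self, n_experts=3, drift_threshold=0.05, seed=0):
        rng = np.random.RandomState(seed)
        self.experts = [
            SGDClassifier(loss="log_loss", random_state=int(rng.randint(10**6)))
            for _ in range(n_experts)
        ]
        self.fitted = [False] * n_experts
        self.baseline_error = [None] * n_experts  # last chunk error per expert
        self.drift_threshold = drift_threshold

    def _chunk_errors(self, X, y):
        errs = []
        for ok, expert in zip(self.fitted, self.experts):
            if not ok:
                errs.append(np.inf)
            else:
                p = expert.predict_proba(X)[:, 1]
                errs.append(log_loss(y, p, labels=[0, 1]))
        return np.asarray(errs)

    def partial_fit(self, X, y):
        errs = self._chunk_errors(X, y)
        best = int(np.argmin(errs))
        # Error-based drift check: if even the best expert degraded noticeably,
        # treat the chunk as a new distribution and recruit an unused expert.
        if np.isfinite(errs[best]) and self.baseline_error[best] is not None:
            if errs[best] > self.baseline_error[best] + self.drift_threshold:
                unused = [i for i, ok in enumerate(self.fitted) if not ok]
                if unused:
                    best = unused[0]
        # Strengthen only the selected expert; all other experts stay frozen.
        self.experts[best].partial_fit(X, y, classes=np.array([0, 1]))
        self.fitted[best] = True
        self.baseline_error[best] = self._chunk_errors(X, y)[best]

    def predict_proba(self, X):
        active = [e for e, ok in zip(self.experts, self.fitted) if ok]
        return np.mean([e.predict_proba(X)[:, 1] for e in active], axis=0)
```

Feeding the stream chunk by chunk through partial_fit keeps only the best-matching expert learning while the others stay frozen, mirroring the "strengthen your strengths" behavior the abstract describes.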
Related papers
- Towards Continually Learning Application Performance Models [1.2278517240988065]
Machine learning-based performance models are increasingly being used to inform critical job scheduling and application optimization decisions.
Traditionally, these models assume that data distribution does not change as more samples are collected over time.
We develop continually learning performance models that account for the distribution drift, alleviate catastrophic forgetting, and improve generalizability.
arXiv Detail & Related papers (2023-10-25T20:48:46Z)
- Consistent Diffusion Models: Mitigating Sampling Drift by Learning to be Consistent [97.64313409741614]
We propose to enforce a consistency property, which states that predictions of the model on its own generated data are consistent across time.
We show that our novel training objective yields state-of-the-art results for conditional and unconditional generation on CIFAR-10, and improvements over baselines on AFHQ and FFHQ.
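A rough sketch of such a consistency regularizer is given below; the denoiser interface, the DDIM-style reverse step, and the stop-gradient choice are assumptions for illustration, not the paper's exact objective.

```python
import torch


def consistency_loss(denoiser, x_t, t, t_prev, alphas):
    # Toy regularizer: the x0 predicted directly from x_t should agree with the
    # x0 predicted after one model-generated (DDIM-style) reverse step to t_prev.
    x0_from_t = denoiser(x_t, t)                 # model's x0 prediction at step t
    a_t, a_prev = alphas[t], alphas[t_prev]      # cumulative alpha schedule
    eps = (x_t - a_t.sqrt() * x0_from_t) / (1 - a_t).sqrt()
    x_prev = a_prev.sqrt() * x0_from_t + (1 - a_prev).sqrt() * eps
    x0_from_prev = denoiser(x_prev, t_prev)      # prediction on own generated data
    return ((x0_from_prev - x0_from_t.detach()) ** 2).mean()
```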
arXiv Detail & Related papers (2023-02-17T18:45:04Z)
- On-Device Model Fine-Tuning with Label Correction in Recommender Systems [43.41875046295657]
This work focuses on the fundamental click-through rate (CTR) prediction task in recommender systems.
We propose a novel label correction method, which only requires each user to correct the labels of their local samples ahead of on-device fine-tuning.
arXiv Detail & Related papers (2022-10-21T14:40:18Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with distributionally robust optimization (DRO) using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
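A schematic PyTorch-style sketch of DRO with a parametric adversary is shown below; the adversary interface, the softmax reweighting, and the KL penalty are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def parametric_dro_losses(model, adversary, x, y, kl_coef=1.0):
    """Minibatch DRO with a parametric adversary that outputs per-example
    log likelihood ratios used to reweight the training loss."""
    per_example = F.cross_entropy(model(x), y, reduction="none")
    log_ratio = adversary(x).squeeze(-1)              # unnormalized log-weights
    w = F.softmax(log_ratio, dim=0) * y.shape[0]      # weights with mean 1
    model_loss = (w.detach() * per_example).mean()    # model sees fixed weights
    kl = (w * torch.log(w + 1e-8)).mean()             # keeps weights near uniform
    adv_loss = -(w * per_example.detach()).mean() + kl_coef * kl
    return model_loss, adv_loss
```

In training one would alternate optimizer steps: model_loss updates the model's parameters and adv_loss updates the adversary's.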
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Concept Drift Adaptation for CTR Prediction in Online Advertising Systems [6.900209851954917]
Click-through rate (CTR) prediction is a crucial task in web search, recommender systems, and online advertisement displaying.
In this paper, we propose adaptive mixture of experts (AdaMoE) to alleviate the concept drift problem by adaptive filtering in the data stream of CTR prediction.
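As a hedged illustration of an adaptive mixture over experts (a generic stand-in for AdaMoE's gating; the exponential-filter update below is an assumption, not the paper's rule):

```python
import numpy as np


class AdaptiveMixture:
    """Mixture over pre-built experts whose weights follow an exponential
    filter of each expert's recent log loss on the stream."""

    def __init__(self, experts, decay=0.9, temperature=0.1):
        self.experts = experts            # objects exposing predict_proba(X)
        self.ema_loss = np.zeros(len(experts))
        self.decay, self.temperature = decay, temperature

    def weights(self):
        z = np.exp(-self.ema_loss / self.temperature)
        return z / z.sum()

    def predict_proba(self, X):
        probs = np.stack([e.predict_proba(X)[:, 1] for e in self.experts])
        return self.weights() @ probs     # weighted average of expert CTRs

    def update(self, X, y):
        eps = 1e-7
        for i, e in enumerate(self.experts):
            p = np.clip(e.predict_proba(X)[:, 1], eps, 1 - eps)
            ll = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
            self.ema_loss[i] = self.decay * self.ema_loss[i] + (1 - self.decay) * ll
```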
arXiv Detail & Related papers (2022-04-01T07:43:43Z)
- Continual Learning for CTR Prediction: A Hybrid Approach [37.668467137218286]
We propose COLF, a hybrid COntinual Learning Framework for CTR prediction.
COLF has a memory-based modular architecture that is designed to adapt, learn, and make predictions continuously.
Empirical evaluations on click logs collected from a major shopping app in China demonstrate our method's superiority over existing methods.
arXiv Detail & Related papers (2022-01-18T11:30:57Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled target examples whose confidence exceeds that threshold.
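A minimal sketch of the ATC recipe as summarized above (the choice of confidence score, e.g. max softmax probability, is left to the caller):

```python
import numpy as np


def atc_fit_threshold(source_scores, source_correct):
    """Pick the threshold t so that the fraction of labeled source examples
    with score > t matches the source accuracy."""
    acc = np.mean(source_correct)
    return np.quantile(source_scores, 1.0 - acc)


def atc_predict_accuracy(target_scores, threshold):
    """Predicted target accuracy: fraction of unlabeled target examples whose
    confidence score exceeds the learned threshold."""
    return float(np.mean(target_scores > threshold))
```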
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Employing chunk size adaptation to overcome concept drift [2.277447144331876]
We propose a new Chunk Adaptive Restoration framework that can be adapted to any block-based data stream classification algorithm.
The proposed algorithm adjusts the data chunk size in the case of concept drift detection to minimize the impact of the change on the predictive performance of the used model.
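A toy chunk-size controller in the spirit of that summary is sketched below; the shrink-and-restore schedule and its parameters are assumptions, not the paper's exact algorithm.

```python
class ChunkSizeController:
    """Shrink the chunk size when drift is detected so the model adapts
    faster, then gradually restore it once performance stabilizes."""

    def __init__(self, base_size=1000, min_size=100, growth=2):
        self.base_size, self.min_size, self.growth = base_size, min_size, growth
        self.size = base_size

    def next_chunk_size(self, drift_detected: bool) -> int:
        if drift_detected:
            self.size = self.min_size                                  # react quickly
        else:
            self.size = min(self.base_size, self.size * self.growth)  # restore
        return self.size
```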
arXiv Detail & Related papers (2021-10-25T12:36:22Z)
- Self-Damaging Contrastive Learning [92.34124578823977]
Unlabeled data in the real world is commonly imbalanced and follows a long-tail distribution.
This paper proposes a principled framework called Self-Damaging Contrastive Learning (SDCLR) to automatically balance representation learning without knowing the classes.
Our experiments show that SDCLR significantly improves not only overall accuracies but also balancedness.
arXiv Detail & Related papers (2021-06-06T00:04:49Z)
- Churn Reduction via Distillation [54.5952282395487]
We show an equivalence between training with distillation using the base model as the teacher and training with an explicit constraint on the predictive churn.
We then show that distillation performs strongly for low churn training against a number of recent baselines.
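A schematic distillation objective for churn reduction, with the previously deployed base model as the teacher; the mixing weight and temperature are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def churn_distillation_loss(student_logits, teacher_logits, labels, lam=0.5, temp=2.0):
    """Mix the usual cross-entropy with a KL term that pulls the new (student)
    model toward the previously deployed base (teacher) model."""
    ce = F.cross_entropy(student_logits, labels)
    kl = F.kl_div(
        F.log_softmax(student_logits / temp, dim=-1),
        F.softmax(teacher_logits / temp, dim=-1),
        reduction="batchmean",
    ) * temp ** 2
    return (1 - lam) * ce + lam * kl
```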
arXiv Detail & Related papers (2021-06-04T18:03:31Z)
- Over-the-Air Federated Learning from Heterogeneous Data [107.05618009955094]
Federated learning (FL) is a framework for distributed learning of centralized models.
We develop a Convergent OTA FL (COTAF) algorithm which enhances the common local stochastic gradient descent (SGD) FL algorithm.
We numerically show that the precoding induced by COTAF notably improves the convergence rate and the accuracy of models trained via OTA FL.
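A toy over-the-air aggregation round with precoding, in the spirit of COTAF; the scaling rule below is a simplification for illustration, not the paper's exact precoder.

```python
import numpy as np


def ota_round(client_updates, power=1.0, noise_std=0.1, rng=None):
    """Clients scale ("precode") their model updates to satisfy a transmit
    power budget, the updates add up over the noisy channel, and the server
    undoes the scaling after aggregation."""
    rng = np.random.default_rng() if rng is None else rng
    updates = np.stack(client_updates)                    # (n_clients, dim)
    max_energy = max(np.sum(u ** 2) for u in updates) + 1e-12
    alpha = np.sqrt(power / max_energy)                   # precoding factor
    tx = alpha * updates                                  # precoded signals
    received = tx.sum(axis=0) + rng.normal(0.0, noise_std, size=updates.shape[1])
    return received / (alpha * len(client_updates))       # server rescales
```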
arXiv Detail & Related papers (2020-09-27T08:28:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.