A Practical Incremental Method to Train Deep CTR Models
- URL: http://arxiv.org/abs/2009.02147v1
- Date: Fri, 4 Sep 2020 12:35:42 GMT
- Title: A Practical Incremental Method to Train Deep CTR Models
- Authors: Yichao Wang, Huifeng Guo, Ruiming Tang, Zhirong Liu, Xiuqiang He
- Abstract summary: We introduce a practical incremental method to train deep CTR models, which consists of three decoupled modules.
Our method can achieve comparable performance to the conventional batch mode training with much better training efficiency.
- Score: 37.54660958085938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models in recommender systems are usually trained in the batch
mode, namely iteratively trained on a fixed-size window of training data. Such
batch mode training of deep learning models suffers from low training
efficiency, which may lead to performance degradation when the model is not
produced on time. To tackle this issue, incremental learning is proposed and
has received much attention recently. Incremental learning has great potential
in recommender systems, as two consecutive windows of training data overlap in
most of their volume. It aims to update the model incrementally, using only the
samples that have arrived since the model was last updated, which is much more
efficient than batch mode training. However, most of the
incremental learning methods focus on the research area of image recognition
where new tasks or classes are learned over time. In this work, we introduce a
practical incremental method to train deep CTR models, which consists of three
decoupled modules (namely, data, feature and model module). Our method can
achieve comparable performance to the conventional batch mode training with
much better training efficiency. We conduct extensive experiments on a public
benchmark and a private dataset to demonstrate the effectiveness of our
proposed method.
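Illustrative sketch (not from the paper): the abstract describes an incremental scheme built from three decoupled modules, namely a data module that feeds only the samples arriving since the last update, a feature module that prepares features for those samples, and a model module that warm-starts from the previous model instead of retraining on the whole window. The toy Python code below sketches that batch-vs-incremental contrast under stated assumptions; the hashed features, the logistic-regression stand-in for a deep CTR model, and all class and function names are illustrative, not the authors' implementation.
```python
# Minimal sketch (NOT the authors' code) of batch-mode vs. incremental training
# for a toy CTR model. The three "modules" below only loosely mirror the
# data/feature/model decoupling named in the abstract.
import numpy as np

rng = np.random.default_rng(0)
DIM = 2 ** 10  # hashed feature space (illustrative assumption)

def featurize(raw_rows):
    """Feature module (sketch): hash categorical fields into a fixed sparse space."""
    X = np.zeros((len(raw_rows), DIM))
    for i, row in enumerate(raw_rows):
        for field, value in row.items():
            X[i, hash((field, value)) % DIM] = 1.0
    return X

class CTRModel:
    """Model module (sketch): logistic regression standing in for a deep CTR model."""
    def __init__(self, dim=DIM):
        self.w = np.zeros(dim)

    def fit(self, X, y, epochs=5, lr=0.1):
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-X @ self.w))
            self.w -= lr * X.T @ (p - y) / len(y)
        return self

    def warm_start_from(self, other):
        # Incremental mode: reuse previously learned parameters instead of
        # training from scratch on the full window.
        self.w = other.w.copy()
        return self

def make_logs(n):
    """Data module (sketch): fake click logs arriving over time."""
    rows = [{"user": f"u{rng.integers(50)}", "ad": f"a{rng.integers(20)}"} for _ in range(n)]
    y = rng.integers(0, 2, size=n).astype(float)
    return rows, y

# Batch mode: retrain from scratch on a fixed-size window every period.
window_rows, window_y = make_logs(2000)
batch_model = CTRModel().fit(featurize(window_rows), window_y)

# Incremental mode: update the previous model with only the new samples.
new_rows, new_y = make_logs(200)           # samples since the last update
inc_model = CTRModel().warm_start_from(batch_model)
inc_model.fit(featurize(new_rows), new_y)  # far fewer samples per update
print("updated on", len(new_rows), "new samples instead of", len(window_rows) + len(new_rows))
```
The point of the sketch is the training-efficiency argument from the abstract: each incremental update touches only the newly arrived samples, while batch mode repeatedly retrains on the full fixed-size window even though consecutive windows overlap heavily.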
Related papers
- Accelerating Deep Learning with Fixed Time Budget [2.190627491782159]
This paper proposes an effective technique for training arbitrary deep learning models within fixed time constraints.
The proposed method is extensively evaluated in both classification and regression tasks in computer vision.
arXiv Detail & Related papers (2024-10-03T21:18:04Z)
- EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training [79.96741042766524]
We reformulate the training curriculum as a soft-selection function.
We show that exposing the contents of natural images can be readily achieved by adjusting the intensity of data augmentation.
The resulting method, EfficientTrain++, is simple, general, yet surprisingly effective.
arXiv Detail & Related papers (2024-05-14T17:00:43Z)
- Always-Sparse Training by Growing Connections with Guided Stochastic Exploration [46.4179239171213]
We propose an efficient always-sparse training algorithm with excellent scaling to larger and sparser models.
We evaluate our method on CIFAR-10/100 and ImageNet using VGG and ViT models, and compare it against a range of sparsification methods.
arXiv Detail & Related papers (2024-01-12T21:32:04Z)
- PILOT: A Pre-Trained Model-Based Continual Learning Toolbox [71.63186089279218]
This paper introduces a pre-trained model-based continual learning toolbox known as PILOT.
On the one hand, PILOT implements some state-of-the-art class-incremental learning algorithms based on pre-trained models, such as L2P, DualPrompt, and CODA-Prompt.
On the other hand, PILOT fits typical class-incremental learning algorithms within the context of pre-trained models to evaluate their effectiveness.
arXiv Detail & Related papers (2023-09-13T17:55:11Z)
- BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping [64.54271680071373]
Diffusion models have demonstrated excellent potential for generating diverse images.
Knowledge distillation has been recently proposed as a remedy that can reduce the number of inference steps to one or a few.
We present a novel technique called BOOT that overcomes the limitations of existing distillation approaches with an efficient data-free distillation algorithm.
arXiv Detail & Related papers (2023-06-08T20:30:55Z)
- EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers).
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z)
- Effective and Efficient Training for Sequential Recommendation using Recency Sampling [91.02268704681124]
We propose a novel Recency-based Sampling of Sequences training objective.
We show that models enhanced with our method can achieve performance exceeding or very close to the state-of-the-art BERT4Rec.
arXiv Detail & Related papers (2022-07-06T13:06:31Z)
- Effective training-time stacking for ensembling of deep neural networks [1.2667973028134798]
A snapshot ensembling collects models in the ensemble along a single training path.
Our method improves snapshot ensembling by selecting and weighting ensemble members along the training path.
It relies on training-time likelihoods and, unlike standard stacking methods, does not look at validation sample errors.
arXiv Detail & Related papers (2022-06-27T17:52:53Z)
- MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation [68.30497162547768]
We propose MoEBERT, which uses a Mixture-of-Experts structure to increase model capacity and inference speed.
We validate the efficiency and effectiveness of MoEBERT on natural language understanding and question answering tasks.
arXiv Detail & Related papers (2022-04-15T23:19:37Z)