LT4REC:A Lottery Ticket Hypothesis Based Multi-task Practice for Video
Recommendation System
- URL: http://arxiv.org/abs/2008.09872v2
- Date: Thu, 14 Oct 2021 16:24:40 GMT
- Title: LT4REC:A Lottery Ticket Hypothesis Based Multi-task Practice for Video
Recommendation System
- Authors: Xuanji Xiao, Huabin Chen, Yuzhen Liu, Xing Yao, Pei Liu, Chaosheng
Fan, Nian Ji, Xirong Jiang
- Abstract summary: Click-through rate prediction (CTR) and post-click conversion rate prediction (CVR) play key roles across all industrial ranking systems.
In this paper, we model CVR in a brand-new method by adopting the lottery-ticket-hypothesis-based sparse sharing multi-task learning.
Experiments on the dataset gathered from traffic logs of Tencent video's recommendation system demonstrate that sparse sharing in the CVR model significantly outperforms competitive methods.
- Score: 2.7174057828883504
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Click-through rate prediction (CTR) and post-click conversion rate prediction
(CVR) play key roles across all industrial ranking systems, such as
recommendation systems, online advertising, and search engines. In contrast to
the extensive research on CTR, there is much less research on CVR estimation,
whose main challenge is extreme data sparsity: CVR has one or two orders of
magnitude fewer samples than CTR. A common remedy is multi-task learning
that draws on the abundant CTR samples,
but the typical hard-sharing method cannot solve this problem effectively,
because it is difficult to determine which network components can be
shared and which are in conflict, i.e., manually designed neuron sharing
is highly inaccurate. In this paper, we model CVR with a
new method that adopts lottery-ticket-hypothesis-based sparse sharing
multi-task learning, which can automatically and flexibly learn which neuron
weights should be shared without manual design. Experiments on the dataset
gathered from traffic logs of Tencent video's recommendation system demonstrate
that sparse sharing in the CVR model significantly outperforms competitive
methods. Because sparse sharing yields sparse weights, it can also
significantly reduce computational complexity and memory usage, which are very
important in industrial recommendation systems.
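The sparse-sharing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each task owns a binary mask over one shared weight matrix, that masks come from one-shot magnitude pruning, and that a small random perturbation stands in for single-task training. All names (`magnitude_mask`, `masked_sgd_step`, `ctr_mask`, `cvr_mask`) and sizes are hypothetical.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Boolean mask keeping the largest-magnitude (1 - sparsity) fraction."""
    k = int(round(weights.size * (1.0 - sparsity)))  # number of weights kept
    if k <= 0:
        return np.zeros(weights.shape, dtype=bool)
    threshold = np.sort(np.abs(weights).ravel())[-k]
    return np.abs(weights) >= threshold

rng = np.random.default_rng(0)
shared = rng.normal(size=(8, 4))  # one shared weight matrix (assumed size)

# Assumption: perturbed copies stand in for weights after training each task alone.
ctr_trained = shared + 0.1 * rng.normal(size=shared.shape)
cvr_trained = shared + 0.1 * rng.normal(size=shared.shape)

# Each task gets its own subnetwork ("ticket") of the shared weights.
ctr_mask = magnitude_mask(ctr_trained, sparsity=0.5)
cvr_mask = magnitude_mask(cvr_trained, sparsity=0.5)

def forward(x, mask):
    """Linear layer that only uses the weights selected by the task mask."""
    return x @ (shared * mask)

def masked_sgd_step(weights, mask, x, y, lr=0.1):
    """One SGD step on squared error (averaged over samples) for one task.

    The gradient is multiplied by the mask, so weights outside the task's
    subnetwork are left untouched.
    """
    pred = x @ (weights * mask)
    grad = x.T @ (pred - y) / len(x)
    return weights - lr * grad * mask

x = rng.normal(size=(5, 8))      # toy batch of 5 samples
y_cvr = rng.normal(size=(5, 4))  # toy CVR targets
updated = masked_sgd_step(shared, cvr_mask, x, y_cvr)

# Weights in both masks are shared by the two tasks; the rest are task-specific.
overlap = ctr_mask & cvr_mask
print("kept per task:", int(ctr_mask.sum()), int(cvr_mask.sum()))
print("shared by both tasks:", int(overlap.sum()))
```

In this sketch the overlap between the two masks is not designed by hand; it falls out of the pruning step, which is the property the abstract contrasts with hard sharing. The zeros in each mask are also where the claimed compute and memory savings would come from.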
Related papers
- Regularized Contrastive Partial Multi-view Outlier Detection [76.77036536484114]
We propose a novel method named Regularized Contrastive Partial Multi-view Outlier Detection (RCPMOD).
In this framework, we utilize contrastive learning to learn view-consistent information and distinguish outliers by the degree of consistency.
Experimental results on four benchmark datasets demonstrate that our proposed approach could outperform state-of-the-art competitors.
arXiv Detail & Related papers (2024-08-02T14:34:27Z)
- Investigating the Robustness of Counterfactual Learning to Rank Models: A Reproducibility Study [61.64685376882383]
Counterfactual learning to rank (CLTR) has attracted extensive attention in the IR community for its ability to leverage massive logged user interaction data to train ranking models.
This paper investigates the robustness of existing CLTR models in complex and diverse situations.
We find that the DLA models and IPS-DCM show better robustness under various simulation settings than IPS-PBM and PRS with offline propensity estimation.
arXiv Detail & Related papers (2024-04-04T10:54:38Z)
- RAT: Retrieval-Augmented Transformer for Click-Through Rate Prediction [68.34355552090103]
This paper develops a Retrieval-Augmented Transformer (RAT), aiming to acquire fine-grained feature interactions within and across samples.
We then build Transformer layers with cascaded attention to capture both intra- and cross-sample feature interactions.
Experiments on real-world datasets substantiate the effectiveness of RAT and suggest its advantage in long-tail scenarios.
arXiv Detail & Related papers (2024-04-02T19:14:23Z)
- MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction [39.48740397029264]
We propose a Model-agnostic pretraining (MAP) framework that applies feature corruption and recovery on multi-field categorical data.
We derive two practical algorithms: masked feature prediction (MFP) and replaced feature detection (RFD).
arXiv Detail & Related papers (2023-08-03T12:55:55Z)
- Contrastive Learning for Conversion Rate Prediction [6.607531486024888]
We propose the Contrastive Learning for CVR prediction (CL4CVR) framework.
It associates the supervised CVR prediction task with a contrastive learning task, which can learn better data representations.
Experimental results on two real-world conversion datasets demonstrate the superior performance of CL4CVR.
arXiv Detail & Related papers (2023-07-12T07:42:52Z)
- Contrastive Multi-view Framework for Customer Lifetime Value Prediction [48.24479287526052]
Many existing LTV prediction methods directly train a single-view LTV predictor on consumption samples.
We propose a contrastive multi-view framework for LTV prediction, which is a plug-and-play solution compatible with various backbone models.
We conduct extensive experiments on a real-world game LTV prediction dataset and the results validate the effectiveness of our method.
arXiv Detail & Related papers (2023-06-26T03:23:53Z)
- Unifying Synergies between Self-supervised Learning and Dynamic Computation [53.66628188936682]
We present a novel perspective on the interplay between SSL and DC paradigms.
We show that it is feasible to simultaneously learn a dense and gated sub-network from scratch in an SSL setting.
The co-evolution of the dense and gated encoders during pre-training offers a good accuracy-efficiency trade-off.
arXiv Detail & Related papers (2023-01-22T17:12:58Z)
- Entire Space Counterfactual Learning: Tuning, Analytical Properties and Industrial Applications [5.9460659646670875]
Post-click conversion rate (CVR) estimation has long been plagued by sample selection bias and data sparsity issues.
This paper proposes a principled method named entire space counterfactual multi-task model (ESCM$^2$), which employs a counterfactual risk minimizer to handle both IEB and PIP issues at once.
arXiv Detail & Related papers (2022-10-20T06:19:50Z)
- Continual Learning for CTR Prediction: A Hybrid Approach [37.668467137218286]
We propose COLF, a hybrid COntinual Learning Framework for CTR prediction.
COLF has a memory-based modular architecture that is designed to adapt, learn and give predictions continuously.
Empirical evaluations on click logs collected from a major shopping app in China demonstrate our method's superiority over existing methods.
arXiv Detail & Related papers (2022-01-18T11:30:57Z)
- An Analysis Of Entire Space Multi-Task Models For Post-Click Conversion Prediction [3.2979460528864926]
We consider approximating the probability of post-click conversion events (installs) for mobile app advertising on a large-scale advertising platform.
We show that several different approaches result in similar levels of positive transfer from the data-abundant CTR task to the CVR task.
Our findings add to the growing body of evidence suggesting that standard multi-task learning is a sensible approach to modelling related events in real-world large-scale applications.
arXiv Detail & Related papers (2021-08-18T13:39:50Z)
- A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification [78.04177357888284]
We propose a novel neural model compression strategy combining data augmentation, knowledge transfer, pruning, and quantization for device-robust acoustic scene classification (ASC).
We report an efficient joint framework for low-complexity multi-device ASC, called Acoustic Lottery.
arXiv Detail & Related papers (2021-07-03T16:25:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.