Taming the One-Epoch Phenomenon in Online Recommendation System by Two-stage Contrastive ID Pre-training
- URL: http://arxiv.org/abs/2508.18700v1
- Date: Tue, 26 Aug 2025 06:06:21 GMT
- Title: Taming the One-Epoch Phenomenon in Online Recommendation System by Two-stage Contrastive ID Pre-training
- Authors: Yi-Ping Hsu, Po-Wei Wang, Chantat Eksombatchai, Jiajing Xu
- Abstract summary: We introduce a novel two-stage training strategy that incorporates a pre-training phase using a minimal model with contrastive loss. Our offline experiments demonstrate that multi-epoch training during the pre-training phase does not lead to overfitting.
- Score: 1.1549572298362782
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: ID-based embeddings are widely used in web-scale online recommendation systems. However, their susceptibility to overfitting, particularly due to the long-tail nature of data distributions, often limits training to a single epoch, a phenomenon known as the "one-epoch problem." This challenge has driven research efforts to optimize performance within the first epoch by enhancing convergence speed or feature sparsity. In this study, we introduce a novel two-stage training strategy that incorporates a pre-training phase using a minimal model with contrastive loss, enabling broader data coverage for the embedding system. Our offline experiments demonstrate that multi-epoch training during the pre-training phase does not lead to overfitting, and the resulting embeddings improve online generalization when fine-tuned for more complex downstream recommendation tasks. We deployed the proposed system in live traffic at Pinterest, achieving significant site-wide engagement gains.
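The abstract does not spell out the form of the contrastive loss or of the "minimal model" used in the pre-training stage. As a rough illustration only, the sketch below assumes an InfoNCE-style loss with in-batch negatives over paired ID embeddings. The function name `info_nce_loss`, the `temperature` parameter, and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v] if norm > 0 else list(v)


def info_nce_loss(user_vecs, item_vecs, temperature=0.1):
    """Mean InfoNCE loss over a batch (assumed loss form, not the paper's).

    Row i of user_vecs is the positive pair of row i of item_vecs;
    every other item in the batch serves as a negative.
    """
    users = [l2_normalize(u) for u in user_vecs]
    items = [l2_normalize(v) for v in item_vecs]
    total = 0.0
    for i, u in enumerate(users):
        logits = [dot(u, v) / temperature for v in items]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true positive.
        total += log_denom - logits[i]
    return total / len(users)


# Toy batch: embeddings that already agree with their positives score a
# lower loss than the same embeddings paired with the wrong items.
toy_users = [[1.0, 0.0], [0.0, 1.0]]
matching_items = [[1.0, 0.0], [0.0, 1.0]]
mismatched_items = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = info_nce_loss(toy_users, matching_items)
loss_shuffled = info_nce_loss(toy_users, mismatched_items)
print(loss_aligned < loss_shuffled)
```

In a two-stage setup like the one the abstract describes, embeddings trained against such an objective would then be carried into the larger downstream recommendation model and fine-tuned there. How that hand-off is done is not stated in the abstract.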
Related papers
- Towards High Data Efficiency in Reinforcement Learning with Verifiable Reward [54.708851958671794]
We propose a Data-Efficient Policy Optimization pipeline that combines optimized strategies for both offline and online data selection. In the offline phase, we curate a high-quality subset of training samples based on diversity, influence, and appropriate difficulty. During online RLVR training, we introduce a sample-level explorability metric to dynamically filter samples with low exploration potential.
arXiv Detail & Related papers (2025-09-01T10:04:20Z)
- A Scalable Pretraining Framework for Link Prediction with Efficient Adaptation [16.82426251068573]
Link Prediction (LP) is a critical task in graph machine learning. Existing methods face key challenges including limited supervision from sparse connectivity. We explore pretraining as a solution to address these challenges.
arXiv Detail & Related papers (2025-08-06T17:10:31Z)
- CUPID: A Real-Time Session-Based Reciprocal Recommendation System for a One-on-One Social Discovery Platform [12.2116664055055]
CUPID is a novel approach to session-based reciprocal recommendation systems designed for a real-time one-on-one social discovery platform.
CUPID decouples the time-intensive user session modeling from the real-time user matching process to reduce inference time.
CUPID reduces response latency by more than 76% compared to non-asynchronous systems.
arXiv Detail & Related papers (2024-10-08T05:44:14Z)
- Long-Sequence Recommendation Models Need Decoupled Embeddings [49.410906935283585]
We identify and characterize a neglected deficiency in existing long-sequence recommendation models. A single set of embeddings struggles with learning both attention and representation, leading to interference between these two processes. We propose the Decoupled Attention and Representation Embeddings (DARE) model, where two distinct embedding tables are learned separately to fully decouple attention and representation.
arXiv Detail & Related papers (2024-10-03T15:45:15Z)
- Multi-Epoch learning with Data Augmentation for Deep Click-Through Rate Prediction [53.88231294380083]
We introduce a novel Multi-Epoch learning with Data Augmentation (MEDA) framework, suitable for both non-continual and continual learning scenarios.
MEDA minimizes overfitting by reducing the dependency of the embedding layer on subsequent training data.
Our findings confirm that pre-trained layers can adapt to new embedding spaces, enhancing performance without overfitting.
arXiv Detail & Related papers (2024-06-27T04:00:15Z)
- Online Matching: A Real-time Bandit System for Large-scale Recommendations [23.954049092470548]
Online Matching is a scalable closed-loop bandit system learning from users' direct feedback on items in real time.
Diag-LinUCB is a novel extension of the LinUCB algorithm that enables distributed updates of bandit parameters in a scalable and timely manner.
arXiv Detail & Related papers (2023-07-29T05:46:27Z)
- Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking [58.14267480293575]
We propose a simple yet effective online learning approach for few-shot online adaptation without requiring offline training.
It allows an in-built memory retention mechanism for the model to remember the knowledge about the object seen before.
We evaluate our approach based on two networks in the online learning families for tracking, i.e., multi-layer perceptrons in RT-MDNet and convolutional neural networks in DiMP.
arXiv Detail & Related papers (2021-12-28T06:51:18Z)
- Regularizing Generative Adversarial Networks under Limited Data [88.57330330305535]
This work proposes a regularization approach for training robust GAN models on limited data.
We show a connection between the regularized loss and an f-divergence called LeCam-divergence, which we find is more robust under limited training data.
arXiv Detail & Related papers (2021-04-07T17:59:06Z)
- Fast Class-wise Updating for Online Hashing [196.14748396106955]
This paper presents a novel supervised online hashing scheme, termed Fast Class-wise Updating for Online Hashing (FCOH).
A class-wise updating method is developed to decompose the binary code learning and alternatively renew the hash functions in a class-wise fashion, which well addresses the burden on large amounts of training batches.
To further achieve online efficiency, we propose a semi-relaxation optimization, which accelerates the online training by treating different binary constraints independently.
arXiv Detail & Related papers (2020-12-01T07:41:54Z)