Attention Weighted Mixture of Experts with Contrastive Learning for
Personalized Ranking in E-commerce
- URL: http://arxiv.org/abs/2306.05011v1
- Date: Thu, 8 Jun 2023 07:59:08 GMT
- Title: Attention Weighted Mixture of Experts with Contrastive Learning for
Personalized Ranking in E-commerce
- Authors: Juan Gong, Zhenlin Chen, Chaoyi Ma, Zhuojian Xiao, Haonan Wang, Guoyu
Tang, Lin Liu, Sulong Xu, Bo Long, Yunjiang Jiang
- Abstract summary: We propose Attention Weighted Mixture of Experts (AW-MoE) with contrastive learning for personalized ranking.
AW-MoE has been successfully deployed in the JD e-commerce search engine.
- Score: 21.7796124109
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ranking models play an essential role in e-commerce search and
recommendation. An effective ranking model should give a personalized ranking
list for each user according to the user preference. Existing algorithms
usually extract a user representation vector from the user behavior sequence,
then feed the vector into a feed-forward network (FFN) together with other
features for feature interactions, and finally produce a personalized ranking
score. Despite tremendous progress in the past, there is still room for
improvement. Firstly, the personalized patterns of feature interactions for
different users are not explicitly modeled. Secondly, most existing
algorithms produce poor personalized ranking results for long-tail users with few
historical behaviors due to data sparsity. To overcome these two challenges,
we propose Attention Weighted Mixture of Experts (AW-MoE) with contrastive
learning for personalized ranking. Firstly, AW-MoE leverages the MoE framework
to capture personalized feature interactions for different users. To model the
user preference, the user behavior sequence is simultaneously fed into expert
networks and the gate network. Within the gate network, one gate unit and one
activation unit are designed to adaptively learn the fine-grained activation
vector for experts using an attention mechanism. Secondly, a random masking
strategy is applied to the user behavior sequence to simulate long-tail users,
and an auxiliary contrastive loss is imposed on the output of the gate network
to improve the model generalization for these users. This is validated by a
higher performance gain on the long-tail user test set. Experimental results on a
JD real production dataset and a public dataset demonstrate the effectiveness
of AW-MoE, which significantly outperforms state-of-the-art methods. Notably,
AW-MoE has been successfully deployed in the JD e-commerce search engine, ...
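The abstract's two ideas, an attention-weighted gate over experts and a contrastive loss on the gate output under random masking, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation. Every name here (`gate_network`, `aw_moe_score`, `random_mask`, `contrastive_loss`) is hypothetical, and the exact forms are assumptions: the activation unit is a softmax over behavior-target dot products, the gate unit and the experts are single linear maps, the gate output is softmax-normalized, and the contrastive loss is an InfoNCE-style loss with in-batch negatives.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gate_network(behaviors, target, gate_weights):
    """Return one activation value per expert for a (user, item) pair."""
    # Activation unit (assumed form): attention of each behavior to the target.
    attn = softmax([dot(b, target) for b in behaviors])
    # Gate unit (assumed form): a linear score per expert for each behavior,
    # summed over behaviors with the attention weights.
    activation = [0.0] * len(gate_weights)
    for a, b in zip(attn, behaviors):
        for k, w in enumerate(gate_weights):
            activation[k] += a * dot(w, b)
    return activation


def aw_moe_score(behaviors, target, gate_weights, expert_weights):
    """Mix expert outputs with the gate's activation vector."""
    gate = softmax(gate_network(behaviors, target, gate_weights))
    # Toy experts: linear units over mean-pooled behaviors plus the target.
    pooled = [sum(col) / len(behaviors) for col in zip(*behaviors)]
    features = pooled + target
    expert_out = [dot(w, features) for w in expert_weights]
    return sum(g * e for g, e in zip(gate, expert_out))


def random_mask(behaviors, keep_prob, rng):
    """Drop behaviors at random to imitate a long-tail user; keep at least one."""
    kept = [b for b in behaviors if rng.random() < keep_prob]
    return kept or [rng.choice(behaviors)]


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)) + 1e-12)


def contrastive_loss(gates, masked_gates, temperature=0.1):
    """InfoNCE-style loss: user i's masked gate output should be closest to
    user i's full gate output among all users in the batch."""
    loss = 0.0
    for i, g in enumerate(masked_gates):
        probs = softmax([cosine(g, other) / temperature for other in gates])
        loss -= math.log(probs[i])
    return loss / len(masked_gates)


if __name__ == "__main__":
    rng = random.Random(0)
    dim, n_experts, n_users = 4, 3, 5
    target = [rng.uniform(-1, 1) for _ in range(dim)]
    gate_w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_experts)]
    expert_w = [[rng.uniform(-1, 1) for _ in range(2 * dim)] for _ in range(n_experts)]
    users = [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(8)]
             for _ in range(n_users)]
    full = [gate_network(u, target, gate_w) for u in users]
    masked = [gate_network(random_mask(u, 0.5, rng), target, gate_w) for u in users]
    print("ranking score:", aw_moe_score(users[0], target, gate_w, expert_w))
    print("auxiliary loss:", contrastive_loss(full, masked))
```

In a trainable version, the auxiliary loss would be added to the main ranking loss with a weighting coefficient, and the experts would be feed-forward networks rather than single linear units.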
Related papers
- Multi-granularity Interest Retrieval and Refinement Network for Long-Term User Behavior Modeling in CTR Prediction [68.90783662117936]
Click-through Rate (CTR) prediction is crucial for online personalization platforms.
Recent advancements have shown that modeling rich user behaviors can significantly improve the performance of CTR prediction.
We propose Multi-granularity Interest Retrieval and Refinement Network (MIRRN)
arXiv Detail & Related papers (2024-11-22T15:29:05Z)
- USE: Dynamic User Modeling with Stateful Sequence Models [26.74966828348815]
User Stateful Embedding (USE) generates user embeddings without the need for exhaustive reprocessing.
We introduce a novel training objective named future W-behavior prediction to transcend the limitations of next-token prediction.
We conduct experiments on 8 downstream tasks using Snapchat users' behavioral logs in both static (i.e., fixed user behavior sequences) and dynamic (i.e., periodically updated user behavior sequences) settings.
arXiv Detail & Related papers (2024-03-20T07:05:19Z)
- AdaptSSR: Pre-training User Model with Augmentation-Adaptive Self-Supervised Ranking [19.1857792382924]
We propose Augmentation-Adaptive Self-Supervised Ranking (AdaptSSR) to replace the contrastive learning task.
We adopt a multiple pairwise ranking loss which trains the user model to capture the similarity orders between the implicitly augmented view, the explicitly augmented view, and views from other users.
Experiments on both public and industrial datasets with six downstream tasks verify the effectiveness of AdaptSSR.
arXiv Detail & Related papers (2023-10-15T02:19:28Z)
- TransAct: Transformer-based Realtime User Action Model for Recommendation at Pinterest [17.247452803197362]
This paper presents Pinterest's ranking architecture for Homefeed.
We propose TransAct, a sequential model that extracts users' short-term preferences from their realtime activities.
We describe the results of ablation studies, the challenges we faced during productionization, and the outcome of an online A/B experiment.
arXiv Detail & Related papers (2023-05-31T23:45:29Z)
- Neighbor Based Enhancement for the Long-Tail Ranking Problem in Video Rank Models [0.0]
We propose a novel neighbor enhancement structure to help train the representation of the target user or item.
Experiments on the well-known public dataset MovieLens 1M demonstrate the efficiency of the method.
arXiv Detail & Related papers (2023-02-16T07:38:51Z)
- Latent User Intent Modeling for Sequential Recommenders [92.66888409973495]
Sequential recommender models learn to predict the next items a user is likely to interact with based on his/her interaction history on the platform.
Most sequential recommenders however lack a higher-level understanding of user intents, which often drive user behaviors online.
Intent modeling is thus critical for understanding users and optimizing long-term user experience.
arXiv Detail & Related papers (2022-11-17T19:00:24Z)
- Meta-Wrapper: Differentiable Wrapping Operator for User Interest Selection in CTR Prediction [97.99938802797377]
Click-through rate (CTR) prediction, whose goal is to predict the probability that a user clicks on an item, has become increasingly significant in recommender systems.
Recent deep learning models with the ability to automatically extract the user interest from his/her behaviors have achieved great success.
We propose a novel approach under the framework of the wrapper method, which is named Meta-Wrapper.
arXiv Detail & Related papers (2022-06-28T03:28:15Z)
- PinnerFormer: Sequence Modeling for User Representation at Pinterest [60.335384724891746]
We introduce PinnerFormer, a user representation trained to predict a user's future long-term engagement.
Unlike prior approaches, we adapt our modeling to a batch infrastructure via our new dense all-action loss.
We show that by doing so, we significantly close the gap between batch user embeddings that are generated once a day and realtime user embeddings generated whenever a user takes an action.
arXiv Detail & Related papers (2022-05-09T18:26:51Z)
- Modeling Dynamic User Preference via Dictionary Learning for Sequential Recommendation [133.8758914874593]
Capturing the dynamics in user preference is crucial to better predict users' future behaviors because user preferences often drift over time.
Many existing recommendation algorithms -- including both shallow and deep ones -- often model such dynamics independently.
This paper considers the problem of embedding a user's sequential behavior into the latent space of user preferences.
arXiv Detail & Related papers (2022-04-02T03:23:46Z)
- Sequence Adaptation via Reinforcement Learning in Recommender Systems [8.909115457491522]
We propose the SAR model, which learns the sequential patterns and adjusts the sequence length of user-item interactions in a personalized manner.
In addition, we optimize a joint loss function to align the accuracy of the sequential recommendations with the expected cumulative rewards of the critic network.
Our experimental evaluation on four real-world datasets demonstrates the superiority of our proposed model over several baseline approaches.
arXiv Detail & Related papers (2021-07-31T13:56:46Z)
- Dynamic Graph Collaborative Filtering [64.87765663208927]
Dynamic recommendation is essential for recommender systems to provide real-time predictions based on sequential data.
Here we propose Dynamic Graph Collaborative Filtering (DGCF), a novel framework leveraging dynamic graphs to capture collaborative and sequential relations.
Our approach achieves higher performance when the dataset contains less action repetition, indicating the effectiveness of integrating dynamic collaborative information.
arXiv Detail & Related papers (2021-01-08T04:16:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.