UserBERT: Modeling Long- and Short-Term User Preferences via
Self-Supervision
- URL: http://arxiv.org/abs/2202.07605v1
- Date: Mon, 14 Feb 2022 08:31:36 GMT
- Title: UserBERT: Modeling Long- and Short-Term User Preferences via
Self-Supervision
- Authors: Tianyu Li, Ali Cevahir, Derek Cho, Hao Gong, DuyKhuong Nguyen, Bjorn
Stenger
- Abstract summary: This paper extends the BERT model to e-commerce user data for pre-training representations in a self-supervised manner.
By viewing user actions in sequences as analogous to words in sentences, we extend the existing BERT model to user behavior data.
We propose methods for the tokenization of different types of user behavior sequences, the generation of input representation, and a novel pretext task to enable the pre-trained model to learn from its own input.
- Score: 6.8904125699168075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: E-commerce platforms generate vast amounts of customer behavior data, such as
clicks and purchases, from millions of unique users every day. However,
effectively using this data for behavior understanding tasks is challenging
because there are usually not enough labels to learn from all users in a
supervised manner. This paper extends the BERT model to e-commerce user data
for pre-training representations in a self-supervised manner. By viewing user
actions in sequences as analogous to words in sentences, we extend the existing
BERT model to user behavior data. Further, our model adopts a unified structure
to simultaneously learn from long-term and short-term user behavior, as well as
user attributes. We propose methods for the tokenization of different types of
user behavior sequences, the generation of input representation vectors, and a
novel pretext task to enable the pre-trained model to learn from its own input,
eliminating the need for labeled training data. Extensive experiments
demonstrate that the learned representations result in significant improvements
when transferred to three different real-world tasks, particularly compared to
task-specific modeling and multi-task representation learning.
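To make the self-supervised setup concrete, below is a minimal PyTorch sketch of BERT-style masked-behavior pretraining on tokenized action sequences. The vocabulary size, masking rate, and model dimensions are illustrative assumptions; this shows the generic masked-token idea, not the paper's actual tokenization scheme or its novel pretext task.

```python
# Hedged sketch: BERT-style masked-behavior pretraining on user action
# sequences. Vocabulary, masking rate, and sizes are assumptions.
import torch
import torch.nn as nn

VOCAB, PAD_ID, MASK_ID = 10_000, 0, 1  # assumed action-token vocabulary

class BehaviorEncoder(nn.Module):
    def __init__(self, d_model=128, n_layers=2, n_heads=4, max_len=256):
        super().__init__()
        self.tok = nn.Embedding(VOCAB, d_model, padding_idx=PAD_ID)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.enc = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, VOCAB)  # recovers masked action ids

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        return self.head(self.enc(self.tok(ids) + self.pos(pos)))

def masked_behavior_loss(model, ids, mask_rate=0.15):
    # Mask a random subset of non-padding actions; the labels are the
    # original tokens, so no human annotation is needed.
    mask = (torch.rand(ids.shape) < mask_rate) & (ids != PAD_ID)
    logits = model(ids.masked_fill(mask, MASK_ID))
    return nn.functional.cross_entropy(logits[mask], ids[mask])

model = BehaviorEncoder()
actions = torch.randint(2, VOCAB, (8, 64))  # 8 users x 64 tokenized actions
masked_behavior_loss(model, actions).backward()
```

Because the targets come from the input itself, the same recipe scales to all users without labels, which is the property the abstract emphasizes.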
Related papers
- A Simple Baseline for Predicting Events with Auto-Regressive Tabular Transformers [70.20477771578824]
Existing approaches to event prediction include time-aware positional embeddings, learned row and field encodings, and oversampling methods for addressing class imbalance.
We propose a simple but flexible baseline using standard autoregressive LLM-style transformers with elementary positional embeddings and a causal language modeling objective.
Our baseline outperforms existing approaches across popular datasets and can be employed for various use cases.
arXiv Detail & Related papers (2024-10-14T15:59:16Z)
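As a hedged sketch of what the autoregressive baseline above can look like, the snippet below trains a standard transformer with learned positional embeddings and a causal language-modeling objective over tokenized events; how tabular rows are mapped to token ids, and all sizes, are assumptions.

```python
# Illustrative causal LM objective over event tokens (assumed sizes).
import torch
import torch.nn as nn

V, d, L = 5_000, 128, 32
embed, pos = nn.Embedding(V, d), nn.Embedding(L, d)
layer = nn.TransformerEncoderLayer(d, 4, batch_first=True)
encoder, head = nn.TransformerEncoder(layer, 2), nn.Linear(d, V)

ids = torch.randint(0, V, (4, L + 1))          # 4 users, L+1 event tokens
x = embed(ids[:, :-1]) + pos(torch.arange(L))
causal = nn.Transformer.generate_square_subsequent_mask(L)
h = encoder(x, mask=causal)                    # position t attends to <= t
loss = nn.functional.cross_entropy(            # ...and predicts token t+1
    head(h).reshape(-1, V), ids[:, 1:].reshape(-1))
```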
- USE: Dynamic User Modeling with Stateful Sequence Models [26.74966828348815]
User Stateful Embedding (USE) generates user embeddings without the need for exhaustive reprocessing.
We introduce a novel training objective named future W-behavior prediction to transcend the limitations of next-token prediction.
We conduct experiments on 8 downstream tasks using Snapchat users' behavioral logs in both static (i.e., fixed user behavior sequences) and dynamic (i.e., periodically updated user behavior sequences) settings.
arXiv Detail & Related papers (2024-03-20T07:05:19Z)
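One plausible reading of future W-behavior prediction is sketched below: instead of scoring only the next token, every position predicts the multi-hot set of behaviors occurring in its next W steps. The window size and this exact formulation are assumptions, not details from the paper.

```python
# Assumed formulation: multi-label prediction of the behaviors in the
# next W steps, from the causal state at each position.
import torch
import torch.nn as nn

def future_w_loss(hidden, ids, proj, W=8):
    # hidden: (B, L, d) states from any causal user encoder
    # ids:    (B, L) behavior token ids; proj: nn.Linear(d, vocab)
    B, L = ids.shape
    logits = proj(hidden[:, : L - W])                    # positions 0..L-W-1
    target = torch.zeros_like(logits)
    for t in range(L - W):
        target[:, t].scatter_(1, ids[:, t + 1 : t + 1 + W], 1.0)  # multi-hot
    return nn.functional.binary_cross_entropy_with_logits(logits, target)

hidden, ids = torch.randn(2, 32, 64), torch.randint(0, 500, (2, 32))
loss = future_w_loss(hidden, ids, proj=nn.Linear(64, 500))
```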
- PUNR: Pre-training with User Behavior Modeling for News Recommendation [26.349183393252115]
News recommendation aims to predict click behaviors based on user behaviors.
How to effectively model the user representations is the key to recommending preferred news.
We propose an unsupervised pre-training paradigm with two tasks, i.e., user behavior masking and user behavior generation.
arXiv Detail & Related papers (2023-04-25T08:03:52Z)
- Latent User Intent Modeling for Sequential Recommenders [92.66888409973495]
Sequential recommender models learn to predict the next items a user is likely to interact with based on his/her interaction history on the platform.
Most sequential recommenders however lack a higher-level understanding of user intents, which often drive user behaviors online.
Intent modeling is thus critical for understanding users and optimizing long-term user experience.
arXiv Detail & Related papers (2022-11-17T19:00:24Z)
- Learning Large-scale Universal User Representation with Sparse Mixture of Experts [1.2722697496405464]
We propose SUPERMOE, a generic framework to obtain high-quality user representations from multiple tasks.
Specifically, the user behavior sequences are encoded by an MoE transformer, and we can thus increase the model capacity to billions of parameters.
To deal with the seesaw phenomenon when learning across multiple tasks, we design a new loss function with task indicators.
arXiv Detail & Related papers (2022-07-11T06:19:03Z)
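As one illustration of a loss with task indicators, the sketch below routes each example only through the head of its own task, so no task's gradients swamp another's; this is an assumed reading of the seesaw mitigation, not SUPERMOE's actual loss, and the MoE transformer encoder itself is omitted.

```python
# Assumed task-indicator loss: each example contributes only to the loss
# of the task it belongs to.
import torch
import torch.nn as nn

def task_indicator_loss(user_vecs, labels, task_ids, heads):
    total = torch.zeros(())
    for t, head in enumerate(heads):
        sel = task_ids == t                     # indicator for task t
        if sel.any():
            total = total + nn.functional.cross_entropy(
                head(user_vecs[sel]), labels[sel])
    return total

heads = nn.ModuleList([nn.Linear(64, 2), nn.Linear(64, 5)])  # 2 toy tasks
loss = task_indicator_loss(torch.randn(16, 64),
                           torch.randint(0, 2, (16,)),
                           torch.randint(0, 2, (16,)), heads)
```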
- Multi-Behavior Sequential Recommendation with Temporal Graph Transformer [66.10169268762014]
We tackle dynamic user-item relation learning with awareness of multi-behavior interactive patterns.
We propose a new Temporal Graph Transformer (TGT) recommendation framework to jointly capture dynamic short-term and long-range user-item interactive patterns.
arXiv Detail & Related papers (2022-06-06T15:42:54Z)
- PinnerFormer: Sequence Modeling for User Representation at Pinterest [60.335384724891746]
We introduce PinnerFormer, a user representation trained to predict a user's future long-term engagement.
Unlike prior approaches, we adapt our modeling to a batch infrastructure via our new dense all-action loss.
We show that by doing so, we significantly close the gap between batch user embeddings that are generated once a day and realtime user embeddings generated whenever a user takes an action.
arXiv Detail & Related papers (2022-05-09T18:26:51Z)
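A rough reconstruction of a dense all-action style objective follows: the user state at every position is trained to score all of that user's future positive actions highly, against in-batch negatives. This is an interpretation for illustration, not Pinterest's production loss.

```python
# Assumed dense all-action objective with in-batch negatives.
import torch
import torch.nn as nn

def dense_all_action_loss(states, future_actions, temp=0.1):
    # states: (B, L, d) per-position user states
    # future_actions: (B, K, d) embeddings of K future positive actions
    s = nn.functional.normalize(states, dim=-1)
    a = nn.functional.normalize(future_actions, dim=-1)
    B, L, _ = s.shape
    K = a.size(1)
    logits = torch.einsum('bld,ckd->blck', s, a).reshape(B * L, B * K) / temp
    soft = torch.zeros(B * L, B * K)            # each position's targets are
    for b in range(B):                          # its own user's K actions
        soft[b * L:(b + 1) * L, b * K:(b + 1) * K] = 1.0 / K
    return -(soft * nn.functional.log_softmax(logits, 1)).sum() / (B * L)

loss = dense_all_action_loss(torch.randn(4, 16, 32), torch.randn(4, 8, 32))
```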
- Incremental user embedding modeling for personalized text classification [12.381095398791352]
Individual user profiles and interaction histories play a significant role in providing customized experiences in real-world applications.
We propose an incremental user embedding modeling approach, in which embeddings of a user's recent interaction histories are dynamically integrated into the accumulated history vectors.
We demonstrate the effectiveness of this approach by applying it to a personalized multi-class classification task based on the Reddit dataset.
arXiv Detail & Related papers (2022-02-13T17:33:35Z)
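The incremental idea can be illustrated with a running user vector that folds each new session embedding into the accumulated state without reprocessing the full history; the exponential moving average below is an assumed integration rule chosen for simplicity, not the paper's mechanism.

```python
# Assumed incremental update: fold new session embeddings into a running
# user vector via an exponential moving average.
import torch

class IncrementalUserEmbedding:
    def __init__(self, dim, alpha=0.2):
        self.state = torch.zeros(dim)   # accumulated history vector
        self.alpha = alpha              # weight on the newest interactions

    def update(self, session_emb):
        # Integrate recent interactions without reprocessing old ones.
        self.state = (1 - self.alpha) * self.state + self.alpha * session_emb
        return self.state

user = IncrementalUserEmbedding(dim=64)
for _ in range(3):                      # three sessions arrive over time
    vec = user.update(torch.randn(64))
```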
- Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation [56.12499090935242]
This work proposes a Knowledge-Enhanced Hierarchical Graph Transformer Network (KHGT) to investigate multi-typed interactive patterns between users and items in recommender systems.
KHGT is built upon a graph-structured neural architecture to capture type-specific behavior characteristics.
We show that KHGT consistently outperforms many state-of-the-art recommendation methods across various evaluation settings.
arXiv Detail & Related papers (2021-10-08T09:44:00Z)
- Multi-Task Self-Training for Learning General Representations [97.01728635294879]
Multi-task self-training (MuST) harnesses the knowledge in independent specialized teacher models to train a single general student model.
MuST is scalable with unlabeled or partially labeled datasets and outperforms both specialized supervised models and self-supervised models when training on large scale datasets.
arXiv Detail & Related papers (2021-08-25T17:20:50Z)
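Schematically, multi-task self-training proceeds as in the sketch below: frozen specialized teachers pseudo-label an unlabeled batch, and a single student with one head per task trains on those pseudo-labels. The toy architectures and task names are placeholders, not MuST's actual models.

```python
# Schematic MuST-style step: teachers pseudo-label, one student learns all.
import torch
import torch.nn as nn

teachers = {t: nn.Linear(32, 10) for t in ('task_a', 'task_b')}  # placeholders
backbone = nn.Linear(32, 64)                       # shared student backbone
heads = nn.ModuleDict({t: nn.Linear(64, 10) for t in teachers})

x = torch.randn(16, 32)                            # unlabeled batch
with torch.no_grad():                              # teachers -> pseudo-labels
    pseudo = {t: m(x).argmax(dim=1) for t, m in teachers.items()}

feats = torch.relu(backbone(x))
loss = sum(nn.functional.cross_entropy(heads[t](feats), pseudo[t])
           for t in teachers)                      # joint multi-task loss
loss.backward()
```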
- Exploiting Behavioral Consistence for Universal User Representation [11.290137806288191]
We focus on developing a universal user representation model.
The obtained universal representations are expected to contain rich information.
We propose Self-supervised User Modeling Network (SUMN) to encode behavior data into the universal representation.
arXiv Detail & Related papers (2020-12-11T06:10:14Z)