Related papers: Search-based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction

URL: http://arxiv.org/abs/2006.05639v2
Date: Mon, 29 Jun 2020 03:27:18 GMT
Title: Search-based User Interest Modeling with Lifelong Sequential Behavior Data for Click-Through Rate Prediction
Authors: Pi Qi, Xiaoqiang Zhu, Guorui Zhou, Yujing Zhang, Zhe Wang, Lejian Ren, Ying Fan, and Kun Gai
Abstract summary: We propose a new modeling paradigm, which we name the Search-based Interest Model (SIM). SIM extracts user interests with two cascaded search units. Since 2019, SIM has been deployed in the display advertising system at Alibaba, bringing a 7.1% CTR lift and a 4.4% RPM lift.
Score: 23.460147230576855
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Rich user behavior data has been proven to be of great value for click-through rate prediction tasks, especially in industrial applications such as recommender systems and online advertising. Both industry and academia have paid much attention to this topic and have proposed different approaches to modeling with long sequential user behavior data. Among them, the memory-network-based model MIMN, proposed by Alibaba, achieves SOTA with the co-design of both learning algorithm and serving system. MIMN is the first industrial solution that can model sequential user behavior data with length scaling up to 1000. However, MIMN fails to precisely capture user interests given a specific candidate item when the length of the user behavior sequence increases further, say, by 10 times or more. This challenge exists widely in previously proposed approaches. In this paper, we tackle this problem by designing a new modeling paradigm, which we name the Search-based Interest Model (SIM). SIM extracts user interests with two cascaded search units: (i) the General Search Unit acts as a general search over the raw and arbitrarily long sequential behavior data, with query information from the candidate item, and gets a Sub user Behavior Sequence (SBS) which is relevant to the candidate item; (ii) the Exact Search Unit models the precise relationship between the candidate item and the SBS. This cascaded search paradigm gives SIM a better ability to model lifelong sequential behavior data in both scalability and accuracy. Apart from the learning algorithm, we also introduce our hands-on experience on how to implement SIM in large-scale industrial systems. Since 2019, SIM has been deployed in the display advertising system at Alibaba, bringing a 7.1% CTR lift and a 4.4% RPM lift, which is significant to the business. Serving the main traffic in our real system now, SIM models user behavior data with maximum length reaching up to 54000, pushing SOTA to 54x.
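Illustrative sketch (not from the paper): the two cascaded search units described in the abstract can be pictured as a cheap filter followed by attention over the survivors. The abstract does not specify how either unit is implemented, so everything below is an assumption made for illustration: the `Behavior` record, the use of a category id as the search key, the `top_k` cutoff, and dot-product softmax attention as the "exact" step.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Behavior:
    item_id: int
    category: int            # hypothetical search key, not named in the abstract
    embedding: List[float]   # hypothetical item embedding


def general_search(history: List[Behavior], candidate: Behavior, top_k: int) -> List[Behavior]:
    """Stage 1 (assumed form): keep behaviors sharing the candidate's category,
    then the most recent top_k of them. History is ordered oldest to newest."""
    if top_k <= 0:
        return []
    matched = [b for b in history if b.category == candidate.category]
    return matched[-top_k:]


def exact_search(sbs: List[Behavior], candidate: Behavior) -> List[float]:
    """Stage 2 (assumed form): softmax dot-product attention of the candidate
    over the sub-sequence, returning a weighted sum of behavior embeddings."""
    dim = len(candidate.embedding)
    if not sbs:
        return [0.0] * dim
    scores = [sum(q * k for q, k in zip(candidate.embedding, b.embedding)) for b in sbs]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [
        sum((w / total) * b.embedding[d] for w, b in zip(weights, sbs))
        for d in range(dim)
    ]


def user_interest(history: List[Behavior], candidate: Behavior, top_k: int = 2) -> List[float]:
    """Cascade: shrink the long history first, then attend over the short result."""
    return exact_search(general_search(history, candidate, top_k), candidate)
```

The point of the cascade in this sketch is cost: the first stage touches every behavior but does only an equality check, while the more expensive attention step runs over at most `top_k` items regardless of how long the full history is.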

Related papers

User Long-Term Multi-Interest Retrieval Model for Recommendation [20.2928687653132]
We propose a new framework named User Long-term Multi-Interest Retrieval Model (ULIM), which enables thousand-scale behavior modeling in retrieval stages. We show that ULIM achieves substantial improvement over state-of-the-art methods, and brings 5.54% clicks, 11.01% orders and 4.03% GMV lift for Taobaomiaosha, a notable mini-app of Taobao.
arXiv Detail & Related papers (2025-07-14T09:32:26Z)
CRM: Retrieval Model with Controllable Condition [23.936944737868465]
Controllable Retrieval Model integrates regression information as conditional features into the two-tower retrieval paradigm. We validate the effectiveness of CRM through real-world A/B testing and demonstrate its successful deployment in Kuaishou short-video recommendation system.
arXiv Detail & Related papers (2024-12-18T13:37:36Z)
Multi-granularity Interest Retrieval and Refinement Network for Long-Term User Behavior Modeling in CTR Prediction [68.90783662117936]
Click-through Rate (CTR) prediction is crucial for online personalization platforms. Recent advancements have shown that modeling rich user behaviors can significantly improve the performance of CTR prediction. We propose Multi-granularity Interest Retrieval and Refinement Network (MIRRN).
arXiv Detail & Related papers (2024-11-22T15:29:05Z)
SEMINAR: Search Enhanced Multi-modal Interest Network and Approximate Retrieval for Lifelong Sequential Recommendation [16.370075234443245]
We propose a unified lifelong multi-modal sequence model called SEMINAR (Search Enhanced Multi-Modal Interest Network and Approximate Retrieval). Specifically, a network called Pretraining Search Unit learns the lifelong sequences of multi-modal query-item pairs in a pretraining-finetuning manner. To accelerate the online retrieval speed of multi-modal embedding, we propose a multi-modal codebook-based product quantization strategy.
arXiv Detail & Related papers (2024-07-15T13:33:30Z)
BASES: Large-scale Web Search User Simulation with Large Language Model based Agents [108.97507653131917]
BASES is a novel user simulation framework with large language models (LLMs). Our simulation framework can generate unique user profiles at scale, which subsequently leads to diverse search behaviors. WARRIORS is a new large-scale dataset encompassing web search user behaviors, including both Chinese and English versions.
arXiv Detail & Related papers (2024-02-27T13:44:09Z)
Recommender Systems with Generative Retrieval [58.454606442670034]
We propose a novel generative retrieval approach, where the retrieval model autoregressively decodes the identifiers of the target candidates. To that end, we create a semantically meaningful tuple of codewords to serve as a Semantic ID for each item. We show that recommender systems trained with the proposed paradigm significantly outperform the current SOTA models on various datasets.
arXiv Detail & Related papers (2023-05-08T21:48:17Z)
Neighbor Based Enhancement for the Long-Tail Ranking Problem in Video Rank Models [0.0]
We propose a novel neighbor enhancement structure to help train the representation of the target user or item. Experiments on the well-known public dataset MovieLens 1M demonstrate the efficiency of the method.
arXiv Detail & Related papers (2023-02-16T07:38:51Z)
Federated Privacy-preserving Collaborative Filtering for On-Device Next App Prediction [52.16923290335873]
We propose a novel SeqMF model to solve the problem of predicting the next app launch during mobile device usage. We modify the structure of the classical matrix factorization model and update the training procedure to sequential learning. One more ingredient of the proposed approach is a new privacy mechanism that guarantees the protection of the sent data from the users to the remote server.
arXiv Detail & Related papers (2023-02-05T10:29:57Z)
Sampling Is All You Need on Modeling Long-Term User Behaviors for CTR Prediction [15.97120392599086]
We propose SDIM (Sampling-based Deep Interest Modeling), a simple yet effective sampling-based end-to-end approach for modeling long-term user behaviors. We show theoretically and experimentally that the proposed method performs on par with standard attention-based models on modeling long-term user behaviors.
arXiv Detail & Related papers (2022-05-20T15:20:52Z)
Modeling Dynamic User Preference via Dictionary Learning for Sequential Recommendation [133.8758914874593]
Capturing the dynamics in user preference is crucial to better predict user future behaviors because user preferences often drift over time. Many existing recommendation algorithms -- including both shallow and deep ones -- often model such dynamics independently. This paper considers the problem of embedding a user's sequential behavior into the latent space of user preferences.
arXiv Detail & Related papers (2022-04-02T03:23:46Z)
Sequential Search with Off-Policy Reinforcement Learning [48.88165680363482]
We propose a highly scalable hybrid learning model that consists of an RNN learning framework and an attention model. As a novel optimization step, we fit multiple short user sequences in a single RNN pass within a training batch, by solving a greedy knapsack problem on the fly. We also explore the use of off-policy reinforcement learning in multi-session personalized search ranking.
arXiv Detail & Related papers (2022-02-01T06:52:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.