DualGR: Generative Retrieval with Long and Short-Term Interests Modeling
- URL: http://arxiv.org/abs/2511.12518v1
- Date: Sun, 16 Nov 2025 09:20:54 GMT
- Title: DualGR: Generative Retrieval with Long and Short-Term Interests Modeling
- Authors: Zhongchao Yi, Kai Feng, Xiaojian Ma, Yalong Wang, Yongqi Liu, Han Li, Zhengyang Zhou, Yang Wang,
- Abstract summary: Generative Retrieval (GR) has emerged as a viable alternative to Embedding-Based Retrieval (EBR)<n>We propose DualGR, a generative retrieval framework that explicitly models dual horizons of user interests with selective activation.<n>Online A/B testing shows +0.527% video views and +0.432% watch time lifts, validating DualGR as a practical and effective paradigm for industrial generative retrieval.
- Score: 23.123644321765607
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In large-scale industrial recommendation systems, retrieval must produce high-quality candidates from massive corpora under strict latency. Recently, Generative Retrieval (GR) has emerged as a viable alternative to Embedding-Based Retrieval (EBR), which quantizes items into a finite token space and decodes candidates autoregressively, providing a scalable path that explicitly models target-history interactions via cross-attention. However, three challenges persist: 1) how to balance users' long-term and short-term interests , 2) noise interference when generating hierarchical semantic IDs (SIDs), 3) the absence of explicit modeling for negative feedback such as exposed items without clicks. To address these challenges, we propose DualGR, a generative retrieval framework that explicitly models dual horizons of user interests with selective activation. Specifically, DualGR utilizes Dual-Branch Long/Short-Term Router (DBR) to cover both stable preferences and transient intents by explicitly modeling users' long- and short-term behaviors. Meanwhile, Search-based SID Decoding (S2D) is presented to control context-induced noise and enhance computational efficiency by constraining candidate interactions to the current coarse (level-1) bucket during fine-grained (level-2/3) SID prediction. % also reinforcing intra-class consistency. Finally, we propose an Exposure-aware Next-Token Prediction Loss (ENTP-Loss) that treats "exposed-but-unclicked" items as hard negatives at level-1, enabling timely interest fade-out. On the large-scale Kuaishou short-video recommendation system, DualGR has achieved outstanding performance. Online A/B testing shows +0.527% video views and +0.432% watch time lifts, validating DualGR as a practical and effective paradigm for industrial generative retrieval.
Related papers
- Beyond the Flat Sequence: Hierarchical and Preference-Aware Generative Recommendations [35.58864660038236]
We propose a novel framework named HPGR (Hierarchical and Preference-aware Generative Recommender)<n>First, a structure-aware pre-training stage employs a session-based Masked Item Modeling objective to learn a hierarchically-informed and semantically rich item representation space.<n>Second, a preference-aware fine-tuning stage leverages these powerful representations to implement a Preference-Guided Sparse Attention mechanism.
arXiv Detail & Related papers (2026-03-01T08:15:34Z) - DeepInterestGR: Mining Deep Multi-Interest Using Multi-Modal LLMs for Generative Recommendation [0.0]
DeepInterestGR introduces three key innovations in generative recommendation framework.<n>We leverage multi-LLM Interest Mining, Reward-Labeled Deep Interest, and Interest-Enhanced Item Discretization.<n> Experiments on three Amazon Review benchmarks demonstrate that DeepInterestGR consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2026-02-21T17:03:06Z) - GEMs: Breaking the Long-Sequence Barrier in Generative Recommendation with a Multi-Stream Decoder [54.64137490632567]
We propose a novel and unified framework designed to capture users' sequences from long-term history.<n>Generative Multi-streamers ( GEMs) break user sequences into three streams.<n>Extensive experiments on large-scale industrial datasets demonstrate that GEMs significantly outperforms state-the-art methods in recommendation accuracy.
arXiv Detail & Related papers (2026-02-14T06:42:56Z) - Climber-Pilot: A Non-Myopic Generative Recommendation Model Towards Better Instruction-Following [19.550149895505683]
We present Climber-Pilot, a unified generative retrieval framework.<n>We introduce Time-Aware Multi-Item Prediction (TAMIP), a novel training paradigm designed to mitigate inherent myopia in generative retrieval.<n>We also propose Condition-Guided Sparse Attention (CGSA), which incorporates business constraints directly into the generative process via sparse attention.
arXiv Detail & Related papers (2026-02-14T03:46:06Z) - R2LED: Equipping Retrieval and Refinement in Lifelong User Modeling with Semantic IDs for CTR Prediction [23.668401664583758]
We propose a novel paradigm that equips retrieval and refinement in Lifelong User Modeling with SEmantic IDs (R2LED)<n>First, we introduce a Multi-route Mixed Retrieval for the retrieval stage. On the other hand, a mixed retrieval mechanism is proposed to efficiently retrieve candidates from both collaborative and semantic views.<n>For refinement, we design a Bi-level Fusion Refinement, including a target-aware cross-attention for route-level fusion and a gate mechanism for SID-level fusion.
arXiv Detail & Related papers (2026-02-06T11:27:20Z) - GLASS: A Generative Recommender for Long-sequence Modeling via SID-Tier and Semantic Search [51.44490997013772]
GLASS is a novel framework that integrates long-term user interests into the generative process via SID-Tier and Semantic Search.<n>We show that GLASS outperforms state-of-the-art baselines in experiments on two large-scale real-world datasets.
arXiv Detail & Related papers (2026-02-05T13:48:33Z) - Controllable Graph Generation with Diffusion Models via Inference-Time Tree Search Guidance [36.29334590991777]
Graph generation is a fundamental problem in graph learning with broad applications across Web-scale systems, knowledge graphs, and scientific domains such as drug and material discovery.<n>Recent approaches leverage diffusion models for step-by-step generation, yet unconditional diffusion offers little control over desired properties, often leading to unstable quality and difficulty in incorporating new objectives.<n>Inference-time guidance methods mitigate these issues by adjusting the sampling process without retraining, but they remain inherently local, and limited in controllability.<n>We propose TreeDiff, a Monte Carlo Tree Search (MCTS) guided dual-space diffusion framework for controllable graph generation.
arXiv Detail & Related papers (2025-10-12T01:40:33Z) - Modeling Long-term User Behaviors with Diffusion-driven Multi-interest Network for CTR Prediction [18.302602011055775]
We propose DiffuMIN (Diffusion-driven Multi-Interest Network) to model long-term user behaviors.<n>We show that DiffuMIN increased CTR by 1.52% and CPM by 1.10% in online A/B testing.
arXiv Detail & Related papers (2025-08-21T07:10:01Z) - ENCODE: Breaking the Trade-Off Between Performance and Efficiency in Long-Term User Behavior Modeling [12.963611514800656]
We propose an efficient two-stage long-term sequence modeling approach, named as EfficieNt Clustering based twO-stage interest moDEling (ENCODE)<n>In the offline extraction stage, ENCODE clusters the entire behavior sequence and extracts accurate interests.<n>While in the online inference stage, ENCODE takes the off-the-shelf user interests to predict the associations with target items.
arXiv Detail & Related papers (2025-08-19T06:58:21Z) - VISTA: Unsupervised 2D Temporal Dependency Representations for Time Series Anomaly Detection [42.694234312755285]
Time Series Anomaly Detection (TSAD) is essential for uncovering rare and potentially harmful events in unlabeled time series data.<n>We introduce VISTA, a training-free, unsupervised TSAD algorithm designed to overcome these challenges.
arXiv Detail & Related papers (2025-04-03T11:20:49Z) - Multi-granularity Interest Retrieval and Refinement Network for Long-Term User Behavior Modeling in CTR Prediction [68.90783662117936]
Click-through Rate (CTR) prediction is crucial for online personalization platforms.<n>Recent advancements have shown that modeling rich user behaviors can significantly improve the performance of CTR prediction.<n>We propose Multi-granularity Interest Retrieval and Refinement Network (MIRRN)
arXiv Detail & Related papers (2024-11-22T15:29:05Z) - Long-Sequence Recommendation Models Need Decoupled Embeddings [49.410906935283585]
We identify and characterize a neglected deficiency in existing long-sequence recommendation models.<n>A single set of embeddings struggles with learning both attention and representation, leading to interference between these two processes.<n>We propose the Decoupled Attention and Representation Embeddings (DARE) model, where two distinct embedding tables are learned separately to fully decouple attention and representation.
arXiv Detail & Related papers (2024-10-03T15:45:15Z) - Diffusion Recommender Model [85.9640416600725]
We propose a novel Diffusion Recommender Model (named DiffRec) to learn the generative process in a denoising manner.<n>To retain personalized information in user interactions, DiffRec reduces the added noises and avoids corrupting users' interactions into pure noises like in image synthesis.
arXiv Detail & Related papers (2023-04-11T04:31:00Z) - BSN++: Complementary Boundary Regressor with Scale-Balanced Relation
Modeling for Temporal Action Proposal Generation [85.13713217986738]
We present BSN++, a new framework which exploits complementary boundary regressor and relation modeling for temporal proposal generation.
Not surprisingly, the proposed BSN++ ranked 1st place in the CVPR19 - ActivityNet challenge leaderboard on temporal action localization task.
arXiv Detail & Related papers (2020-09-15T07:08:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.