Related papers: DualGR: Generative Retrieval with Long and Short-Term Interests Modeling

URL: http://arxiv.org/abs/2511.12518v1
Date: Sun, 16 Nov 2025 09:20:54 GMT
Title: DualGR: Generative Retrieval with Long and Short-Term Interests Modeling
Authors: Zhongchao Yi, Kai Feng, Xiaojian Ma, Yalong Wang, Yongqi Liu, Han Li, Zhengyang Zhou, Yang Wang
Abstract summary: Generative Retrieval (GR) has emerged as a viable alternative to Embedding-Based Retrieval (EBR). We propose DualGR, a generative retrieval framework that explicitly models dual horizons of user interests with selective activation. Online A/B testing shows +0.527% video views and +0.432% watch time lifts, validating DualGR as a practical and effective paradigm for industrial generative retrieval.
Score: 23.123644321765607
License: http://creativecommons.org/licenses/by/4.0/
Abstract: In large-scale industrial recommendation systems, retrieval must produce high-quality candidates from massive corpora under strict latency constraints. Recently, Generative Retrieval (GR) has emerged as a viable alternative to Embedding-Based Retrieval (EBR). GR quantizes items into a finite token space and decodes candidates autoregressively, providing a scalable path that explicitly models target-history interactions via cross-attention. However, three challenges persist: 1) how to balance users' long-term and short-term interests, 2) noise interference when generating hierarchical semantic IDs (SIDs), and 3) the absence of explicit modeling for negative feedback such as exposed items without clicks. To address these challenges, we propose DualGR, a generative retrieval framework that explicitly models dual horizons of user interests with selective activation. Specifically, DualGR uses a Dual-Branch Long/Short-Term Router (DBR) to cover both stable preferences and transient intents by explicitly modeling users' long- and short-term behaviors. Meanwhile, Search-based SID Decoding (S2D) is presented to control context-induced noise and enhance computational efficiency by constraining candidate interactions to the current coarse (level-1) bucket during fine-grained (level-2/3) SID prediction. Finally, we propose an Exposure-aware Next-Token Prediction Loss (ENTP-Loss) that treats "exposed-but-unclicked" items as hard negatives at level-1, enabling timely interest fade-out. DualGR has been deployed on the large-scale Kuaishou short-video recommendation system, where online A/B testing shows +0.527% video views and +0.432% watch time lifts, validating DualGR as a practical and effective paradigm for industrial generative retrieval.
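The abstract describes S2D and ENTP-Loss only at a high level, so the following is a minimal Python sketch of one plausible reading, not the authors' implementation. The three-level SID tuple layout, the function names (bucket_context, exposure_aware_loss), the weight alpha, and the unlikelihood-style penalty term are all assumptions introduced for illustration; the paper's actual formulation may differ.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical layout: each item is a (level-1, level-2, level-3) semantic ID.
SID = Tuple[int, int, int]


def bucket_context(history: Sequence[SID], level1: int) -> List[SID]:
    """S2D-style context selection (assumed reading): once the coarse level-1
    token is decoded, keep only history items from that same level-1 bucket
    as context for predicting the level-2/3 tokens."""
    return [sid for sid in history if sid[0] == level1]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exposure_aware_loss(
    logits: Sequence[float],
    target: int,
    exposed_unclicked: Sequence[int],
    alpha: float = 0.5,
) -> float:
    """Level-1 next-token loss with exposed-but-unclicked tokens as negatives.

    Assumed form: standard cross-entropy on the clicked target plus an
    unlikelihood-style penalty -alpha * log(1 - p) for each level-1 token
    that was exposed without a click. This penalty is a stand-in chosen for
    the sketch, not the formula from the paper.
    """
    probs = softmax(logits)
    loss = -math.log(probs[target])
    for neg in set(exposed_unclicked) - {target}:
        # Clamp to avoid log(0) when a negative token has probability ~1.
        loss += -alpha * math.log(max(1.0 - probs[neg], 1e-12))
    return loss


if __name__ == "__main__":
    history = [(1, 2, 3), (4, 5, 6), (1, 7, 8)]
    print(bucket_context(history, level1=1))
    print(exposure_aware_loss([2.0, 1.0, 0.5, 0.1], target=0, exposed_unclicked=[1]))
```

Under these assumptions, filtering the history shrinks the context that the fine-grained decoding steps attend to, and the penalty term pushes probability mass away from level-1 buckets the user was shown but ignored.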

Related papers

Beyond the Flat Sequence: Hierarchical and Preference-Aware Generative Recommendations [35.58864660038236]
We propose a novel framework named HPGR (Hierarchical and Preference-aware Generative Recommender). First, a structure-aware pre-training stage employs a session-based Masked Item Modeling objective to learn a hierarchically-informed and semantically rich item representation space. Second, a preference-aware fine-tuning stage leverages these powerful representations to implement a Preference-Guided Sparse Attention mechanism.
arXiv Detail & Related papers (2026-03-01T08:15:34Z)
DeepInterestGR: Mining Deep Multi-Interest Using Multi-Modal LLMs for Generative Recommendation [0.0]
DeepInterestGR introduces three key innovations to the generative recommendation framework. We leverage Multi-LLM Interest Mining, Reward-Labeled Deep Interest, and Interest-Enhanced Item Discretization. Experiments on three Amazon Review benchmarks demonstrate that DeepInterestGR consistently outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2026-02-21T17:03:06Z)
GEMs: Breaking the Long-Sequence Barrier in Generative Recommendation with a Multi-Stream Decoder [54.64137490632567]
We propose a novel and unified framework designed to capture users' sequences from long-term history. Generative Multi-streamers (GEMs) break user sequences into three streams. Extensive experiments on large-scale industrial datasets demonstrate that GEMs significantly outperform state-of-the-art methods in recommendation accuracy.
arXiv Detail & Related papers (2026-02-14T06:42:56Z)
Climber-Pilot: A Non-Myopic Generative Recommendation Model Towards Better Instruction-Following [19.550149895505683]
We present Climber-Pilot, a unified generative retrieval framework. We introduce Time-Aware Multi-Item Prediction (TAMIP), a novel training paradigm designed to mitigate inherent myopia in generative retrieval. We also propose Condition-Guided Sparse Attention (CGSA), which incorporates business constraints directly into the generative process via sparse attention.
arXiv Detail & Related papers (2026-02-14T03:46:06Z)
R2LED: Equipping Retrieval and Refinement in Lifelong User Modeling with Semantic IDs for CTR Prediction [23.668401664583758]
We propose a novel paradigm that equips retrieval and refinement in Lifelong User Modeling with SEmantic IDs (R2LED). First, we introduce a Multi-route Mixed Retrieval for the retrieval stage. On the other hand, a mixed retrieval mechanism is proposed to efficiently retrieve candidates from both collaborative and semantic views. For refinement, we design a Bi-level Fusion Refinement, including a target-aware cross-attention for route-level fusion and a gate mechanism for SID-level fusion.
arXiv Detail & Related papers (2026-02-06T11:27:20Z)
GLASS: A Generative Recommender for Long-sequence Modeling via SID-Tier and Semantic Search [51.44490997013772]
GLASS is a novel framework that integrates long-term user interests into the generative process via SID-Tier and Semantic Search. We show that GLASS outperforms state-of-the-art baselines in experiments on two large-scale real-world datasets.
arXiv Detail & Related papers (2026-02-05T13:48:33Z)
Controllable Graph Generation with Diffusion Models via Inference-Time Tree Search Guidance [36.29334590991777]
Graph generation is a fundamental problem in graph learning with broad applications across Web-scale systems, knowledge graphs, and scientific domains such as drug and material discovery. Recent approaches leverage diffusion models for step-by-step generation, yet unconditional diffusion offers little control over desired properties, often leading to unstable quality and difficulty in incorporating new objectives. Inference-time guidance methods mitigate these issues by adjusting the sampling process without retraining, but they remain inherently local and limited in controllability. We propose TreeDiff, a Monte Carlo Tree Search (MCTS) guided dual-space diffusion framework for controllable graph generation.
arXiv Detail & Related papers (2025-10-12T01:40:33Z)
Modeling Long-term User Behaviors with Diffusion-driven Multi-interest Network for CTR Prediction [18.302602011055775]
We propose DiffuMIN (Diffusion-driven Multi-Interest Network) to model long-term user behaviors. We show that DiffuMIN increased CTR by 1.52% and CPM by 1.10% in online A/B testing.
arXiv Detail & Related papers (2025-08-21T07:10:01Z)
ENCODE: Breaking the Trade-Off Between Performance and Efficiency in Long-Term User Behavior Modeling [12.963611514800656]
We propose an efficient two-stage long-term sequence modeling approach named EfficieNt Clustering based twO-stage interest moDEling (ENCODE). In the offline extraction stage, ENCODE clusters the entire behavior sequence and extracts accurate interests. In the online inference stage, ENCODE takes the off-the-shelf user interests to predict the associations with target items.
arXiv Detail & Related papers (2025-08-19T06:58:21Z)
VISTA: Unsupervised 2D Temporal Dependency Representations for Time Series Anomaly Detection [42.694234312755285]
Time Series Anomaly Detection (TSAD) is essential for uncovering rare and potentially harmful events in unlabeled time series data. We introduce VISTA, a training-free, unsupervised TSAD algorithm designed to overcome these challenges.
arXiv Detail & Related papers (2025-04-03T11:20:49Z)
Multi-granularity Interest Retrieval and Refinement Network for Long-Term User Behavior Modeling in CTR Prediction [68.90783662117936]
Click-through Rate (CTR) prediction is crucial for online personalization platforms. Recent advancements have shown that modeling rich user behaviors can significantly improve the performance of CTR prediction. We propose the Multi-granularity Interest Retrieval and Refinement Network (MIRRN).
arXiv Detail & Related papers (2024-11-22T15:29:05Z)
Long-Sequence Recommendation Models Need Decoupled Embeddings [49.410906935283585]
We identify and characterize a neglected deficiency in existing long-sequence recommendation models. A single set of embeddings struggles with learning both attention and representation, leading to interference between these two processes. We propose the Decoupled Attention and Representation Embeddings (DARE) model, where two distinct embedding tables are learned separately to fully decouple attention and representation.
arXiv Detail & Related papers (2024-10-03T15:45:15Z)
Diffusion Recommender Model [85.9640416600725]
We propose a novel Diffusion Recommender Model (named DiffRec) to learn the generative process in a denoising manner. To retain personalized information in user interactions, DiffRec reduces the added noise and avoids corrupting users' interactions into pure noise as is done in image synthesis.
arXiv Detail & Related papers (2023-04-11T04:31:00Z)
BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation [85.13713217986738]
We present BSN++, a new framework that exploits a complementary boundary regressor and relation modeling for temporal proposal generation. The proposed BSN++ ranked first on the CVPR19 ActivityNet challenge leaderboard for the temporal action localization task.
arXiv Detail & Related papers (2020-09-15T07:08:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.