Related papers: Generative Sequential Recommendation via Hierarchical Behavior Modeling

Generative Sequential Recommendation via Hierarchical Behavior Modeling

URL: http://arxiv.org/abs/2511.03155v1
Date: Wed, 05 Nov 2025 03:27:01 GMT
Title: Generative Sequential Recommendation via Hierarchical Behavior Modeling
Authors: Zhefan Wang, Guokai Yan, Jinbei Yu, Siyu Gu, Jingyan Chen, Peng Jiang, Zhiqiang Guo, Min Zhang,
Abstract summary: We propose a novel generative framework, GAMER, built upon a decoder-only backbone.<n> GAMER introduces a cross-level interaction layer to capture hierarchical dependencies among behaviors.<n>ShortVideoAD is a large-scale multi-behavior dataset from a mainstream short-video platform.
Score: 20.156854767000475
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recommender systems in multi-behavior domains, such as advertising and e-commerce, aim to guide users toward high-value but inherently sparse conversions. Leveraging auxiliary behaviors (e.g., clicks, likes, shares) is therefore essential. Recent progress on generative recommendations has brought new possibilities for multi-behavior sequential recommendation. However, existing generative approaches face two significant challenges: 1) Inadequate Sequence Modeling: capture the complex, cross-level dependencies within user behavior sequences, and 2) Lack of Suitable Datasets: publicly available multi-behavior recommendation datasets are almost exclusively derived from e-commerce platforms, limiting the validation of feasibility in other domains, while also lacking sufficient side information for semantic ID generation. To address these issues, we propose a novel generative framework, GAMER (Generative Augmentation and Multi-lEvel behavior modeling for Recommendation), built upon a decoder-only backbone. GAMER introduces a cross-level interaction layer to capture hierarchical dependencies among behaviors and a sequential augmentation strategy that enhances robustness in training. To further advance this direction, we collect and release ShortVideoAD, a large-scale multi-behavior dataset from a mainstream short-video platform, which differs fundamentally from existing e-commerce datasets and provides pretrained semantic IDs for research on generative methods. Extensive experiments show that GAMER consistently outperforms both discriminative and generative baselines across multiple metrics.

Related papers

GLASS: A Generative Recommender for Long-sequence Modeling via SID-Tier and Semantic Search [51.44490997013772]
GLASS is a novel framework that integrates long-term user interests into the generative process via SID-Tier and Semantic Search.<n>We show that GLASS outperforms state-of-the-art baselines in experiments on two large-scale real-world datasets.
arXiv Detail & Related papers (2026-02-05T13:48:33Z)
Multimodal Generative Recommendation for Fusing Semantic and Collaborative Signals [17.608491612845306]
Sequential recommender systems rank relevant items by modeling a user's interaction history and computing the inner product between the resulting user representation and stored item embeddings.<n>To avoid the significant memory overhead of storing large item sets, the generative recommendation paradigm instead models each item as a series of discrete semantic codes.<n>These methods have yet to surpass traditional sequential recommenders on large item sets, limiting their adoption in the very scenarios they were designed to address.<n>We propose MSCGRec, a Multimodal Semantic and Collaborative Generative Recommender.
arXiv Detail & Related papers (2026-02-03T16:39:35Z)
BLADE: A Behavior-Level Data Augmentation Framework with Dual Fusion Modeling for Multi-Behavior Sequential Recommendation [15.457239237638985]
BLADE is a framework that enhances multi-behavior modeling while mitigating data sparsity.<n>We introduce a dual item-behavior fusion architecture that incorporates behavior information at both the input and intermediate levels.<n>Three behavior-level data augmentation methods operate directly on behavior sequences rather than core item sequences.
arXiv Detail & Related papers (2025-12-15T04:02:53Z)
Multi-Aspect Cross-modal Quantization for Generative Recommendation [27.92632297542123]
We propose Multi-Aspect Cross-modal quantization for generative Recommendation (MACRec)<n>We first introduce cross-modal quantization during the ID learning process, which effectively reduces conflict rates.<n>We also incorporate multi-aspect cross-modal alignments, including the implicit and explicit alignments.
arXiv Detail & Related papers (2025-11-19T04:55:14Z)
Structurally Refined Graph Transformer for Multimodal Recommendation [13.296555757708298]
We present SRGFormer, a structurally optimized multimodal recommendation model.<n>By modifying the transformer for better integration into our model, we capture the overall behavior patterns of users.<n>Then, we enhance structural information by embedding multimodal information into a hypergraph structure to aid in learning the local structures between users and items.
arXiv Detail & Related papers (2025-11-01T15:18:00Z)
Merge and Guide: Unifying Model Merging and Guided Decoding for Controllable Multi-Objective Generation [49.98025799046136]
We introduce Merge-And-GuidE, a two-stage framework that leverages model merging for guided decoding.<n>In Stage 1, MAGE resolves a compatibility problem between the guidance and base models.<n>In Stage 2, we merge explicit and implicit value models into a unified guidance proxy, which then steers the decoding of the base model from Stage 1.
arXiv Detail & Related papers (2025-10-04T11:10:07Z)
Multi-Grained Preference Enhanced Transformer for Multi-Behavior Sequential Recommendation [29.97854124851886]
Sequential recommendation aims to predict the next purchasing item according to users' dynamic preference learned from their historical user-item interactions.<n>Existing methods only model heterogeneous multi-behavior dependencies at behavior-level or item-level, and modelling interaction-level dependencies is still a challenge.<n>We propose a Multi-Grained Preference enhanced Transformer framework (M-GPT) to tackle these challenges.
arXiv Detail & Related papers (2024-11-19T02:45:17Z)
Unleash LLMs Potential for Recommendation by Coordinating Twin-Tower Dynamic Semantic Token Generator [60.07198935747619]
We propose Twin-Tower Dynamic Semantic Recommender (T TDS), the first generative RS which adopts dynamic semantic index paradigm. To be more specific, we for the first time contrive a dynamic knowledge fusion framework which integrates a twin-tower semantic token generator into the LLM-based recommender. The proposed T TDS recommender achieves an average improvement of 19.41% in Hit-Rate and 20.84% in NDCG metric, compared with the leading baseline methods.
arXiv Detail & Related papers (2024-09-14T01:45:04Z)
Multi-Behavior Hypergraph-Enhanced Transformer for Sequential Recommendation [33.97708796846252]
We introduce a new Multi-Behavior Hypergraph-enhanced Transformer framework (MBHT) to capture both short-term and long-term cross-type behavior dependencies. Specifically, a multi-scale Transformer is equipped with low-rank self-attention to jointly encode behavior-aware sequential patterns from fine-grained and coarse-grained levels.
arXiv Detail & Related papers (2022-07-12T15:07:21Z)
Multi-Behavior Sequential Recommendation with Temporal Graph Transformer [66.10169268762014]
We tackle the dynamic user-item relation learning with the awareness of multi-behavior interactive patterns. We propose a new Temporal Graph Transformer (TGT) recommendation framework to jointly capture dynamic short-term and long-range user-item interactive patterns.
arXiv Detail & Related papers (2022-06-06T15:42:54Z)
Multiplex Behavioral Relation Learning for Recommendation via Memory Augmented Transformer Network [25.563806871858073]
This work proposes a Memory-Augmented Transformer Networks (MATN) to enable the recommendation with multiplex behavioral relational information. In our MATN framework, we first develop a transformer-based multi-behavior relation encoder, to make the learned interaction representations be reflective of the cross-type behavior relations. A memory attention network is proposed to supercharge MATN capturing the contextual signals of different types of behavior into the category-specific latent embedding space.
arXiv Detail & Related papers (2021-10-08T09:54:43Z)
Knowledge-Enhanced Hierarchical Graph Transformer Network for Multi-Behavior Recommendation [56.12499090935242]
This work proposes a Knowledge-Enhanced Hierarchical Graph Transformer Network (KHGT) to investigate multi-typed interactive patterns between users and items in recommender systems. KHGT is built upon a graph-structured neural architecture to capture type-specific behavior characteristics. We show that KHGT consistently outperforms many state-of-the-art recommendation methods across various evaluation settings.
arXiv Detail & Related papers (2021-10-08T09:44:00Z)
Hyper Meta-Path Contrastive Learning for Multi-Behavior Recommendation [61.114580368455236]
User purchasing prediction with multi-behavior information remains a challenging problem for current recommendation systems. We propose the concept of hyper meta-path to construct hyper meta-paths or hyper meta-graphs to explicitly illustrate the dependencies among different behaviors of a user. Thanks to the recent success of graph contrastive learning, we leverage it to learn embeddings of user behavior patterns adaptively instead of assigning a fixed scheme to understand the dependencies among different behaviors.
arXiv Detail & Related papers (2021-09-07T04:28:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.